Using encapsulated development to code on my phone
Mary Rose Cook
I’m lucky enough to have a wife, two young children and a job as an AI Engineer at Notion. I’m also lucky enough to have a side project.
Some weeks, I’ll get half an hour on my laptop to work on this side project. Most weeks, I won’t. Yet, I’ve averaged two commits to the project per day for the last three months.
How? By building on my phone. But, really, by making it possible to build on my phone.
The broad approach: encapsulated development. Each prompt leaves the repo in a known, stable, verified, stateless condition that permits the next change.
Known
My mental model of the project is functional enough that I can make future changes. This doesn’t mean I understand everything. I may not know the architecture of some parts of the system. I may not know how some implementations work. But I can understand the behavior of those parts of the system as black boxes. For example, I don’t know exactly how the particle system stores particles. But I do know that, when it spawns particles, it doesn’t allocate new objects.
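To make the black-box idea concrete, here is a minimal sketch of one way a particle system could spawn without allocating: a fixed-size pool that reuses free slots. This is an illustration only. The names `Particle`, `ParticlePool`, `spawn`, `update` and `liveCount` are hypothetical, and nothing here describes how the actual project stores its particles.

```typescript
// Hypothetical sketch: a fixed-size particle pool. All names are illustrative
// assumptions, not the project's actual implementation.
interface Particle {
  x: number;
  y: number;
  vx: number;
  vy: number;
  life: number; // remaining lifetime in seconds; <= 0 means the slot is free
}

class ParticlePool {
  private readonly particles: Particle[] = [];

  constructor(capacity: number) {
    // Every particle object is allocated once, up front.
    for (let i = 0; i < capacity; i++) {
      this.particles.push({ x: 0, y: 0, vx: 0, vy: 0, life: 0 });
    }
  }

  // Reuses a free slot instead of allocating. Returns null when the pool is full.
  spawn(x: number, y: number, vx: number, vy: number, life: number): Particle | null {
    for (const p of this.particles) {
      if (p.life <= 0) {
        p.x = x;
        p.y = y;
        p.vx = vx;
        p.vy = vy;
        p.life = life;
        return p;
      }
    }
    return null;
  }

  // Advances live particles by dt seconds; expired particles become free slots.
  update(dt: number): void {
    for (const p of this.particles) {
      if (p.life <= 0) continue;
      p.x += p.vx * dt;
      p.y += p.vy * dt;
      p.life -= dt;
    }
  }

  // Counts the particles that are still alive.
  liveCount(): number {
    let n = 0;
    for (const p of this.particles) {
      if (p.life > 0) n++;
    }
    return n;
  }
}
```

The black-box property is the only thing a caller needs to hold in their head: `spawn` hands back an object that already existed, so spawning never creates garbage.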
Stable
The quality of the software must be high enough for the next change to be successful. This means well-factored, working code.
Sometimes, the project will start to crumble with the accumulation of low-quality code. I do a string of refactors to get things back on track, then start building features again.
Verified
Every change must be correct. I don’t have time to manually test changes. The kettle has just finished boiling.
Incorrect changes build into a wobbly tower. Each prompt must include how it will be verified as correct. This usually means end-to-end tests.
For a web app, this might mean having Chrome DevTools actually click through the UI to check it works. Or spinning up the API and checking that inputs produce the expected outputs. Or going full StrongDM and implementing the external services the product interacts with.
In my case, I’m working on a tool for making video games. So I built a headless version of my engine. Test code can initialize it with a set of game objects, supply synthetic user inputs, then view screenshots to verify the behavior.
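As a rough sketch of that loop (initialize with game objects, feed synthetic inputs, inspect a screenshot), the following toy stands in for a headless engine. Everything in it is an assumption made for illustration: `HeadlessEngine`, `GameObject`, `Input`, `step` and `screenshot` are hypothetical names, and the "screenshot" is a text grid rather than a real rendered image.

```typescript
// Hypothetical sketch of a headless engine test loop. Every name below is an
// assumption for illustration; none comes from the actual project.
interface GameObject {
  id: string;
  x: number;
  y: number;
}

type Input = "left" | "right" | "up" | "down";

class HeadlessEngine {
  private readonly width: number;
  private readonly height: number;
  private readonly objects: GameObject[] = [];

  constructor(width: number, height: number, objects: GameObject[]) {
    this.width = width;
    this.height = height;
    // Copy the initial objects so test code cannot mutate engine state by accident.
    for (const o of objects) {
      this.objects.push({ id: o.id, x: o.x, y: o.y });
    }
  }

  // Applies one synthetic input to the object with id "player", clamped to the grid.
  step(input: Input): void {
    for (const o of this.objects) {
      if (o.id !== "player") continue;
      const dx = input === "left" ? -1 : input === "right" ? 1 : 0;
      const dy = input === "up" ? -1 : input === "down" ? 1 : 0;
      o.x = Math.min(this.width - 1, Math.max(0, o.x + dx));
      o.y = Math.min(this.height - 1, Math.max(0, o.y + dy));
    }
  }

  // Text stand-in for a screenshot: "." for an empty cell, otherwise the
  // first letter of the id of the object occupying that cell.
  screenshot(): string {
    const lines: string[] = [];
    for (let y = 0; y < this.height; y++) {
      let line = "";
      for (let x = 0; x < this.width; x++) {
        let cell = ".";
        for (const o of this.objects) {
          if (o.x === x && o.y === y) cell = o.id.charAt(0);
        }
        line += cell;
      }
      lines.push(line);
    }
    return lines.join("\n");
  }
}

// Runs a scripted sequence of inputs and returns the final screenshot.
function runScenario(engine: HeadlessEngine, inputs: Input[]): string {
  for (const input of inputs) engine.step(input);
  return engine.screenshot();
}
```

A test then reads as a small script: build the engine with a player and a wall, call `runScenario` with a list of inputs, and compare the returned screenshot to the expected grid. The point of the design is that the whole check runs without a display, so an agent can run it in a cloud environment.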
Stateless
Every change should go uninterrupted from prompt to production. At the end of my session - probably 30 seconds or a minute - the project is in a new, known, stable state.
It’s somewhat acceptable to set aside a prompt partway through writing it. To resume, all I need to do is read the text. Quite bad is having to set aside a PR. When I return, I have to figure out what still needs to be built and what still needs to be fixed. On a phone. This is difficult because a phone has low input/output bandwidth. It’s effortful to draw together the state of an artifact. And effortful to edit the artifact at multiple points.
Worst of all is letting the project go off the rails. About two months ago, my project was in a pickle. I’d charged forwards at the beginning, and, now, I had a web app written in plain JS, HTML and raw Node. I knew I wanted to refactor it to TypeScript and Next. And I knew it would be too high a mental load to do that refactor across dozens of commits on my phone. I needed my laptop to be able to research the code, write the plan and have enough visibility to steward the refactor. So, one night, I stayed up way too late and got the project back on the rails.
The worst part about these stateful, interrupted sessions is that, the more chaotic the state the project is in, the fewer opportunities I have to make the next bit of progress. If I’m in a known, stable state, I only need to dictate a prompt. And I can do that while I wait for the MUNI. But, if I’m halfway through a PR, I’ll need to do some reading and scrolling and I might have got nothing done when the K arrives and wrecks my train of thought.
Oral literacy
Many of these measures are useful for projects not built on a phone. A well-understood, stable architecture. Robust verification of correctness. Stateless changes. But, on the phone, something is lost. It’s good to plan. It’s good to design. A phone leaves little room for either.
It’s like an oral culture. If I want to put together a more complex feature, I have to keep it aloft in my brain. Maybe jot down a few notes in Bear. But, no sketchbook, no Figma, no design doc. No divergence, no exploration. This hampers the quality of what I build.
Still, it feels silly to focus too much on this loss. Without LLMs and phones, the project wouldn’t have existed in the first place.
Specifics
I focused this essay on the traits that won’t change. But, in case it’s useful, here’s my setup as of May 25, 2026.
I issue a prompt in the Codex iOS app to GPT 5.5 extra high. The Codex harness makes a commit and the Codex cloud environment runs the tests. I push a PR from my phone and a GitHub Action automatically merges it. Another GitHub Action detects the merge, deploys the code to my DigitalOcean VPS and restarts the server. Done.