Agentic Coding and Mental Models

11 June 2026

I reckon I’ve drafted and then deleted a version of this post at least 10 times in the last 12 months. Deleted because it falls in the category “I must be wrong about this as everyone else is saying the opposite”. But this week’s release of Fable, and especially the reasons people are saying it’s such an improvement, are the nudge I needed to finally publish.

So, here goes: I think everyone is wrong about how to write code with LLMs. Or at least, I think they’re wrong about how I should write code with LLMs.

The reason is to do with mental models. When you write code, the code is not the only artifact that’s generated. You also construct a mental model of how the code works, its runtime behaviour under different conditions, how it fails and so on.

Outside of toy projects, this mental model is rarely perfect. Improvement and maintenance of the mental model continues for at least as long as improvement and maintenance of the code itself. Mental models also have an alarming tendency that code doesn’t; they degrade rapidly whenever you’re not thinking about them and the cost of reconstructing them increases the longer you continue not to think about them. They’re wriggly little buggers, mental models.

Now a few times in my career, I’ve been fortunate enough to work with bona fide programming geniuses. They come in various forms but a common thread is their ability to construct mental models rapidly and accurately, then apply them successfully on a complex project.

Unfortunately, I’m not one of these geniuses. I’m a regular engineer, somewhere in the middle of the bell curve of programming ability. So for me, every mental model is the result of hard struggle. Reading lines of code, observing behaviour, printing runtime state, using a debugger, occasionally resorting to trial-and-error changes just to see what happens.
In this way, gradually, I inch myself closer to sufficient understanding that I can make changes without breaking stuff most of the time (and yet, still stuff breaks 🤔).

I value my mental models highly. To me they hold greater value than the code itself, which might sound crazy to some people I guess. A mental model is a delicate flower that must be cultivated with care and protected against trampling.

Oh, look! Here come the LLMs to trample all over my mental model.

At this point I should make clear, I’m not anti-LLM per se. I use them as my daily driver at work and on numerous side projects in my spare time. Claude even fixed a 5-year-old memory leak that I’m not ashamed to admit I was unable to fix on my own. So I’m sold on the technology.

But I’m not sold on how I’m being told to use it. Everything I read encourages me to turn the automation dial up to 11. If I don’t let agents work autonomously, I won’t get their full benefit. Actually I should have swarms of agents running in parallel, helping me to ship more features at once. I hear at some workplaces there are usage leaderboards and the assumption seems to be that token spend directly correlates with business value. I find this reasoning insane.

Let’s detour briefly to talk about code reviews. It’s quite well understood, I think, that big code reviews are harder than small code reviews. As responsible engineers, we try to separate large changesets into smaller independent units, to help our colleagues review them better and so we receive better feedback as a consequence.

Give me a 200-line code review in a familiar codebase and I’ll feel confident about understanding its impact. At that size, it’s easy to update my mental model and assimilate whatever the change does (or what it tries to do). Then I might see some tradeoffs in the approach; there could be performance concerns, lurking footguns or perhaps an existing abstraction could be re-used to make the change more cohesive.
Make it a 1k-line code review instead and things take a little longer, but the size is still within reason. Increase it to 2k lines and I’m setting aside a solid block of time and giving it multiple passes. At 5k lines we’ve probably exceeded my capacity to meaningfully review the change unless it’s broken into smaller chunks.

You see the pattern. Increasing autonomy for coding agents has precisely the same effect as increasing the size of code reviews. It makes my job harder, slows down development of my mental model and decreases confidence that I understand what’s going on.

Automation evangelists might tell me to let go at this point; automate code reviews, automate verification, automate bug fixes, let the machines do all the work. I’m sure this works for some people, geniuses to the right of the bell curve perhaps, but it doesn’t work for me because I’m left without a working mental model. At that...
