I have not written a line of code in five months | Gerard Rodes
2026-06-04
AI
SOFTWARE-ENGINEERING
AGENTS
This article has been generated by GPT-5.5 from the transcript of a presentation I gave on June 4, 2026, then reviewed and edited to keep it close to my own voice.
Over the last five months, I have barely written a line of code by hand. I say this for better and for worse: using AI for programming now feels obviously faster in many cases, to the point where manual coding can sometimes feel a bit absurd. You explain the intention, guide the architecture, review the output, ask for changes, ask for tests, ask for a refactor, and suddenly something that would have taken days starts taking hours.
At the same time, the engineering work has not disappeared; it has moved somewhere else. You are still responsible for the code. You still need architecture, tests, taste, and the ability to understand what is being produced. I would even say all of that matters more now, because generating a lot of code very quickly also means generating a lot of garbage very quickly if you don't know how to guide the process.
I am not trying to write a grand theory of AI here, only my honest view after using LLMs seriously for work projects and personal projects, and after pushing them further than I expected to push them.
Smart and dumb at the same time
The most frustrating part for me is how unstable the experience feels. You can ask a model to reason about a complex codebase, propose a structure, connect ideas that you had not connected, and sometimes it will one-shot something better than what you had in mind. Five minutes later, you ask it to do one tiny deterministic thing and it fails in the stupidest possible way.
I am not exaggerating here; for me, this has become the normal experience. It feels like working with something that can be one or two levels smarter than you in one moment, and completely unable to follow a simple instruction in the next one. It can design a decent architecture for a service, and then get stuck doing workaround after workaround for a small bug that you would have fixed manually in ten minutes.
For that reason, I don't like the simple question of whether AI is good or bad for coding; the answer depends too much on which part of the work you are looking at. In some dimensions it is very good, and in others it is absurdly bad. If you want to use it seriously, you need to learn where those borders are.
Prompting is not programming
One of the things I built during this period was an incident investigation bot. The idea was simple: given an alert, the bot should inspect metrics, logs, traces, dashboards, and downstream dependencies until it reaches a useful hypothesis about the root cause.
My main problem with LLMs in this kind of task is laziness. They tend to stop at the first plausible explanation. If an endpoint is failing because a downstream dependency is failing, the model will often say: "the downstream dependency is failing", and stop there. The problem is that a downstream failure is usually only the next node in the investigation, not the actual root cause.
A human keeps going: they move to the downstream service, check its logs, check its traces, check whether it has another downstream dependency of its own, and keep following the error until they reach a dead end, or at least the deepest useful explanation.
So I tried to encode that into the prompt. I described the investigation as a graph. Services are nodes. Signals are edges. A log, a metric spike, a span error, a dependency failure: all of those are edges that let you move to another node. The bot has to keep traversing the graph until the path stops being useful.
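As a rough sketch of that graph model (this is not the bot's actual code, and in the real bot the traversal lives in the prompt rather than in Python): the service names, the `Signal` type, and the `investigate` function below are all made up for illustration. It assumes the signals have already been collected, and it follows them from the alerting service until each path has nowhere useful left to go.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """An edge: evidence observed on `source` that points at `target`."""
    source: str
    target: str
    kind: str  # e.g. "log", "metric_spike", "span_error", "dependency_failure"


def investigate(start: str, signals: list[Signal]) -> list[list[str]]:
    """Follow signals from the alerting service until every path dead-ends.

    A path ends at a service with no outgoing signals, or when the only
    remaining signals point back at a service already on the path (cycle guard).
    """
    outgoing: dict[str, list[Signal]] = {}
    for s in signals:
        outgoing.setdefault(s.source, []).append(s)

    paths: list[list[str]] = []

    def walk(node: str, path: list[str]) -> None:
        next_hops = [s for s in outgoing.get(node, []) if s.target not in path]
        if not next_hops:
            paths.append(path)  # dead end: nothing further to follow
            return
        for s in next_hops:
            walk(s.target, path + [s.target])

    walk(start, [start])
    return paths


# Hypothetical incident: the alert fires on checkout-api.
signals = [
    Signal("checkout-api", "payments", "dependency_failure"),
    Signal("payments", "ledger-db", "span_error"),
    Signal("checkout-api", "cart", "metric_spike"),
]
for path in investigate("checkout-api", signals):
    print(" -> ".join(path))
```

Stopping at "payments is failing" corresponds to returning after the first hop; the traversal above keeps going to `ledger-db` because there is still an edge to follow.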
The bot got much better after that, although new problems appeared immediately. Sometimes it would over-traverse. Sometimes it would treat the deepest branch as the root cause, even when another branch was more directly related to the user impact. Sometimes fixing one failure mode in the prompt would break another one.
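To make those two failure modes concrete, here is a minimal sketch under stated assumptions. It is not how the bot works: the hop budget `MAX_HOPS`, the `explains_impact` score, and the service names are all hypothetical. The idea is only that "deepest" and "most relevant to the user impact" are different orderings, and that a budget is one way to bound over-traversal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    service: str
    hops_from_alert: int    # distance from the service that raised the alert
    explains_impact: float  # 0..1, hypothetical score for how well it accounts for the user-facing symptom


MAX_HOPS = 3  # assumed budget; anything deeper counts as over-traversal


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Drop candidates beyond the hop budget, then order by impact relevance.

    Depth is used only as a tie-breaker (deeper first), so the deepest
    branch does not win automatically.
    """
    within_budget = [c for c in candidates if c.hops_from_alert <= MAX_HOPS]
    return sorted(within_budget, key=lambda c: (-c.explains_impact, -c.hops_from_alert))


candidates = [
    Candidate("ledger-db", hops_from_alert=2, explains_impact=0.9),
    Candidate("audit-log-archive", hops_from_alert=3, explains_impact=0.2),
    Candidate("metrics-exporter", hops_from_alert=5, explains_impact=0.1),
]
for c in rank(candidates):
    print(c.service, c.hops_from_alert, c.explains_impact)
```

Here `metrics-exporter` is dropped for exceeding the budget, and `ledger-db` ranks above the deeper `audit-log-archive` because it explains more of the impact. In a prompt there is no such explicit sort, which is part of why fixing one of these behaviors can disturb the other.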
The strange thing with prompts is that they are not code and they are not deterministic. You cannot write a prompt the same way you write a function and expect the same input to always produce the same behavior. You are not programming a machine in the classical sense. You are shaping the behavior of a probability machine, and the more instructions you add, the more strange interactions you can create.
I don't think this problem is solved. If it were solved, prompt injection and jailbreaks would not exist in the way they exist today. A longer prompt is not necessarily better. Often, what you really want is a prompt that communicates the final objective clearly enough for the model to keep rediscovering the right behavior by itself. Finding that wording is very hard.
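One practical consequence of the same input not always producing the same behavior is that a single passing run says little about a prompt. The sketch below is a toy illustration, not a real model call: `fake_model` is a hypothetical stand-in that picks between two canned answers with assumed probabilities, and `pass_rate` measures how often a check holds across many runs instead of asserting on one.

```python
import random
from typing import Callable


def fake_model(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call.

    Usually names a specific cause, sometimes stops at the first
    plausible explanation. The 80/20 split is an assumption.
    """
    return rng.choices(
        [
            "root cause: ledger-db connection pool exhausted",
            "root cause: downstream dependency is failing",
        ],
        weights=[0.8, 0.2],
    )[0]


def pass_rate(prompt: str, check: Callable[[str], bool], runs: int = 200, seed: int = 0) -> float:
    """Run the same prompt many times and return the fraction of answers that pass `check`."""
    rng = random.Random(seed)  # seeded so the experiment itself is repeatable
    passed = sum(check(fake_model(prompt, rng)) for _ in range(runs))
    return passed / runs


def names_a_specific_cause(answer: str) -> bool:
    """Fail answers that stop at 'the downstream dependency is failing'."""
    return "downstream dependency is failing" not in answer


rate = pass_rate("Investigate the checkout-api alert.", names_a_specific_cause)
print(f"pass rate: {rate:.0%}")
```

With a function you would assert that the check passes. With a prompt, the honest unit is closer to a rate, and a prompt change that raises one rate can lower another.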
Context is everything
The next thing I learned, or maybe relearned, was that context matters more than the...