I'm a product manager. My code merges without an engineer reading the diff — Next Wave of Tech
TAPE №002
Demo Reels · JUN 10, 2026 · 9 MIN
We moved review off the diff entirely: an engineer reviews the plan, agents check the code matches it, and if it does, it merges. Nobody reads the diff.
If your team uses coding agents, you've probably watched an engineer scroll through a 2,000-line pull request that an agent wrote in four minutes, wondering what "reviewing" it is even supposed to mean. Every team that's adopted agents hits some version of this wall, because producing code got fast and cheap while the review process is still the one we designed for code humans typed.
That's why, a few weeks ago, we published a manifesto arguing that AI is not your peer. When an agent writes your code, the "peer review" of that PR is really just an audit of a machine; the artifact humans should be reviewing is the plan behind the change. This post is what that looks like in practice, from the unlikely perspective of a product manager, and how it changed the way I work.
The problem with being a PM who can suddenly "code"
Agents made it easy for me to produce code, but a PM generating code is old news by now. The story starts when my code met the team. The old loop went like this: I'd write a PRD, an engineer would interpret it, and what shipped was a translation of that translation.
When I started using agents to skip the translation and open PRs myself, I hit a different, more formidable wall: code review. My PRs collected comments that were, if I'm honest, reasonable. They just weren't about decisions I'd made. They were about implementation choices I didn't have, and didn't want to have, opinions about: which libraries, which patterns, or how they'd have structured it.
The engineers weren't wrong to comment; that's what code review asks of them, and it's how engineering teams have worked for the last 15+ years: look at a diff and form an opinion on every line. But most of those lines weren't really mine. They were the agent's guesses about details I'd never specified, and as a team we were now spending our most expensive conversations on the part of the work neither of us had actually decided.
This only compounded as I shipped more. A PR I opened in the morning might not get reviewed until the next day, and by then I'd be deep in other work or in product and sales calls. Answering a comment stopped costing five minutes and started costing the half hour of loading the whole problem back into my head. That's not just because I'm forgetful. In a startup you wear a lot of hats, and taking them all off to climb back into a decision you'd already agonised over is its own tax, one that was getting increasingly frustrating.
Moving the review to the artifact I can defend
Around the same time, it turned out that everyone else on my team was having their own pain with code reviews. So after a team trip to France, where we all sat around and tried to figure out how to fix this, we ended up changing where review happens. This is a simplified version of what my team and I now do:
The old loop vs the loop now: review didn't go away — it moved.
1. I write a plan, not a ticket. I sit with a reasoning model, currently Opus, in Cursor, sometimes for hours. We research the codebase, dig through customer data, and poke at the edge cases, and I can get it to explain things simply, in a way I'd understand. The output is a plan that covers the business context, what's going to change, why, and what it affects. Most importantly, our plan format forces every implementation detail that matters to be labelled explicitly: which library, which pattern, what happens on failure. Nothing that matters is hidden in a diff; it sits in the plan, where it's visible and cheaper to change and iterate on.
2. An engineer reviews the plan. This is the peer review. A human who knows the system reads the plan and pushes back on the approach and its blast radius, and catches the things I didn't know I didn't know. These conversations are better than any PR thread I've been in, because we're arguing about the decision before anyone has spent days building it.
3. An agent implements the approved plan. Because we've spent the time on the plan, I trust a faster and cheaper model (which my boss likes), currently Composer 2.5, to do the doing. The thinking has already happened, and I don't really care about the keystrokes. Plus, we have safety nets built in for this, covered in the next section.
4. Agents review the code. We use Bugbot internally to review the code for bugs. We also built an internal tool, Brent, that runs a second check: it compares the code against the approved plan and flags...