The Tests We Skipped - Ajey Gore
The argument I lost for twenty years is the argument the industry is now winning, by accident, through the back door.
A position I held for twenty years
For most of my career I argued for writing tests.
Not quietly. In standups, in team rooms, in conference talks, in 1:1s with engineers I was trying to convince. I pushed for trunk-based development. I pushed for unit tests as the way you knew you hadn’t broken something that worked yesterday. I pushed against the pull request culture as it actually got practiced — not against review, but against the theatre of review we’d built around it.
Most people didn’t listen.
That’s not a complaint. That’s the starting condition. The conversation about AI and engineering has, in the last six months, started circling back to almost every position some of us were holding for twenty years — quietly losing the argument every quarter, watching the bug tracker fill up, getting the half-eye-rolls in standup. The new wrinkle is that the people rediscovering these positions are calling them new things. Spec-driven development. Eval-first engineering. Verifiable execution. New labels. Same practices.
This time, the industry doesn’t have a choice.
The argument for tests, in plain language
The argument was always simple, and I’ve made it a thousand times.
You write a unit test for two reasons. One: you want to know, today, that the code you just wrote does the thing you think it does. Two: you want to know, six months from now when somebody else changes something nearby, that what your code did on the day you wrote it still works. That’s it. Tests are not paperwork. They’re the thing that lets you sleep on Sunday.
The bug that comes back. The regression nobody saw coming. The release that broke a feature nobody was thinking about. Tests are the only protection any team has against those, and the protection is cumulative — every test you write today is a test that catches you next year, next refactor, next migration. Nothing else in software has that property. Documentation rots. Tribal knowledge walks out the door. Tests, if you keep them green, just keep working.
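To make the two reasons concrete, here is a minimal sketch in Python. The function `apply_discount` and its rounding rule are hypothetical, invented purely for illustration. The tests are plain functions with `assert` statements, the style a runner such as pytest would collect, though they can also be called directly.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, in whole cents.

    Hypothetical example function; the discount is rounded down to a whole cent.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic avoids float rounding surprises on money.
    return price_cents - (price_cents * percent) // 100


def test_discount_does_what_i_think_today():
    # Reason one: confirm the code just written behaves as intended.
    assert apply_discount(10_000, 25) == 7_500
    assert apply_discount(999, 0) == 999
    assert apply_discount(999, 100) == 0


def test_rounding_behaviour_is_pinned():
    # Reason two: pin today's behaviour so a later "nearby" change
    # that alters the rounding fails loudly instead of shipping quietly.
    # 10% of 999 is 99.9 cents; the discount rounds down to 99.
    assert apply_discount(999, 10) == 900


def test_out_of_range_percent_is_rejected():
    # The guard clause is behaviour too, so it gets pinned as well.
    for bad_percent in (-1, 101):
        try:
            apply_discount(1_000, bad_percent)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad_percent}")
```

The first test answers the question for today. The other two are the cumulative part: they cost nothing to keep, and they fail the moment someone changes the rounding or removes the guard.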
The argument against tests was always the same: tests waste time. I’d rather ship.
What that argument always dropped was the time on the other side of the ledger. Manual QA, run by hand, every release. The same regression test, executed by a human, again and again and again, because the team didn’t trust their automation enough to skip the manual pass. The bug that came back in production six weeks later because nobody wrote the test that would have caught it. The 4am incident. The customer escalation. The Sunday spent rolling back a release.
You weren’t saving time. You were deferring it onto people you weren’t counting in the same column. Often QA. Often support. Often the on-call engineer at 2am. The cost was always paid. The lie was that it wasn’t.
The PR ritual, and the responsibility it quietly moved
This is the other position I lost for twenty years. The agent conversation is finally surfacing it.
The argument: pull requests, as practiced in most teams, do not catch bugs. They perform the catching of bugs.
Here’s how it actually works in most places. An engineer writes some code. Opens a PR. The senior engineer assigned to review pulls up the diff in the browser, scrolls through it, leaves a comment about a variable name, suggests a tiny structural tweak, hits approve. They almost never check the branch out. They almost never run the tests against it. They almost never trace a function through the codebase to see what else it touches. The review takes ten or fifteen minutes — for changes that took the author days to write.
That ten-minute review accomplishes something almost worse than nothing. The author of the code, the moment the senior engineer hits “approve,” mentally hands responsibility for the change to the reviewer. They wrote it. The senior reviewed it. It must be fine. When it breaks two weeks later, nobody quite owns it. Two pairs of eyes were on it. Neither pair did the actual work of verification. And nobody on the team wants to be the one to say so out loud.
I argued for years that this was upside down. That the right model was simpler: the person who writes the code is the person who pushes it to production. Full stop. Review is a conversation, not an absolution. The confidence to push has to come from the author’s own knowledge that nothing is broken — and the only way an author can know nothing is broken is if there are tests, run automatically, in a build they trust, on every commit.
No tests, no confidence. No confidence, no push.
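That rule can be sketched as a small gate, with some assumptions. This is a local stand-in for the automated build described above, not a replacement for it. The function name `push_if_green` and the injectable `runner` are hypothetical, and the example commands assume pytest is installed and that a plain `git push` is what "push" means for your team.

```python
import subprocess
import sys
from typing import Callable, Sequence

# A runner takes a command and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code (0 means success)."""
    return subprocess.run(list(cmd)).returncode


def push_if_green(
    test_cmd: Sequence[str],
    push_cmd: Sequence[str],
    runner: Runner = run,
) -> bool:
    """Run the tests; push only if they pass. Return True if the push happened."""
    if runner(test_cmd) != 0:
        # No tests passing, no confidence. No confidence, no push.
        return False
    return runner(push_cmd) == 0


# Assumed commands, shown for illustration only:
TEST_CMD = [sys.executable, "-m", "pytest"]  # assumes pytest is installed
PUSH_CMD = ["git", "push"]
```

Calling `push_if_green(TEST_CMD, PUSH_CMD)` from the author's own checkout keeps the decision where the essay says it belongs: with the person who wrote the change. The `runner` parameter exists only so the gate itself can be tested without touching a real repository.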
The PR-as-rubber-stamp pattern was a comfortable lie that let everyone feel good about the process without anyone doing the work the process was supposed to be doing. It was a way to look like we cared about quality without paying the cost. I lost that argument too. Most teams kept the ritual. Most teams kept...