Testing is fucking awesome | Michael Tromba

I used to have a pretty negative view of testing: unit tests, integration tests, anything that required writing code to test other code I had already written.
Not because they are not useful.
But because I found that in the majority of cases, the cost-benefit ratio of writing them was unfavorable given the context of the things I was building as a solo, bootstrapped founder.
I would still write them for specific, fragile, high-risk code paths as a way of clearly defining expectations, guardrailing my implementations with a red-green approach, and preventing future regressions.
E.g. if I was writing some kind of glob matcher, sure - I'd want to make sure it wasn't behaving stupidly.
But in most cases they were a huge waste of time and introduced even more surface area for bugs that I'd have to maintain indefinitely.
They also added a layer of cement around my codebases, making things way harder to change - the defining hallmark of a shitty codebase.
But not anymore.
Now, I cannot imagine a world without them. I write a ton of tests - of all kinds.
And by write, I mean, Opus 4.6 or GPT 5.4 writes them for me.
Why? The cost-benefit ratio has completely flipped on its head.
The costs are now negligible, and the benefits have been multiplied.
1. They have become effectively costless.
The old costs:
Cognitive burden and time required to write them
New surface area for bugs
Double the code to maintain over time
Cement around my codebase
...are nowhere near as costly as they were before coding agents.
Now there is close to zero cognitive burden in writing them, or even in thinking of which cases to write in the first place. The LLM is exceptional at coming up with cases.
The increased surface area and maintenance are now almost costless - the AI can write new tests, add cases, and fix broken cases in seconds.
They are no longer cement. The minute a test is no longer serving its purpose, the agent can easily bulldoze over it and write a new one from scratch. rm x.test.ts, done.
2. The benefits are now amplified.
Feedback loops are the essential backbone of effective agentic systems.
When the agent creates a mutation, it needs a way to understand the impacts of its mutation, validate its correctness, and confirm it has not introduced any regressions.
Automated testing is the perfect tool for the job.
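To make the loop concrete, here is a rough sketch of what "mutate, run the tests, feed the failures back" might look like. Everything in it is hypothetical: `iterateUntilGreen`, `fakeAgent`, and the stub suite are stand-ins I made up for illustration, not how any particular agent actually works.

```typescript
// All names here are hypothetical; a real agent and test runner would replace the stubs.
type TestResult = { name: string; passed: boolean };

// The agent proposes a change, given the failures from the previous attempt.
type ProposeChange = (failures: TestResult[]) => void;
// The harness runs the suite and reports one result per test.
type RunTests = () => TestResult[];

function iterateUntilGreen(
  proposeChange: ProposeChange,
  runTests: RunTests,
  maxAttempts: number,
): { green: boolean; attempts: number } {
  let failures: TestResult[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    proposeChange(failures); // mutate the code
    failures = runTests().filter((r) => !r.passed); // observe the impact
    if (failures.length === 0) return { green: true, attempts: attempt };
  }
  return { green: false, attempts: maxAttempts };
}

// Stub "codebase": one function that the fake agent only fixes after seeing a failure.
let add = (a: number, b: number): number => a - b;
const fakeAgent: ProposeChange = (failures) => {
  if (failures.length > 0) add = (a, b) => a + b;
};
const suite: RunTests = () => [
  { name: "adds positives", passed: add(2, 3) === 5 },
  { name: "adds zero", passed: add(7, 0) === 7 }, // passes even with the bug
];

const outcome = iterateUntilGreen(fakeAgent, suite, 5);
```

Note that the "adds zero" case passes even with the broken implementation. One lucky case proves nothing, which is why the suite needs more than one.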
Sure, you could tell your agent to verify its work and hope it listens. And this works to some degree, especially with the frontier models. It will run for 10 mins straight executing little ephemeral shell scripts and commands and reading / judging the effects of its work.
But that process is fragile, fundamentally not reproducible, and does nothing to prevent future regressions once you have defined expected behaviors.
Alternatively, you can instruct it to write a comprehensive test suite that captures all important code paths and edge cases for the module in question.
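For a feel of what such a suite pins down, take the glob matcher from earlier. This is a minimal sketch, assuming a toy matcher where `*` matches any run of non-slash characters and `?` matches exactly one. `globMatch` and `failingCases` are hypothetical names, and the runner is deliberately framework-free. In a real project these cases would live in something like `x.test.ts` under whatever test runner you use.

```typescript
// Hypothetical minimal glob matcher: `*` matches any run of non-slash
// characters, `?` matches exactly one; everything else is literal.
function globMatch(pattern: string, input: string): boolean {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "*") source += "[^/]*";
    else if (ch === "?") source += "[^/]";
    else source += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
  }
  return new RegExp("^" + source + "$").test(input);
}

type Case = { pattern: string; input: string; expected: boolean };

// Each case documents one expected behavior, including the edge cases.
const cases: Case[] = [
  { pattern: "*.ts", input: "index.ts", expected: true },
  { pattern: "*.ts", input: "src/index.ts", expected: false }, // `*` must not cross `/`
  { pattern: "file?.txt", input: "file1.txt", expected: true },
  { pattern: "file?.txt", input: "file10.txt", expected: false }, // `?` is exactly one char
  { pattern: "a.b", input: "aXb", expected: false }, // `.` is literal, not a regex wildcard
  { pattern: "*", input: "", expected: true }, // `*` may match nothing
];

// Framework-free runner: returns a description of every case that fails.
function failingCases(): string[] {
  return cases
    .filter((c) => globMatch(c.pattern, c.input) !== c.expected)
    .map((c) => c.pattern + " vs " + JSON.stringify(c.input) + ": expected " + String(c.expected));
}
```

An empty array from `failingCases()` means green. The table doubles as documentation of what the matcher is supposed to do.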
And on top of that, you can wire up your testing harness to a precommit hook (or some other similarly useful engineering lifecycle trigger) and have the tests run automatically prior to any potentially bug-containing code being shipped.
Now you have not only created feedback for your coding agent, so that it can more effectively write code that conforms to clearly defined expected behaviors; you have also created a living specification for the module to follow for as long as it exists in your codebase. If its behavior changes or breaks suddenly - you will know immediately, before it hits users.
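The precommit wiring could look something like the sketch below. It assumes a Node project with an npm `test` script. `preCommit` and the `Run` type are hypothetical names, and the command runner is injected so the decision logic can be tested without spawning a process. The one hard fact it relies on is that git aborts the commit when the pre-commit hook exits with a non-zero status.

```typescript
// Hypothetical pre-commit script. The command runner is injected so the
// decision logic can itself be tested without spawning a process.
type Run = (command: string, args: string[]) => number | null;

function preCommit(run: Run, log: (line: string) => void): number {
  const status = run("npm", ["test"]); // assumes an npm "test" script exists
  if (status === 0) return 0; // exit 0: git continues with the commit
  log("Tests failed (exit status " + String(status) + "); commit aborted.");
  return 1; // any non-zero exit makes git abort the commit
}

// Wiring under Node, e.g. in a script invoked from .git/hooks/pre-commit:
//   import { spawnSync } from "node:child_process";
//   const run: Run = (cmd, args) => spawnSync(cmd, args, { stdio: "inherit" }).status;
//   process.exit(preCommit(run, console.error));
```

A null status (the test process was killed by a signal) is treated as a failure too, so a crashed test run never lets a commit through.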
The amount of code we ship is accelerating exponentially
I no longer review every line of code that hits my codebase. At most I'll skim through and only look more deeply into the risky bits, mostly to ensure that the architecture is sound and maintainable and that my codebase isn't trending toward becoming a big bowl of spaghetti. But I rarely read the actual code tokens beyond entity identifiers and file names. AI is generally good at naming things, so I can quickly deduce what a thing does and how it works just by glancing at the names and the general flow of things.
If this makes you cringe, and you're still reviewing line-by-line - that's likely because you have not yet adopted and seen for yourself the power of well-engineered testing & verification systems within your codebase. The reason I do not need to read the code is that I already know it works and does what it says it does because we have sharply defined test suites which confirm that better than I could with my own two eyes, reading line-by-line.
Ok, maybe not always better - but sure as hell faster. 100x or 1000x faster.
My brain validates the high-level intent/architecture (how it works), and automated systems validate the rest (IF it works).
A few pointers
Iteratively creating test suites with the agent
The process of integrating tests into a codebase often looks like walking through a manual verification and...