Verification Debt is the new Tech Debt

Agentics: Cost to Implement vs Cost to Verify - by theahura

12 Grams of Carbon


Agentics: Cost to Implement vs Cost to Verify

Slop slop slop slop slop slop slop. A framework for how to avoid it.

theahura Jun 22, 2026

Guest post from Clifford, my CTO and co-founder at Nori. His posts always do way better than mine, and it’s a shame he doesn’t write as much as I do because he has real insights whereas I mostly just shout my opinions at the cloud. You can see the original here.

The Wrong Scoreboard

For the past year, the discourse on coding agents has obsessed over the wrong question. The main focus has been what models can do: lines written, autonomous minutes, benchmark scores, model cards, percent of lines shipped by AI. These are all generalized measures of implementation throughput. They are useful for a bird’s-eye view of model progress, but they say almost nothing about where the actual bottlenecks now live.

The operative question for practitioners in 2026 is not what tools can do; it’s what you should ask them to do. Answering the “should” question requires a different lens than the capability benchmarks provide. Every task you might hand to a coding agent has two costs that matter: the cost to implement (Ci), the time and expertise needed to produce the code, and the cost to verify (Cv), the time and expertise needed to confirm the code is correct. The relationship between these two variables determines whether delegation is a net win or a liability.

Aside: About “Delegation”

When I first outlined this in November 2025, I was comparing hand-coded vs AI-delegated implementations. My workflow has changed significantly since then: I rarely hand-write code. The relevant choice for me is now between pair programming with the agent (high-touch, Socratic, with every structural decision guided) and delegating (the agent leads research, planning, and implementation; you just review the output of each phase). The pair programming model is mentally just as involved as writing code, but mechanically faster. The delegation model is very different: it lets you run and ship five separate feature PRs in parallel (not some clickbait Xitter “I ran 100 agents in parallel today” make-work slop, but five actual product increments, in parallel, in a brownfield codebase). Wherever you set the threshold for delegation, in my experience the framework below applies.

The Two-Variable Framework

When both costs are low, it doesn’t matter what approach you take; the task is trivial either way. When Ci is high but Cv is low, delegate freely: the implementation is a job for the agent, and you can cheaply confirm the result. The inverse is equally clear: when Ci is low but Cv is high, build a detailed mental model by taking part in every step of the process.

The dangerous quadrant is top-right, where both costs are high. There, the incentive is huge to spin the slot machine many times and see if the agent just happens to nail the task. Compared to hand coding, where you burn days or weeks before you can ascertain the quality, the agent has some chance of succeeding at the same or higher quality after just 60 minutes of work. For complex or off-distribution work, it may be a small chance... but that makes it even more tempting! By skipping the mental effort of implementation, you go in blind on an equally demanding task: verification.

This is the trap. The models have dramatically compressed Ci across the board. Cv has not moved at the same rate, and in many cases, without careful developer intervention, it has gotten worse.
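The four quadrants described above can be read as a small decision rule. Here is a minimal sketch in Python, under some loud assumptions: the names `Task`, `Approach`, and `recommend` are hypothetical, costs are expressed as rough hour estimates, and the single cutoff `HIGH_COST_HOURS` separating "low" from "high" is an illustrative number that does not come from the post. The post also does not prescribe an action for the high/high quadrant beyond calling it a trap, so the sketch only flags it.

```python
from dataclasses import dataclass
from enum import Enum


class Approach(Enum):
    EITHER = "trivial task: either approach works"
    DELEGATE = "delegate freely, then cheaply confirm the result"
    PAIR = "take part in every step to build a mental model"
    TRAP = "both costs high: the trap quadrant, avoid blind retries"


@dataclass(frozen=True)
class Task:
    name: str
    implement_hours: float  # Ci: estimated cost to implement
    verify_hours: float     # Cv: estimated cost to verify


# Hypothetical cutoff between "low" and "high" cost; not from the post.
HIGH_COST_HOURS = 4.0


def recommend(task: Task, threshold: float = HIGH_COST_HOURS) -> Approach:
    """Map a task's (Ci, Cv) estimates onto one of the four quadrants."""
    ci_high = task.implement_hours >= threshold
    cv_high = task.verify_hours >= threshold
    if ci_high and cv_high:
        return Approach.TRAP
    if ci_high:
        return Approach.DELEGATE
    if cv_high:
        return Approach.PAIR
    return Approach.EITHER
```

For example, `recommend(Task("bulk rename", implement_hours=8, verify_hours=0.5))` lands in the delegate quadrant, while a task estimated at 8 hours on both axes is flagged as the trap. In practice the estimates are judgment calls, so the value of the sketch is in forcing both numbers to be written down, not in the exact cutoff.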

Vibecoding and the Unaddressed Variable

Vibecoding is the logical extreme of treating Ci as the only variable. Previously, architecture decisions were bottlenecked by implementation cost. Releasing that constraint completely, without addressing verification cost, is a major failure mode. Any frequent flyer on Claude Code has experienced this as an end user of an entirely AI-coded application: the constant UI bugs, unintended changes to history cells, broken permission models... I've written about the flickering issues before, and I've been annoyed that sandboxing persistently pollutes the workspace with empty files (this issue has been recurring in different forms for three months now). Users of the Claude Web environment, of Cursor, and many other almost fully AI-coded products...
