Lines of Code Got a Better Publicist

David Curlewis

It’s fifteen years ago (bear with me, I’ve been in this industry since the late 90s, most of my good stories start this way), and you’ve got two senior developers at a SaaS company. One of them writes 40% more lines of code than the other. Is that developer better? More impactful for the business? Should the other one be polishing their CV?

Of course not. You’d want to know what actually shipped. What it did for customers, for revenue, for reliability. Lines of code, PR counts… we spent a couple of decades learning these are notoriously bad ways to measure a developer, to the point where suggesting them today is laughable.

Sooooo… Here’s what the industry put on the billboard this year:

Google: 75% of new code is AI-generated

Anthropic: ~80% of merged production code is written by Claude, and engineers ship “8x more code per quarter”

OpenAI: also ~80%, apparently

Cursor: “100M+ lines of enterprise code written per day”

Every single one is a volume claim. “Percent of code written by AI” is just lines of code with a better publicist. (The sceptic in me editing this draft would like to point out that it’s no coincidence that all of these are AI vendors of some kind, so pumping adoption is pretty important to them.)

We used to claim outcomes

Rewind a few years and the headline number was different in kind, not just size. GitHub’s flagship claim was that developers completed tasks 55% faster with Copilot. Say what you like about that study (plenty did), but it was an outcome claim. Bold, falsifiable, about value. If it was wrong, you could show it was wrong.

The 2026 claims can’t fail. That’s the genius of them: “75% of our code is AI-written” could be true, and will keep going up, regardless of whether anything got better (faster delivery, fewer incidents, happier customers, etc.). A volume number can only ever disappoint you if adoption stalls, and adoption is the one thing most of us agree is real. 📈

So the claims got bigger and started saying less. What happened in between?

The bit nobody puts on a billboard

The outcome evidence got complicated, that’s what happened.

The strongest pro-adoption result is still Cui et al.: nearly 5,000 developers, +26% completed tasks, with the biggest gains for junior devs. Not really in dispute. But then GitClear showed code churn rising and refactoring collapsing as Copilot adoption deepened. Then METR ran the study many have quoted: experienced open-source devs were 19% slower with AI in their own codebases, while believing they were 20% faster.

But! Hold my beer… in February 2026 METR effectively walked it back: their follow-up estimates flipped to a speedup (with error bars wide enough to ride a Moto Guzzi, with panniers, through!), and they abandoned the study design entirely, because developers now refuse to work without AI and can’t reliably self-report time on agentic work. Their latest position: AI probably speeds developers up in 2026, and we can no longer cleanly measure by how much.

Meanwhile at the company level, an NBER survey of ~6,000 executives found 69% of firms actively using AI and roughly nine in ten reporting no measurable productivity impact. The cross-study consensus sits somewhere around 10% organisational gains. Not nothing! Still bloody useful! Buuuut, also not “you don’t need developers anymore” territory.

And if you’re a sceptic still quoting “19% slower”, you’re cherry-picking too. The research keeps updating; the industry just changed what it counts.

Vanity metrics, now in AI flavour

It’s not just AI vendor claims, to be fair. Carnegie Mellon’s SEI and Accenture launched an AI Adoption Maturity Model just a few days ago: five levels, eight dimensions, marketed off a stat about 95% of organisations seeing no returns. Steve Yegge’s “8 levels of AI-assisted development” ranks you by which tools you run and how much supervision you give them. And every tools vendor now ships a maturity ladder whose top rung is, usually, “use more of our product”. These ladders measure adoption intensity and call it maturity. Same substitution, nicer packaging.

My favourite data point in this whole genre: Augment surveyed 219 engineering leaders and asked them to define “AI-native engineering”. They got 219 different answers. 🫠

And the prize for holding both ends of the rope goes to Anthropic, who gave us the “8x more code shipped” claim and one of the more rigorous studies of the year: an RCT finding that AI-assisted developers scored 17% lower on comprehension of the code they’d just shipped, with no statistically significant productivity gain. I use Claude every single day (it recommended half the links I read for this post, so the irony is not lost...
