Only 18% of AI engineering spend reaches shipped product

Token Maxxing Is Making Engineering Teams Slower · Entelligence

An analysis of 1M+ pull requests across 2,444 engineering organizations. More AI spend. More code volume. More production failures. We measured where AI engineering effort actually goes, how code review has responded to 2.6× volume growth, and why the reactive work treadmill keeps accelerating. The findings: $0.82 of every AI dollar is consumed before a single feature reaches users.

Entelligence Research · May 2026 Token Maxxing Is Making Engineering Teams Slower

PRs analyzed: 1M+ · 2,444 engineering organizations

Platform avg reactive work: 44% · Bugs + maintenance, median

Revert rate vs PR growth: 3.7× vs 2.6× · Failures growing faster than output

01 / The Dollar Breakdown

$0.18 Shipped. $0.82 Consumed Before It Gets There.

Proportional allocation of AI engineering spend across work type categories · platform average

Reactive Engineering — $0.44

Code Rework — $0.27

Review Friction — $0.11

Shipped Product — $0.18

Bugs: Patches and fixes to existing code. No net-new product value delivered.

KTLO (Keep the Lights On): Infrastructure, dependency upgrades, config maintenance.

Features: New product capabilities shipped to users.

Innovation: Architectural changes, R&D, significant new patterns.

At the current trajectory, an engineering team spending $100,000/year on AI coding tools generates roughly $18,000 of shipped product value. The remaining $82,000 is consumed by the maintenance cycle those same tools are helping to accelerate. This is not because engineers are inefficient or the AI tools are bad. It is because there is no closed loop between production reality and the code being written.
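The arithmetic behind the $100,000 example can be reproduced with a few lines of Python. This is a minimal sketch that applies the report's platform-average per-dollar shares to a budget; the names `SHARES` and `allocate` are illustrative and do not come from the report.

```python
# Per-dollar shares taken from the report's platform-average breakdown.
SHARES = {
    "Reactive Engineering": 0.44,
    "Code Rework": 0.27,
    "Review Friction": 0.11,
    "Shipped Product": 0.18,
}


def allocate(annual_spend: float) -> dict[str, float]:
    """Split an annual AI tooling budget by the reported shares (illustrative helper)."""
    return {
        category: round(annual_spend * share, 2)
        for category, share in SHARES.items()
    }


allocation = allocate(100_000)
for category, dollars in allocation.items():
    print(f"{category:<22} ${dollars:>12,.2f}")

# Everything that is not shipped product counts as consumed before shipping.
consumed = sum(v for k, v in allocation.items() if k != "Shipped Product")
print(f"Consumed before shipping: ${consumed:,.2f}")
```

Swapping in a different budget only scales the figures; the split itself is the platform average and may not match any single organization.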

Per-dollar allocation · median and percentile breakdown

Category              Platform avg  P75    P90    What it measures
Reactive Engineering  $0.44         $0.62  $0.76  Bug fixes + maintenance PRs
Code Rework           $0.27         $0.38  $0.55  Code written and discarded within the week
Review Friction       $0.11         -      -      Overhead from review that doesn't catch anything
Shipped Product       $0.18         $0.10  $0.06  Net-new value that reaches users

02 / The Reactive Tax

44% at Median. 76% at the 90th Percentile.

Share of engineering output classified as reactive work (bugs + maintenance) by organization percentile

Nearly half of all engineering output on the platform is classified as reactive: fixing existing code or keeping existing systems running. At the median organization, 44% of PRs are reactive. At the 75th percentile it is 62%. At the 90th percentile, more than three-quarters of all engineering effort (76%) goes toward work that produces no net-new product. These organizations burn three-quarters of their capacity on maintenance alongside every feature they build. More AI spend accelerates the volume on both sides, not just the features.

Median reactive: 44% · Bugs + maintenance share · P50

75th percentile: 62% · Top quartile of reactive orgs

90th percentile: 76% · Highest-reactive organizations

Reactive work · organization distribution
P10: 20%
P25: 32%
Median: 44%
P75: 62%
P90: 76%

Proactive vs reactive split · platform avg
Reactive (bugs + maintenance): 43.9%
Proactive (features + innovation): 24.7%
Unclassified: 31.4%
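To make the reactive share concrete, the sketch below computes it for one organization from a list of per-PR category labels. It assumes each PR carries exactly one label from the report's categories (Bugs, KTLO, Features, Innovation) or is unclassified, and that the share is taken over all PRs including unclassified ones. The report does not say whether it weights by PR count or by effort, so counting PRs is an assumption, and the sample data is invented for illustration.

```python
from collections import Counter

# Category grouping follows the report's definitions; the label strings are assumed.
REACTIVE = {"bugs", "ktlo"}
PROACTIVE = {"features", "innovation"}


def reactive_share(pr_labels: list[str]) -> float:
    """Fraction of PRs labeled as reactive work (bugs + KTLO), over all PRs."""
    counts = Counter(label.lower() for label in pr_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[c] for c in REACTIVE) / total


# Invented sample: 25 PRs, 11 of them reactive.
sample = (
    ["bugs"] * 5
    + ["ktlo"] * 6
    + ["features"] * 4
    + ["innovation"] * 2
    + ["unclassified"] * 8
)
print(f"Reactive share: {reactive_share(sample):.0%}")
```

Because unclassified PRs stay in the denominator, the reactive and proactive shares do not sum to 100%, which matches the shape of the platform split above.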

At the 90th percentile, organizations spend 4.2× more on reactive work than on building product. These organizations are not outliers — they represent the ceiling of what happens when AI volume grows without a quality feedback loop.

03 / Code Rework

1 in 4 Lines Written Each Week Is Thrown Away.

Weekly code churn: lines written and discarded within the same week, by percentile

At the median, 25% of code written in any given week is overwritten or deleted before that week closes. This is not planned refactoring or technical debt cleanup — it is code that did not survive the sprint it was written in. For teams heavily using AI coding assistants, this reflects a structural gap: the AI generates code from local context (the file, the prompt, the immediate task) but not from production reality — which patterns have failed, which edge cases have already been tried and reverted, what the actual requirement turned out to be. At the 90th percentile, more than half of all code written each week is discarded.
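One way to read the churn definition above is as a ratio: lines written in a week that are removed before that week ends, divided by all lines written in that week. The sketch below implements that reading. The record shape (a written date plus an optional removal date per line) and the use of ISO calendar weeks are assumptions; the report does not describe how it tracks individual lines.

```python
from datetime import date
from typing import Optional

# Each record is (date the line was written, date it was removed or None).
LineRecord = tuple[date, Optional[date]]


def same_week(a: date, b: date) -> bool:
    """True when both dates fall in the same ISO year and ISO week."""
    return a.isocalendar()[:2] == b.isocalendar()[:2]


def weekly_churn(lines: list[LineRecord]) -> float:
    """Share of lines that were removed in the same week they were written."""
    if not lines:
        return 0.0
    discarded = sum(
        1
        for written, removed in lines
        if removed is not None and same_week(written, removed)
    )
    return discarded / len(lines)


# Invented sample: four lines written in the week starting Monday 2026-03-02.
sample_lines = [
    (date(2026, 3, 2), date(2026, 3, 5)),   # removed the same week: churn
    (date(2026, 3, 3), None),               # still present
    (date(2026, 3, 4), date(2026, 3, 10)),  # removed the following week: not churn
    (date(2026, 3, 6), None),               # still present
]
print(f"Weekly churn: {weekly_churn(sample_lines):.0%}")
```

A line removed the following Monday does not count under this definition, so the measure is sensitive to where the week boundary falls.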

Median weekly churn: 25% · Lines discarded within the week

75th percentile: 38% · Top-quartile churn orgs

90th percentile: 55% · Majority of code discarded

Industry benchmark: 27% · Pluralsight / GitPrime avg

Weekly code churn · organization distribution
P25: 15%
Median: 25%
P75: 38%
P90: 55%
Industry benchmark: 27% (Pluralsight/GitPrime). The median is close to the benchmark; P90 is about 2× the benchmark.

04 / Output vs Failure Rate

PR Volume Grew 2.6×. Reverted PRs Grew 3.7×.

12 weeks of merged PRs vs reverted PRs · Feb 16 – May 4, 2026 · platform-wide

Between February 16 and May 4, weekly PR volume on the platform grew from 2,525 to 6,654 — a 2.6× increase. Over the same period, reverted pull requests grew from 10 to a peak of 37 per week — a 3.7× increase. The failure rate is growing faster than output. Each revert triggers a...
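The two growth multiples quoted in this section follow directly from the weekly counts. This is a small sketch of that arithmetic; `growth_multiple` is an illustrative helper. Note that the revert figure of 37 is a peak week, which may not be the final week, so a revert rate is computed only for the first week, where both counts are given.

```python
def growth_multiple(start: float, end: float) -> float:
    """Ratio of the ending value to the starting value."""
    return end / start


# Weekly counts quoted in the report.
pr_growth = growth_multiple(2_525, 6_654)
revert_growth = growth_multiple(10, 37)  # 37 is the peak week, not necessarily May 4

print(f"PR volume growth:   {pr_growth:.1f}x")
print(f"Revert growth:      {revert_growth:.1f}x")
print(f"Reverts vs volume:  {revert_growth / pr_growth:.2f}x faster")

# Revert rate in the first week (Feb 16), where both counts are known.
first_week_rate = 10 / 2_525
print(f"First-week revert rate: {first_week_rate:.2%}")
```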
