The token bill comes due

The token bill comes due: Inside the industry scramble to manage AI’s runaway costs | TechCrunch

Image Credits: Getty Images

The token bill comes due: Inside the industry scramble to manage AI’s runaway costs

Rebecca Bellan

7:49 AM PDT · June 5, 2026

Across the industry, companies are starting to balk at the price of AI. Uber blew through its entire 2026 AI coding budget by April. Microsoft revoked its developers’ Claude Code licenses months after enabling them. A Priceline employee told TechCrunch that a routine Cursor contract renewal came back 4-5x more expensive.

Even though per-token prices have fallen, the push for more AI adoption and the rise of increasingly autonomous agents have driven token consumption ever higher. Companies that gorged themselves in early 2025 on all-you-can-eat subscriptions are now scrambling to understand where their money is going, pull back spending, and figure out whether they can salvage some ROI from the wreckage of their budgets.

Meanwhile, a market is forming to meet them there. Startups, established vendors, and a new standards body are all racing to give companies the tools and language to track what they spend.

"Six months ago, I would have a conversation with a customer and it would be all about ‘What can it do? Is it good enough?’" Alexander Embiricos, OpenAI’s head of enterprise, told TechCrunch at an event in New York City this week. "Our conversations are never about that now. Now the conversations are about, ‘hey, we’re spending so much. What visibility do you have? What auditability do you have? What token controls do you have? What is the efficiency of your models?’"

It’s against this backdrop that the Linux Foundation this week unveiled plans for the Tokenomics Foundation, a new standards body that aims to instill the same cost discipline around AI tokens that FinOps did for cloud spend.

“In April and May, I started hearing from companies: ‘Oh my god, we are 3x over our entire 2026 token budget and it’s only April,’” J.R. Storment, executive director of the FinOps Foundation, a project under the Linux Foundation, told TechCrunch. “We started hearing existential crises, and the whole conversation shifted from tokenmaxxing and ‘go fast’ to ‘we need guardrails, how do we control this?’”

The cries heard round the tech world followed fervent demands from CEOs pushing their teams to use the best models and move fast, costs be damned. New models released in November like Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro brought significant improvements to agentic tools, which have multiplied consumption. It’s how one company reportedly found itself with a $500 million Claude bill after forgetting to set usage limits for employees.

“It’s like the crack-cocaine epidemic,” said Chris Reed, senior director of IT finance at Priceline, noting the company had begun placing token limits on certain groups. “They let you try it to get you hooked on it, and now you’re kind of beholden to it.”

Vitaly Gordon, CEO of engineering operations platform Faros AI, said he recently spoke to a CTO who told him: “One of my engineers spent $40,000 on tokens last month, and I genuinely don’t know whether I should stop him or should I go and tell everyone else to be like him.”

A March survey by Faros found that among 20,000 developers, output was rising, but so were bugs and rewrites. Jellyfish, an engineering management platform, similarly found engineers who used the most tokens were about twice as productive as those who used AI less, but they spent 10x the number of tokens to get there.

Nicholas Arcolano, head of research at Jellyfish, told TechCrunch via email that expenditure on AI is exploding in large part due to agentic features, with per-developer consumption rising about 18.6x in nine months. All in all, these stats make the productivity case murkier than the spending suggests.

“Whether extreme spend pays off comes down to the ultimate business value of shipped code (e.g. revenue), which most companies still can’t measure,” Arcolano said.

At least some of that measurement issue is the sheer scale at which AI is being used today.

“Tracking cloud costs is a hundreds-of-millions-of-rows-a-month data problem,” Storment said. “Tracking token costs is a trillions-of-rows-a-month data problem. You can’t just stick that into whatever spreadsheet or even basic tool. You’ve got to fundamentally...
