TokenBudgeting: Our Conversations with Enterprises on Token Spend


Was Widespread TokenMaxxing Ever Really Here?

Crystal Huang, Joey Brookhart, and Dylan Patel
Jun 30, 2026 ∙ Paid


Reports suggest that token consumption inside enterprises is hitting a budgeting wall after unhinged consumption earlier this year. The SemiAnalysis team talked with over 50 customers by Slack, by phone, and at the Databricks AI Summit to understand trends within the enterprise. Widely reported responses to tokenmaxxing budgets from companies like Meta and Uber are overstated and stem from poor incentives and employee allocation that we didn’t find at other organizations.

Budgets are now the norm, but there’s no consensus number: budgets start at $250 and go up to tens of thousands of dollars a month.

Companies are downgrading default models and turning off premium tiers, while employees game subscription M365 Copilot usage to stretch their token allowance.

Rise and Fall of Tokenmaxxing

Tokenmaxxing started earlier this year when companies like Meta and Salesforce began encouraging their employees to consume as many AI tokens as possible to boost productivity. At Meta, an employee even built a “Claudeconomics” dashboard that ranked the top 250 power users in the company. The results showed that Meta employees consumed over 60T tokens over a 30-day period, with the single heaviest user accounting for roughly 280B tokens. Employees started competing for rankings like “Token Legend” and “Cache Wizard” by having agents do research for hours simply to burn tokens. The dashboard was shut down 2 days later, after The Information reported the spend.

That episode was just one among several in the enterprise tokenmaxxing trend in 1H26. Companies are now shifting focus from tokenmaxxing to token budgeting. Most recently, Uber made headlines for burning through its annual Claude Code and Codex budget in four months. In response, the company imposed a limit of $1,500 per employee per month, with over-limit requests approved on a case-by-case basis.

To see whether the news reports on early-2026 tokenmaxxing and the current shift to token budgeting were true, the SemiAnalysis team held on-the-ground conversations at the Databricks AI Summit and talked with large enterprises to understand the trends.
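To put the figures quoted above on a common footing, here is a minimal back-of-the-envelope sketch in Python. The inputs are the rounded numbers reported in this section; the function names are our own, and the burn multiple assumes (hypothetically) that the annual budget was planned as an even monthly spend, which the reporting does not confirm.

```python
# Figures quoted in the article; treat them as rounded, reported values.
META_TOKENS_30D = 60e12          # "over 60T tokens over a 30-day period"
META_TOP_USER_TOKENS = 280e9     # "roughly 280B tokens" for the top individual
UBER_MONTHLY_CAP_USD = 1_500     # "$1,500/month/employee limit"
UBER_MONTHS_TO_EXHAUST = 4       # annual budget consumed "in four months"


def top_user_share(top_user_tokens: float, total_tokens: float) -> float:
    """Fraction of total tokens attributable to the single heaviest user."""
    return top_user_tokens / total_tokens


def annualized_cap(monthly_cap_usd: float) -> float:
    """Per-employee yearly ceiling implied by a monthly cap held for 12 months."""
    return monthly_cap_usd * 12


def burn_multiple(months_to_exhaust: float, budget_months: float = 12) -> float:
    """How many times faster than planned a budget was consumed.

    Assumption (hypothetical): the plan was an even spend per month.
    """
    return budget_months / months_to_exhaust


if __name__ == "__main__":
    share = top_user_share(META_TOP_USER_TOKENS, META_TOKENS_30D)
    print(f"Top user share of reported tokens: {share:.2%}")
    print(f"Annualized per-employee cap: ${annualized_cap(UBER_MONTHLY_CAP_USD):,.0f}")
    print(f"Burn multiple vs. even plan: {burn_multiple(UBER_MONTHS_TO_EXHAUST):.1f}x")
```

On these inputs, the heaviest Meta user accounts for about 0.47% of the reported 30-day total (an upper bound, since the total is given as "over 60T"), the Uber cap works out to $18,000 per employee per year, and a twelve-month budget exhausted in four months implies spending at roughly 3x the planned pace.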

Our View of the Data & Narrative

There are many news stories out there on tokenmaxxing and the resulting budget blowouts. However, in our work on the Tokenomics Model, we estimate that customers at or above the 90th percentile make up most of the revenue and pose very little risk of API revenue cuts through the rest of the year. Even Meta, which was burning through 70T tokens per month in February and is spending roughly $50,000/year or more per employee (at list price), is only a 3-5% customer for Anthropic per our estimates.

Ramp data shows a similar trend among the top customers. Customers at the 99th percentile spend almost $90,000/yr per employee, while those at the 90th percentile spend ~$7,300. This is a stark contrast to the median Ramp customer, which spends just $136. Note that Ramp customers are generally far more tech-forward, so the distribution already skews toward high spend. The median Fortune 500 company is still well below $100 per employee.

Source: Ramp Economics Lab

Our conversations with enterprise customers, including many Fortune 500s, followed this split. Many tech-forward Fortune 500s spend well under $2,000/year per employee on AI, with the larger spend mostly in the engineering and data science departments. This suggests that the S-curve of growth in enterprise usage still has plenty of runway. Today’s market is driven by the coding vertical explosion and other VC-backed AI companies whose products build on top of Anthropic or OpenAI models (90th-99th percentile customers who are seeing their revenue accelerate too). What the coding market has done to AI Lab ARR will be repeated with Cyber (Mythos re-release dependent) at an even faster pace than Claude Code, and again with white-collar knowledge work as Cowork, Copilot, Codex, and Computer type products penetrate the enterprise.

The Tokenomics Model estimates coding-related spend at the AI Labs on a first- and third-party basis, as well as ARR and margins at the application layer (companies like Cursor, Lovable, GitHub Copilot, and more), to help investors and corporates track growth of and within the vertical. We believe that over 70% of ARR today across OpenAI and Anthropic can be attributed to coding use cases, with Anthropic spend higher than OpenAI’s given the difference in B2B vs B2C mix (Anthropic 90%+ B2B vs OpenAI 60%).

Still, cheap tokens are in widespread demand. We see massive growth in spend across the Token-as-a-Service (TaaS)/API Endpoint Market for both...
