Cheap cloud AI was never going to last. The off-ramp wasn't built for everyone

Cheap cloud AI was never going to last. The off-ramp wasn't built for everyone. – Great Work EveryoneSkip to content<br>← All posts

Anthropic just announced self-hosted sandboxes and MCP tunnels for Managed Agents. Enterprise stack getting new product and expanded control the same fortnight the consumer side gets re-costed is the bifurcation we all need to be talking about.

On June 1, GitHub Copilot moves to usage-based billing. On June 15, Anthropic splits programmatic Claude Code off the subscription onto a separate metered credit and the $200 Max plan gets $200 of compute on top. The flat-fee era is ending within a two-week window. Heavy users have been consuming roughly $8 of compute per $1 of subscription revenue; the labs aren't covering that anymore. I'm migrating to self-hosted open source LLM for the workflows that need it. Most people don't have that fallback, and the off ramp for the cloud AI highway isn't accessible to the average user. The questions worth asking: who gets left where, and was the door ever meant to stay open?

The skill toll on the off-ramp

The consumer AI you use has been subsidised. For heavy users, the subscriptions cost less than the compute they consume. The VCs foot the bill. This isn't a secret. It's the product strategy: buy market share now, raise prices once the switching cost is high enough that customers stay. For AI, this cost is dependency on the workflow. The more you've woven the tool into your day, the harder it is to unpick. This is textbook loss-leader pricing – see Uber, Amazon – but still worth appreciating.

The alternative to subsidised consumer AI exists. Open-source models keep getting better. Local AI on laptops and phones is now viable for most everyday tasks. Self-hosting is a real option if you have the skills, the hardware, and the time to maintain a stack.

I do, to a point. We run an open-source model locally – the kind you download once and operate on your own machine, no subscription, no per-message cost – and point it at our email and internal tools for the work that doesn't need a frontier model. It keeps our cloud bill manageable and means we're not locked to one provider. Most people don't, and don't want to. Standing up a local model (Meta's Llama, Alibaba's Qwen, or similar) and getting it useful is a weekend's work for someone with a developer background, an indefinite barrier for those without, and is somewhere in the middle if you're feeling brave.

Which means the off-ramp has a skill toll that excludes the user base the labs have been onboarding. The cheap AI phase taught our workforce that AI is the inescapable future of work. The re-costed AI phase will hurt businesses who thought they had discovered the productivity jackpot, and will leave individuals on the wrong side of a price they can no longer afford but can't circumvent.

Who the new prices price out

Two groups end up where the prices want them. Enterprise users keep paying because the labs will price the enterprise tier at exactly what corporate procurement will accept — and increasingly build it for whatever corporate security will accept. Wealthy individual users keep paying because the consumer tier will land at a number that's a footnote in their monthly ledger.

Anthropic shipped a Managed Agents update today — execution in the customer's own VPC, internal tools reachable without exposing them to the public internet. This is a commodification of control that the consumer camp won't see. The enterprise stack is getting product, while the consumer side is getting bills.

It's the middle that get squeezed. People for whom AI tools have become productive but who can't absorb a 5x or 10x price increase. The independent contractor whose $20 subscription runs compute worth multiples of that, who rebuilt their client workflow around a tool that will now cost them their margin. Small business owners. Students who built study workflows around the current free tier. Vibe coders running businesses on top of cloud AI wrappers, because they were told that's how you survive now. The people who, if you took AI away tomorrow, would feel it as a productivity loss they can't backfill.

That group's only options are to pay the new price, drop back to a free tier that will become more constrained than today's, if not obsolete, or learn to self-host. The first they can't afford, the second is degraded by design, and the third has a toll many can't pay. The off-ramp doesn't accommodate them because it was never built to. Access to AI that actually works is about to become a wealth marker, and the people it helped most are the first ones priced out.

What the labs are betting on

Cancelling won't be cancelling a tool, it'll be fundamentally reorganising how you work. For now, most people won't, especially when the subscription is priced just under the pain threshold.

OpenAI's Nick Turley, VP and Head of ChatGPT, has called the original pricing something they "stumbled into," comparing...

Cheap cloud AI was never going to last. The off-ramp wasn't built for everyone

Related Articles

Elevated error rates on requests to multiple models

Donald Trump and sons to be 'forever' exempt from tax audits

PopuLoRA: Co-Evolving LLM Populations for Reasoning Self- Play

Old Reddit Is Down

The ultimate female fantasy – A feminist critique of Beauty and the Beast