The Day Google Stopped Selling Software


By Ashwin Krishnamoorthy | A FutureLab dispatch on Google I/O 2026

FutureLab by NST May 20, 2026

A note on method

I watched Google I/O 2026 with my Gemini agent. The agent parsed the full livestream from YouTube as it streamed, then I interviewed it about the announcements. Every number cited below was extracted directly from the keynote by Gemini reading the source. The structure of the argument is mine. The source material is, recursively, the same stack the article is about. That detail matters more than it sounds. Most I/O 2026 recaps were written by humans skimming press releases. This one was written by a human reading an agent reading the keynote. If you want to understand what changed yesterday, the way you read the news yesterday is already part of the answer.

The Shift Hiding Behind Thirty Product Launches

Yesterday Google held its annual developer keynote. Two hours, roughly thirty announcements, the usual mix of demos and stage choreography. The live blogs treated it as a product launch. It wasn’t.

What Google actually shipped is a thesis: the era of reactive software is ending, and the era of autonomous, always-on agents has begun. Every announcement was scaffolding for that single idea. The TPUs, the model, the harness, the search redesign, Spark, Omni: these aren’t six product lines. They’re one stack, assembled end-to-end, designed to do one thing: take work off humans and put it onto agents that run continuously in the background.

The interesting question isn’t what got announced. It’s whether anyone else on Earth can ship a comparable stack in the next eighteen months. I think the answer is no, and the reason it’s no tells you what to do on Monday morning.

If you want three numbers from the keynote that capture the whole story, here they are. Each one proves something different.

3.2 quadrillion tokens processed monthly across Google’s AI surfaces.

This proves Google has successfully transitioned its existing user base from traditional search into generative AI consumption at a scale no competitor can match.

$180 to $190 billion in 2026 capex, roughly six times what they spent in 2022.

This proves the infrastructure barrier to entry for frontier AI is now structurally insurmountable for all but two or three companies on Earth.

Under $1,000 to build a working operating system using Antigravity’s swarm of 93 subagents.

This proves that the economic cost of complex, multi-day engineering work has effectively collapsed for anyone willing to architect for agents.

The rest of the article is just walking the stack that produced these numbers.
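For a sense of scale, here is a back-of-envelope pass over those three numbers. It is a rough sketch, not anything from the keynote: it takes the quoted figures at face value, assumes a 30-day month, and treats "under $1,000" as a ceiling split evenly across the 93 subagents. The function names are my own.

```python
# Back-of-envelope arithmetic on the three keynote figures quoted above.
# Assumptions (mine, not Google's): a 30-day month, and an even split of
# the "under $1,000" ceiling across all subagents.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption


def tokens_per_second(tokens_per_month: float) -> float:
    """Average token throughput implied by a monthly token total."""
    return tokens_per_month / (DAYS_PER_MONTH * SECONDS_PER_DAY)


def implied_2022_capex(capex_low: float, capex_high: float, multiple: float):
    """2022 spend range implied by 'roughly N times what they spent in 2022'."""
    return capex_low / multiple, capex_high / multiple


def max_budget_per_subagent(total_ceiling_usd: float, subagents: int) -> float:
    """Upper bound on average spend per subagent, given a total cost ceiling."""
    return total_ceiling_usd / subagents


if __name__ == "__main__":
    tps = tokens_per_second(3.2e15)  # 3.2 quadrillion tokens per month
    low, high = implied_2022_capex(180e9, 190e9, 6)
    per_agent = max_budget_per_subagent(1_000, 93)

    print(f"Average throughput: {tps / 1e9:.2f} billion tokens per second")
    print(f"Implied 2022 capex: ${low / 1e9:.1f}B to ${high / 1e9:.1f}B")
    print(f"Ceiling per subagent: ${per_agent:.2f}")
```

The throughput figure is an average; real traffic is bursty, so peak load would be higher than this number suggests.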

The Stack, Built Bottom-Up

Layer 1: Silicon

Start here, because everything else follows from it. Google announced its eighth-generation TPUs and, for the first time, split the architecture in two. TPU 8t is optimised for large-scale pretraining. TPU 8i is optimised strictly for ultra-low latency inference, clocking close to 1,500 tokens per second. The two chips are stitched together by a system called Pathways, which lets a single training run span multiple geographically separate data centres and over a million TPUs as one virtual cluster. Pichai claimed this lets them train frontier models in weeks instead of months.

The capex behind this is the part that ends the conversation about who can compete: roughly $180 to $190 billion this year, about six times what they spent in 2022. OpenAI and Anthropic are renting Nvidia silicon at Nvidia’s margins, on Nvidia’s roadmap, with Nvidia’s supply constraints. Google designs the chips, owns the data centres, runs them on its own power contracts, and tunes its models directly to the hardware. The cost curve underneath every other layer of the stack is structurally lower than anything a competitor can match. That isn’t a marketing claim. It’s an accounting one.

The capex does have a ceiling, and it’s not financial. It’s the physical world: local power grid capacity, advanced cooling, and municipal permitting for data centres the size of small towns. Google can outspend everyone. They cannot outspend the planet.

Layer 2: The model

On top of that silicon sits Gemini 3.5 Flash, which is the announcement most people will under-read. The model launched yesterday at $1.50 per million input tokens and $9.00 per million output tokens, generally available immediately, everywhere. Pichai’s framing: 3.5 Flash performs at roughly 90% of frontier quality, runs four times faster than comparable models, and costs one-third to one-half as much. Most strikingly, on coding and agentic benchmarks (Terminal-Bench 2.1, MCP Atlas, CharXiv Reasoning) it beats Gemini 3.1 Pro, which was Google’s flagship four months ago. Read that sentence again. The cheap, fast model now outperforms what was the flagship one quarter ago. That is what a working data flywheel produces. Google’s own internal usage on its...
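To make the quoted pricing concrete, here is a minimal cost sketch using the two per-million-token rates above. The function name and the example workload (200,000 input tokens, 20,000 output tokens) are hypothetical, chosen only for illustration; the sketch ignores any caching discounts, tiering, or batch pricing that may apply in practice.

```python
# Cost estimate at the quoted Gemini 3.5 Flash list prices.
# The rates come from the article; the workload below is a made-up example.

INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 9.00
TOKENS_PER_MILLION = 1_000_000


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted per-million-token rates."""
    input_cost = input_tokens / TOKENS_PER_MILLION * INPUT_USD_PER_MILLION
    output_cost = output_tokens / TOKENS_PER_MILLION * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical agent step: a large context in, a short answer out.
    cost = request_cost_usd(input_tokens=200_000, output_tokens=20_000)
    print(f"One step: ${cost:.2f}")
    print(f"1,000 such steps: ${cost * 1_000:.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, long-running agents that write a lot will be priced mostly by what they generate, not by what they read.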
