Agent-Friendly Interfaces Are a Token-Efficiency Strategy

Agents Want Filesystems: Agent-Friendly Interfaces Are a Token-Efficiency Strategy (NoKV Blog)

We ran the same ML-research tasks against two interfaces over identical data. The filesystem-shaped one used 45% fewer tokens, cost 39% less, and got more answers right. Here is the argument, the evidence, and what it means for your stack.

The week token economics stopped being a footnote

Two things happened in the same week of June 2026.

On June 9, Anthropic shipped Claude Fable 5. In Anthropic’s own words, “Apps that took a hundred prompts a year ago, it now one-shots.” The market reacted within a day: Amplitude, Atlassian, and Guidewire slid as analysts rehearsed the “SaaSpocalypse” thesis. If a $20/month agent can do long-horizon, multi-step knowledge work, per-seat subscriptions get hard to defend. The capability curve is doing to software what it already did to demos: when models this strong are an API call away, every serious enterprise becomes a potential builder of internal agents rather than a buyer of packaged workflows. (Futurum’s Rolf Bulk, back in February: “There’s likely to be cannibalization of SaaS by AI-driven workflows.”)

On June 10, the Linux Foundation announced Tokenomicon, an entire conference dedicated to the economics of AI, citing Goldman Sachs research that projects global token usage to multiply roughly 24x between 2026 and 2030. The economics of tokens now has its own conference circuit. Jensen Huang has been saying this for over a year: datacenters are “AI factories” with one job, “generating these incredible tokens.” By GTC 2026 the framing had hardened: “Tokens are the new commodity… your tokens are your commodity, and that compute is your revenue.”

Put the two together and you get the thesis of this post. If everyone can build agents, having agents is not a differentiator. What differentiates is the unit economics of every task your agents run. In our experience, the single biggest lever on that token bill is not the model or the prompt. It is the interface your agent operates on. Frontier tokens are not cheap (Fable 5 lists at $10/M input, $50/M output), volume is about to multiply, and every wasted exploration turn re-bills the entire conversation, because each turn re-sends the history so far as input.
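To make the re-billing point concrete, here is a minimal sketch of the arithmetic. It uses the list prices quoted above and assumes a simple agent loop with no prompt caching, where every turn re-sends the full history as input. The function name `run_cost` and the per-turn token counts are illustrative assumptions, not measurements from our benchmark.

```python
# List prices quoted in this post (USD per million tokens).
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 50.0


def run_cost(turns, system_tokens, tool_result_tokens, output_tokens):
    """Estimate tokens and cost for an agent loop without prompt caching.

    Assumption: each turn bills the whole history so far as input, then
    appends that turn's model output and tool result to the history.
    Returns (input_tokens, output_tokens, cost_usd).
    """
    history = system_tokens
    input_total = 0
    output_total = 0
    for _ in range(turns):
        input_total += history  # the entire conversation is billed again
        output_total += output_tokens
        history += output_tokens + tool_result_tokens
    cost = (input_total / 1e6) * INPUT_PRICE_PER_M \
        + (output_total / 1e6) * OUTPUT_PRICE_PER_M
    return input_total, output_total, cost


# Hypothetical workload: 2,000-token system prompt, 1,500-token tool
# results, 200 output tokens per turn.
for n in (10, 20):
    billed_in, billed_out, usd = run_cost(n, 2000, 1500, 200)
    print(f"{n} turns: {billed_in} input, {billed_out} output, ${usd:.3f}")
```

Under these assumptions, doubling the turns from 10 to 20 nearly quadruples billed input (96,500 to 363,000 tokens), because input grows with the square of the turn count. Prompt caching would soften the curve but not change its shape.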

An agent-friendly interface is a token-efficiency strategy. We benchmarked it. But first, the argument.

Part 1 — Do agents prefer filesystems?

In August 2025, Letta published a benchmark with a provocative title: “Benchmarking AI Agent Memory: Is a Filesystem All You Need?” A Letta agent on gpt-4o-mini that simply stored conversation history in files scored 74.0% on the LoCoMo long-conversation-memory benchmark, beating the 68.5% that Mem0, a tool purpose-built for agent memory, reported for its top-performing graph variant. Their conclusion: “With a well-designed agent, even simple filesystem tools are sufficient to perform well on retrieval benchmarks such as LoCoMo.”

The intuition has been circulating among systems people too. Pekka Enberg, sketching a disaggregated agent filesystem on object storage, put it bluntly:

“Give an agent access to grep, sed, awk, cat, and git, and it becomes unreasonably capable and effective, requiring no custom tools.”

There is nothing mystical here. Filesystem and shell are among the most common computing interfaces in LLM training data, and the past two years of post-training have specifically optimized frontier models for agentic coding tasks. That is why coding agents consistently feel like the strongest agents anyone has shipped. The skills transfer: navigate a tree, grep for a needle, read what you found, cite the line.

A first, rough conclusion: agents want filesystems.

Part 2 — Why: the shape of an agent-friendly surface

The affinity is not just familiarity. A filesystem is a progressive-disclosure interface with stable handles: an agent first locates the thing by directory, by name, or by grep, and only then pays to read its content. Cheap discovery, lazy loading, composable steps. SQL is excellent at relational queries and aggregation, but in the locate-the-handle phase it front-loads cognitive cost: schema comprehension, join semantics, field naming, and query composition. The agent pays for all of that in tokens and in error probability before it has found anything.
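The locate-then-read pattern can be sketched in a few lines. This is an illustration of the interface shape, not our benchmark harness: `locate` and `read_span` are hypothetical helper names, and the plain substring match stands in for whatever grep-like tool the agent actually has.

```python
from pathlib import Path


def locate(root: Path, needle: str):
    """Cheap discovery: return (relative path, line number) handles only.

    No file content is returned, so the agent pays for a short list of
    handles rather than for everything it scanned.
    """
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits


def read_span(root: Path, rel_path: str, lineno: int, context: int = 2):
    """Lazy loading: read only a few lines around a handle."""
    text = (root / rel_path).read_text(encoding="utf-8", errors="ignore")
    lines = text.splitlines()
    start = max(0, lineno - 1 - context)
    end = min(len(lines), lineno + context)
    return lines[start:end]
```

The handle (a path plus a line number) is stable across turns, so the agent can cite it, revisit it, or pass it to another step without re-reading the file.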

The two biggest model labs have both, independently, converged on this shape for their own surfaces:

Anthropic showed in Code execution with MCP that presenting MCP tools as a TypeScript file tree (servers/google-drive/getDocument.ts, …) instead of a flat tool list cut a representative workload “from 150,000 tokens to 2,000 tokens — a time and cost saving of 98.7%.” Their explanation points at the same interface shape: “presenting tools as code on a filesystem allows models to read tool definitions on-demand, rather than reading them all up-front.” The same progressive-disclosure principle drives Agent Skills: a skill costs ~100 tokens of metadata until the agent actually opens it.
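A rough sketch of why on-demand reading is cheaper, plus a check of the quoted percentage. Everything here is an assumption for illustration: the four-characters-per-token estimate is crude, the definition bodies are placeholder text, and the two `servers/example-*` paths are hypothetical entries added only to give the tree more than one file.

```python
def approx_tokens(text: str) -> int:
    """Crude estimate, assuming roughly 4 characters per token."""
    return max(1, len(text) // 4)


# Placeholder text standing in for a real tool definition.
PLACEHOLDER_BODY = "// placeholder tool definition text\n" * 40

TOOL_TREE = {
    "servers/google-drive/getDocument.ts": PLACEHOLDER_BODY,  # path from the post
    "servers/example-a/toolOne.ts": PLACEHOLDER_BODY,         # hypothetical
    "servers/example-b/toolTwo.ts": PLACEHOLDER_BODY,         # hypothetical
}


def upfront_cost(tree):
    """Flat tool list: every definition is loaded into context at once."""
    return sum(approx_tokens(body) for body in tree.values())


def on_demand_cost(tree, wanted):
    """File tree: list the paths, then open only the definition needed."""
    listing = "\n".join(sorted(tree))
    return approx_tokens(listing) + approx_tokens(tree[wanted])


def savings_pct(before_tokens, after_tokens):
    """Percentage reduction from before_tokens to after_tokens."""
    return (before_tokens - after_tokens) / before_tokens * 100


print(upfront_cost(TOOL_TREE))
print(on_demand_cost(TOOL_TREE, "servers/google-drive/getDocument.ts"))
print(round(savings_pct(150_000, 2_000), 1))  # 98.7, matching the quoted figure
```

The gap widens with the size of the tree: the up-front cost grows with every definition, while the on-demand cost grows only with the length of the path listing.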

OpenAI’s tool search...
