What I’m Finding About LLM Code Style and Token Costs - Jim Montgomery jimmont.com
What I’m Finding About LLM Code Style and Token Costs
Spending output tokens to share it. Before the price spikes.
Jim
Where This Started
I’ve spent the past year creating and reviewing features with Claude. It’s been remarkable to watch the tension between token consumption and legacy patterns. Right when I think something is complete, a problem surfaces: a regression, an edge case, whatever. All the while I’m watching the slow, steady, and natural march toward eventual full-price rates. Alongside that sits my accumulated push to stay at the pragmatic edge of modern Web work: the sweet spot where nearly ubiquitous features remove lines of code and improve quality. It’s also the place where I keep wondering: why did I get that output? Why did that line of code appear instead of what’s been available for years? I usually dismiss it with the observation that Claude is effectively junior level at best, and a useful approximation of the encyclopedic knowledge asked for in interviews.
In trying to make progress on something, I find myself reviewing my practice and looking at where that outrageous token usage is coming from. It comes from the generated code itself, and every one of those tokens is an output token, the kind that costs several times more (3x to 5x!!!) than an input token in API pricing. The patterns are longer, more fragile, less secure, and solve problems the platform already solved, often years ago.
It’s enough to start imagining there’s some conspiracy to take the entire web platform backward, right when Ryan Dahl with Deno and, separately, Alex Russell, Dimitri Glazkov, and many others with Web Components literally made the entire Web platform great again. All to eke out some return on the tokens. So for the sake of conspiracy, this is what I’m finding.
Because of my background as a human being who uses language, designed typography, and programmed early on, alongside drawing and many other eclectic oddities, I actually consider things like tabs a remarkable innovation. I can literally reduce indentation to 1 character, not some abstraction I have to go ask someone how to define or get permission to use. (I guess I’m just far too egalitarian to appreciate the exclusionary attitude of the entire software community.) I care about humans, and want things to work within some parsimonious baseline. Multiplying stuff by 4 or some arbitrary number just really doesn’t make sense, to me. I could go on, but maybe this grounds the orientation: someone who’s worked with actual language on actual media and has opinions about when something works and when it doesn’t. That part tends to speak for itself.
I mention this because it colors what I looked into from a purely pragmatic standpoint. I’m not arguing for a specific position where everyone uses tabs (despite that speaking for itself). I’m disclosing background that shaped opinions I’d been sitting on—there was always an economic argument I kept to myself, and it’s now showing up in real API costs. My opinions on convention are not the article. The token usage optimizations are what I came here to share. So you can benefit too. If you want to keep using multiple spaces, I’ll remind myself that the literature said it seemed ok and the LLM doesn’t know any better.
The Easiest Token Optimization on the Planet Is Already in the Runtime
Deno and runtimes like Cloudflare Workers implement the Web API surface natively: URL, URLSearchParams, fetch, FormData, Headers, Request, Response, AbortController, ReadableStream, crypto, and more. These are the same objects that run in the browser. Deno made this architectural choice deliberately, and WinterCG has been formalizing it as a minimum common API surface across runtimes. It has a significant practical consequence: the same API surface covers both browser and server-side code. No translation layer, no shims, no adaptation cost. The platform has already solved a large category of problems correctly, securely, and without dependencies. Deno is particularly notable for also shipping a standard library that fills in where the platform is missing something and a cross-platform solution is needed.
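Here is a minimal sketch of what "same API surface" means in practice. The function name `handle` and the `/search` route are hypothetical, made up for illustration; everything else is a Web-standard global (URL, Request, Response) that should exist unchanged in a browser, Deno, Cloudflare Workers, or a recent Node.js.

```javascript
// Hypothetical handler: the name `handle` and the route shape are illustrative only.
// It touches nothing but Web-standard globals, so there is nothing to import.
function handle(request) {
  const url = new URL(request.url);

  // URLSearchParams does the decoding; get() returns null when a key is absent.
  const query = url.searchParams.get("q") ?? "";
  const page = Number(url.searchParams.get("page") ?? "1");

  const body = JSON.stringify({ path: url.pathname, query, page });
  return new Response(body, {
    status: 200,
    headers: { "content-type": "application/json" },
  });
}

// Usage: build a Request by hand, the same way a test or a service worker might.
const response = handle(new Request("https://example.com/search?q=tabs&page=2"));
```

In Deno the same function can be passed to `Deno.serve`, and in a Workers module it can be exported as the `fetch` handler, so the handler body itself does not change between runtimes.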
The LLM doesn’t know this about your environment unless you say so. Its training corpus is dominated by Node.js code from before these APIs were universal: require('url'), querystring.parse(), express middleware patterns, axios with custom timeout wrappers, multer for form parsing. Those patterns are statistically dominant in what the model learned from. They’re what it reaches for.
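For two of the patterns named above, this is roughly what the platform version looks like. It is a sketch, not a drop-in replacement for any particular codebase: the helper names `readUpload` and `fetchWithTimeout`, the field names `title` and `tag`, and the example.com URL are all hypothetical.

```javascript
// Hypothetical helpers; the names and form fields are illustrative only.

// Instead of multer: a Request can parse multipart and urlencoded bodies itself.
async function readUpload(request) {
  const fields = await request.formData();
  return {
    title: fields.get("title"),   // first value for the key, or null
    tags: fields.getAll("tag"),   // every value for a repeated key
  };
}

// Instead of axios plus a custom timeout wrapper: fetch accepts an abort signal,
// and AbortSignal.timeout() creates one that aborts after the given milliseconds.
function fetchWithTimeout(url, ms) {
  return fetch(url, { signal: AbortSignal.timeout(ms) });
}

// Usage: build a form request by hand to exercise readUpload without a server.
const sample = new FormData();
sample.append("title", "notes");
sample.append("tag", "deno");
sample.append("tag", "web");
const uploadRequest = new Request("https://example.com/upload", {
  method: "POST",
  body: sample,
});
```

Neither helper needs a dependency, which is the point: the fewer wrappers the model has to write, the fewer output tokens it spends.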
The gap between what the model defaults to and what the platform already provides is where most of the output token cost lives.
The Magnitude, by Pattern
I’ve been estimating the token economics of this as I go. These are approximate—based on the actual length of the patterns, not from a formal study—but the ratios are consistent enough to be useful.
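To make the arithmetic behind estimates like these concrete, here is a rough sketch. Everything in it is an assumption for illustration: the 4-characters-per-token heuristic is only a ballpark (real tokenizers vary, especially on code), and the token counts, generation count, and price in the usage line are placeholders, not measurements or quoted rates.

```javascript
// All numbers here are placeholders for illustration, not quoted prices or measurements.
const CHARS_PER_TOKEN = 4; // assumed rough heuristic; real tokenizers differ

// Estimate a token count from the character length of a snippet.
function estimateTokens(source) {
  return Math.ceil(source.length / CHARS_PER_TOKEN);
}

// Dollar cost of emitting `tokens` output tokens at a per-million-token price.
function outputCost(tokens, dollarsPerMillion) {
  return (tokens / 1_000_000) * dollarsPerMillion;
}

// Compare a longer pattern with a shorter one across many generations.
function compare(longTokens, shortTokens, generations, dollarsPerMillion) {
  const tokensSaved = (longTokens - shortTokens) * generations;
  return {
    ratio: longTokens / shortTokens,
    tokensSaved,
    dollarsSaved: outputCost(tokensSaved, dollarsPerMillion),
  };
}

// Usage with made-up inputs: 140 vs 35 tokens, 10,000 generations,
// and a hypothetical $15 per million output tokens.
const example = compare(140, 35, 10_000, 15);
```

With those made-up inputs the ratio is 4 and the difference is 1,050,000 output tokens. The absolute dollar figure depends entirely on the price you plug in; the ratio is the part that carries over.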
Query parameter parsing
// model default: manual parsing (~140 tokens)
const parts =...