What's in a Codebase?

Brian Kihoon Lee

2026-03-23

Tagged: llms, strategy, software engineering

Does it ever make sense to rewrite your codebase from scratch?

For decades, the answer had been an unambiguous no, ever since Joel Spolsky argued that rewrites were the “single worst strategic mistake that any software company can make”.

In the era of coding agents, the cost of writing code has dramatically shifted, making it possible to rewrite your codebase from scratch, every week, if you really wanted to. But “possible” and “makes sense” are not the same. In this essay I explore the value of a codebase.

The compiler analogy

We’ve been here before - several times, actually. C codebases are ten times shorter than the assembly that they compile to, and the generated assembly code is worth approximately nothing compared to the C codebase. Decades later, Python codebases are ten times shorter than the equivalent C code, and few are weeping for the C codebases they replaced. A spec might be yet another ten times shorter than the Python code, with coding agents serving as the “compiler”.

At each level of compression, detail is necessarily lost (historically, the low-level implementation tricks required to extract maximally performant software). If you couldn’t tolerate that lossy compression, there was always the option of inlining assembly into C, or embedding C into Python. Today, coding agents fail to generate maximally simple code, often generating redundant copies of code, or producing tortuous data flows instead of refactoring the underlying information architecture. Perhaps we’ll have to inline Python code into the spec.

The coding agent works well as a decompression algorithm because it contains humanity’s collective knowledge of different coding patterns, algorithms, and techniques. You can invoke that knowledge with a single word – if you happen to know the right word. Agentic programmers of the future may have to learn an encyclopedia of programming patterns and techniques and when they are applicable, to be effective at their jobs.

The compilation analogy extends even further - just as many build systems allow incremental recompilation of the parts of your program that changed, you can also imagine having an agent take a text diff on your updated spec, and incrementally update an existing codebase, rather than rewriting from scratch.
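To make the incremental idea concrete, here is a minimal sketch of the "diff the spec" step, using Python's standard difflib. It assumes a hypothetical spec format in which each section starts with a line beginning with "# "; the function names are illustrative and not taken from any real tool. It only finds which sections changed; handing those diffs to an agent is left out.

```python
import difflib


def split_sections(spec: str) -> dict[str, str]:
    """Split a spec into {heading: body}.

    Assumption: a section starts at any line beginning with '# '.
    Text before the first heading is filed under '(preamble)'.
    """
    sections: dict[str, list[str]] = {}
    current = "(preamble)"
    for line in spec.splitlines():
        if line.startswith("# "):
            current = line[2:].strip()
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def changed_sections(old_spec: str, new_spec: str) -> dict[str, str]:
    """Return {heading: unified diff} for sections whose body text differs.

    A section missing on one side is treated as having an empty body,
    so added or removed sections show up as all-plus or all-minus diffs.
    """
    old, new = split_sections(old_spec), split_sections(new_spec)
    changes: dict[str, str] = {}
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name, ""), new.get(name, "")
        if before == after:
            continue  # untouched section: nothing to "recompile"
        diff = difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=f"old/{name}",
            tofile=f"new/{name}",
            lineterm="",
        )
        changes[name] = "\n".join(diff)
    return changes
```

In this sketch, only the sections returned by changed_sections would be passed along for regeneration, in the same way a build system only rebuilds targets whose inputs changed. Mapping a spec section to the files it affects is the hard part, and is not attempted here.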

Coding agents and specs

I’ve been using the word “spec” loosely, but what is a spec, actually?

One answer is an extensive test suite: We’ve seen a few examples of this already (vinext, chardet); given an exhaustive set of unit tests / API specs, an agent can rewrite the codebase, possibly in a completely different language or context. In response to these demos, some companies are considering pulling their unit tests from their open-sourced code – although I should note that an existing codebase can be fuzzed to regenerate a unit test suite, so you may as well pull the whole thing! SQLite is a notable outlier here - their test suite is 99.8% of their codebase and they’ve kept it private since inception, despite keeping the source code public.
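The point about fuzzing an existing codebase to regenerate tests can be sketched as follows. This is a toy illustration, not how any of the projects above were rewritten: legacy_slugify is a hypothetical stand-in for "the existing code", and the random string generator is a deliberately crude fuzzer.

```python
import random


def legacy_slugify(title: str) -> str:
    """Hypothetical stand-in for existing code whose behavior we want to pin down."""
    words = "".join(c.lower() if c.isalnum() else " " for c in title).split()
    return "-".join(words)


def record_cases(func, n_cases: int = 200, seed: int = 0) -> list[tuple[str, str]]:
    """Feed random inputs to `func` and record (input, observed output) pairs.

    The recorded pairs describe what the code does today, bugs included.
    """
    rng = random.Random(seed)  # seeded so the generated suite is reproducible
    alphabet = "abcXYZ09 _-!."
    cases = []
    for _ in range(n_cases):
        text = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 12)))
        cases.append((text, func(text)))
    return cases


def check_against(cases, candidate) -> list[tuple[str, str, str]]:
    """Return (input, expected, actual) for every case the candidate gets wrong."""
    failures = []
    for text, expected in cases:
        actual = candidate(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


cases = record_cases(legacy_slugify)
print(len(check_against(cases, legacy_slugify)), "failures for the original")
print(len(check_against(cases, str.lower)), "failures for a naive rewrite")
```

Anyone with access to the running code can produce such a suite, which is why withholding the handwritten tests alone offers little protection.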

One notable failure of this approach is Anthropic’s C compiler exercise, in which the agent succeeded in writing a C compiler that compiled Linux against several architectures (wow!), but due to a lack of clean internal abstractions, it wasn’t likely to compile anything else, and had major performance shortcomings.

Perhaps what that attempt needed to complement the unit tests was a design doc, with key architectural decisions laid out. This would provide the core of the software, while the unit tests covered the periphery.

Still, we’re missing detail. What about comments, like ## This call is expensive - only invoke when X is true, or the wisdom embedded within historical commit messages? What about the bugfixes, feature requests, and performance fixes recorded in issue trackers or version release notes? Q&A knowledge bases, FAQs, and user-facing manuals contain info about user-facing edge cases and their current or desired resolution. Simply scraping this content would be futile - only 1% would actually be valuable, and the rest would either be obsolete, redundant with the spec, or mutually contradictory.

You could drop this level of detail from the spec and gain incredible feature velocity, but that would result in buggy, nonperformant software that only has two nines of reliability. Maybe every developer in the world would use it anyway, who knows? shrug
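For a sense of scale, here is the arithmetic behind "nines", under the assumption that reliability is read as uptime over a 365-day year (the essay does not define it). Two nines means 99% availability, which leaves about 87.6 hours, or roughly 3.65 days, of downtime per year.

```python
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(nines: int) -> float:
    """Yearly downtime budget for `nines` nines of availability (2 -> 99%, 3 -> 99.9%).

    Assumption: "reliability" means uptime, measured over a 365-day year.
    """
    unavailable_fraction = 10.0 ** -nines
    return unavailable_fraction * HOURS_PER_YEAR


for n in (2, 3, 5):
    print(f"{n} nines: {allowed_downtime_hours(n):.2f} hours/year")
```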

Codebases coevolve with people

To expand the definition of “spec” even further, there are many ways in which even having the codebase as spec is still an underspecification.

Codebases exist alongside people: the engineers, of course, but also the on-call, the end user, the support team, and so on.

Software that’s often used on the go will develop tolerance to flaky internet connections. Software that’s used intimately by a small...
