Governing a codebase as a commons | Baris Erdem
What two Nobel economists, Elinor Ostrom and Oliver Williamson, get right about keeping a codebase coherent when AI agents write a lot of it.
A while ago I realized I had been rebuilding, by accident, something Elinor Ostrom won a Nobel Prize for studying.
I build software, and most of it now gets written with AI coding agents. Across three different codebases I noticed I had grown the same extra layer each time: a file called CONSTITUTION.md, a folder of numbered decision records, a checklist the agents have to follow, an audit playbook, and a CI check that gets stricter over time and never loosens. None of it was planned. Each part was a reaction to some specific way a codebase had gone bad on me. But when I put the three side by side, they have the same shape, and that shape has a name. It is a way of governing a shared resource.
This post is about what happened when I went back and read the two economists who actually understand this: Elinor Ostrom and Oliver Williamson, who shared the 2009 Nobel. I checked their ideas against the thing I had built. Some of it fit very well. Some of it broke in useful ways. And one part of my setup turned out to be a clean example of an idea Williamson worked on his whole career, which is the part I like most, so I put it in the middle.
The accidental institution
The problem that started all of this is common and boring. A codebase grows, every single change looks reasonable on its own, and the whole thing drifts anyway. Conventions split. The same idea gets three different names in three files. An agent told to fix a bug does the smallest thing that closes the ticket and leaves the code slightly worse than it found it. Do that a thousand times. One of my repos opens its constitution by naming the result, “systemic brittleness,” and then says the real point plainly: the problem was never any single bug, it was that there was no constitution. Nothing said what good meant here, so there was nothing to check a change against.
So I wrote one. The other two repos got their own soon after, and in each of them the remaining parts followed, because once you have the first part the rest start to feel necessary:
A constitution. Ten or twelve principles, each with a reason, examples of what breaks it, and a test that tells you when it has been violated. Versioned. You can only change it through a written process.
A decision log. Append-only records of the decisions we made (ADRs). You never delete a decision, you mark it as replaced and leave the old one in place. The point is that nobody, person or agent, has to reargue a settled question later, because the reasoning is still there.
An agent protocol. A short before, during, and after checklist that applies to whoever is doing the work.
An audit playbook. How to look for the problems a machine cannot catch, with a few set responses: fix it, write down an exception, change the rule, accept it as tracked debt, or rewrite the thing.
Enforcement in steps. A warning when you commit, a hard failure in CI.
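The decision log is the easiest of these parts to show in code. What follows is a minimal sketch in Python, not the format any of the three repos actually uses: the class names, the fields, and the DEC-001 style ids in the usage note are assumptions made for illustration. It shows only the two properties described above: records are appended and never deleted, and a replaced decision stays readable next to the one that replaced it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    """One record in the log. Frozen, so a recorded decision is never edited."""
    id: str                           # hypothetical id scheme, e.g. "DEC-001"
    title: str
    reasoning: str
    supersedes: Optional[str] = None  # id of the decision this one replaces


class DecisionLog:
    """Append-only: records can be added, never removed or rewritten."""

    def __init__(self) -> None:
        self._records: list[Decision] = []

    def append(self, decision: Decision) -> None:
        ids = {d.id for d in self._records}
        if decision.id in ids:
            raise ValueError(f"{decision.id} already exists; ids are never reused")
        if decision.supersedes is not None and decision.supersedes not in ids:
            raise ValueError(f"{decision.supersedes} is not in the log")
        self._records.append(decision)

    def superseded_by(self, decision_id: str) -> Optional[str]:
        """Return the id of the record that replaced `decision_id`, if any."""
        for d in self._records:
            if d.supersedes == decision_id:
                return d.id
        return None

    def current(self) -> list[Decision]:
        """Decisions still in force: those that no later record has replaced."""
        replaced = {d.supersedes for d in self._records if d.supersedes}
        return [d for d in self._records if d.id not in replaced]

    def history(self) -> list[Decision]:
        """Everything ever decided, including replaced records and their reasoning."""
        return list(self._records)
```

Appending a DEC-002 that supersedes DEC-001 removes DEC-001 from current() but leaves it in history(), so the old reasoning is still there for anyone who wants to reopen the question.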
In one of the three this went past documents. There is a small compiled governance program that scans the code on every pull request. It only catches the problems a regex can catch, and the constitution is honest that this is “a floor, not a ceiling.” It reads its own list of exceptions: a line of code that points at a decision record (// Per DEC-007, ...) gets downgraded from a failure to a note. And it has one feature I will defend to anyone: a ratchet. Once a kind of violation has been cleaned up to zero, the check turns into a hard block, for good. It looks about like this:
# while a violation class still has open cases, the check only warns
no_skipped_tests: warn # 12 existing, tracked as debt

# once you have driven it to zero, you flip it, for good
no_skipped_tests: block # 0 left, and now it cannot come back
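For readers who want the mechanics, here is a rough sketch of how a check like this could work. It is not the real program, which is compiled; this is Python for brevity. The rule pattern, the function names, and the assumption that the exception comment sits on the same line as the violation are all hypothetical. It shows the three behaviours described above: a regex scan, a downgrade to a note when a line cites a decision record, and a ratchet that can move a rule from warn to block but never back.

```python
import re
from dataclasses import dataclass

# Hypothetical rule table: rule name -> regex that flags a violation.
RULES = {
    "no_skipped_tests": re.compile(r"\b(?:it|test|describe)\.skip\("),
}

# A comment citing a decision record, e.g. "// Per DEC-007, ...".
# Assumption: the citation is on the same line as the flagged code.
EXCEPTION = re.compile(r"//\s*Per\s+(DEC-\d+)")


@dataclass(frozen=True)
class Finding:
    rule: str
    line_no: int
    severity: str  # "note", "warn", or "block"


def scan(source: str, policy: dict[str, str]) -> list[Finding]:
    """Flag every line matching a rule; cited exceptions become notes."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if not pattern.search(line):
                continue
            if EXCEPTION.search(line):
                severity = "note"  # documented exception: recorded, not failed
            else:
                severity = policy.get(rule, "warn")
            findings.append(Finding(rule, line_no, severity))
    return findings


def ratchet(policy: dict[str, str], open_counts: dict[str, int]) -> dict[str, str]:
    """Return a new policy in which any rule with zero open cases is 'block'.

    There is no branch that turns 'block' back into 'warn', so the policy
    can only get stricter.
    """
    updated = dict(policy)
    for rule, count in open_counts.items():
        if count == 0:
            updated[rule] = "block"
    return updated


def ci_passes(findings: list[Finding]) -> bool:
    """CI fails only on 'block' findings; warnings and notes are reported."""
    return not any(f.severity == "block" for f in findings)
```

The design choice worth noticing is that ratchet() has no path from block to warn. Loosening a rule would take an explicit edit to the policy itself, which is the kind of change the constitution requires to go through a written process.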
That ratchet is where this stops being documentation and starts being governance. Keep it in mind; I will come back to it.
Two economists who already solved this
Ostrom and Williamson shared the 2009 economics Nobel for their work on governance, Ostrom mostly on the commons and Williamson mostly on the boundaries of the firm. They worked on problems that look opposite and were, underneath, the same question.
Ostrom studied how real communities, like inshore fishers, valley irrigators, and alpine herders, manage a shared resource for centuries without either privatizing it or handing it to a central government. The standard theory said they could not. The “tragedy of the commons” said a shared resource gets destroyed by self-interest unless someone owns it or polices it. Ostrom went and looked, and found thousands of cases where ordinary people ran it themselves and did fine. The ones that lasted share a set of design rules: clear limits on who is in, rules that fit local conditions, the people affected get to set and change the rules, monitoring, sanctions that start small and grow, cheap and fast ways to settle disputes, a...