Compressed spec-driven development | Konstantin Borovik
Compressed spec-driven development
May 13, 2026 · spec-driven-development · claude-code · ai-coding
TLDR
Spec-driven development, compressed via math-glyph notation so the full specification, open tasks, and bug history all fit inside a single agent’s context window. One file. One thread. No subagents.
Losing the thread
I kept losing track of what Claude was doing.
The pitch for multi-agent setups is throughput. Spin up subagents in parallel git worktrees, dispatch tasks, merge the wins. In practice I couldn’t hold the whole picture in my head: which task was done, how it was tested, what regressed, which bugs were still open. The agents moved faster than I could review, and review was the part that had to be right.
pilot-spec is what I built to fix that. It’s a Claude Code plugin that collapses everything I need to know about a project — goals, constraints, interfaces, invariants, tasks, bugs — into one root SPEC.md.
One file. One thread. No subagents.
Deliberately slower. That’s the point. Fanning out work to parallel agents is faster, but what I get in return for the slowdown is determinism and control — a single artifact I can read end-to-end before approving a diff.
The problem with parallel agents
A subagent fleet looks productive. While I’m reviewing one branch, the others are still working. By the time I finish, three more diffs are waiting and the plans behind them are scattered across disposable chat threads.
The breakdowns I hit, in order of how often they bit me:
Two agents touched overlapping code. The merge was clean. The behavior wasn’t.
An agent declared a task done because its narrow test passed. The test was wrong.
A bug surfaced in production. I had no record of the invariant it violated, so the next agent reintroduced it three weeks later.
I asked “is feature X done?” and the only honest answer was “let me read three plan documents and grep for the relevant commits.”
None of this is a Claude failure. It’s a process failure.
The information I needed in order to review safely was never collected in one place.
How pilot-spec works
The whole system is a single file at the repo root, SPEC.md, with six fixed sections:
§G GOAL — what this project is for, in one short paragraph.
§C CONSTRAINTS — hard rules: language, runtime, target platform, external systems.
§I INTERFACES — public surface: HTTP routes, CLI commands, exported functions, file formats.
§V INVARIANTS — properties that must always hold, numbered. For example: V12: ∀ req → auth check before handler.
§T TASKS — work, one row per task. Pipe-delimited: id, status, description, invariants it must respect.
§B BUGS — bug rows in the same shape, with date and root cause.
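To make the §T row shape concrete, here is a minimal Python sketch of how one pipe-delimited task row could be read. It is not pilot-spec's own parser. The `TaskRow` type, the `parse_task_row` helper, the sample row, and the comma-separated invariant list are all assumptions made for illustration; the source only says a row carries id, status, description, and the invariants it must respect.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRow:
    """One §T row: id | status | description | invariants (hypothetical layout)."""
    task_id: str
    status: str
    description: str
    invariants: tuple[str, ...]


def parse_task_row(line: str) -> TaskRow:
    """Parse a row written without outer pipes, e.g. 'T3 | open | ... | V7, V12'."""
    fields = [field.strip() for field in line.strip().split("|")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    task_id, status, description, invariant_field = fields
    # Assumption: invariant ids are comma-separated; an empty field means none.
    invariants = tuple(v.strip() for v in invariant_field.split(",") if v.strip())
    return TaskRow(task_id, status, description, invariants)


# Hypothetical sample row, not taken from a real SPEC.md.
row = parse_task_row("T3 | open | add rate limit to /send | V7, V12")
print(row.task_id, row.status, row.invariants)
```

A description that itself contains a pipe would break this split, so a real format would need an escaping rule or a stricter grammar.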
The encoding is math-glyph: ∀ ∃ ∴ ⊥ ≤ ∈ ∧ ∨ →, plus pipe tables and sentence fragments. It reads like notation, not prose. I am not the target reader of SPEC.md — Claude is. When I need to read it as a human, /sdd:explain decompresses any citation back into English.
The result is small enough to live in context. The current SPEC.md for MailPilot, my AI email automation project, is roughly 23,000 tokens, about 11.5% of a 200K Claude context window. The whole project (goals, constraints, interfaces, invariants, open tasks, bug history) sits in context for every prompt. No retrieval layer, no embedding store, no MCP server, no vector index. The spec is one file Claude reads at session start and writes back through /sdd:spec. There is nothing else to break.
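The budget arithmetic behind those figures is easy to check directly. The sketch below uses only the two numbers quoted above (23,000 spec tokens, a 200,000-token window); the `context_share` helper is a name invented for this example.

```python
def context_share(spec_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window occupied by the spec."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return spec_tokens / window_tokens


SPEC_TOKENS = 23_000      # approximate size quoted for the MailPilot SPEC.md
WINDOW_TOKENS = 200_000   # 200K context window

share = context_share(SPEC_TOKENS, WINDOW_TOKENS)
remaining = WINDOW_TOKENS - SPEC_TOKENS
print(f"{share:.1%} of the window, {remaining:,} tokens left")
# prints "11.5% of the window, 177,000 tokens left"
```

The remainder is what is left for the code under edit, tool output, and the conversation itself.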
The plugin exposes five slash commands. They map cleanly onto the loop I want:
/sdd:design — propose-then-critique a structural change. Writes a steno-encoded draft to designs/.md. I fold accepted designs into SPEC.md with /sdd:spec.
/sdd:spec — the only command that mutates SPEC.md. Behind it is a Socratic gate that classifies my input as NEW, DISTILL, AMEND, or BACKPROP, then writes the rows.
/sdd:build — plan, then execute, against SPEC.md. Single thread, no sub-agents. If a test or build fails, it auto-invokes the backprop protocol before retrying.
/sdd:check — read-only drift detector. Compares SPEC.md against the current code and reports violations, grouped by severity. Never writes.
/sdd:explain — math-glyph to prose. The escape hatch when a reviewer (usually me) needs the English version of §V.7 during code review.
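As an illustration of what a read-only drift report grouped by severity might look like, here is a small Python sketch that compares routes listed in §I against routes found in the code. The source does not describe how /sdd:check detects drift, so everything here is hypothetical: the `Finding` type, the `check_routes` helper, the "high" and "low" severity labels, and the sample routes.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str  # hypothetical levels: "high" or "low"
    message: str


def check_routes(spec_routes: set[str], code_routes: set[str]) -> dict[str, list[Finding]]:
    """Compare two route sets and return findings grouped by severity.

    Pure function: it reads its inputs and writes nothing back.
    """
    findings = []
    for route in sorted(spec_routes - code_routes):
        findings.append(Finding("high", f"§I lists {route} but the code does not define it"))
    for route in sorted(code_routes - spec_routes):
        findings.append(Finding("low", f"code defines {route} but §I does not list it"))

    grouped: dict[str, list[Finding]] = defaultdict(list)
    for finding in findings:
        grouped[finding.severity].append(finding)
    return dict(grouped)


# Hypothetical inputs: routes declared in the spec vs. routes found in the code.
report = check_routes(
    spec_routes={"POST /send", "GET /health"},
    code_routes={"GET /health", "GET /debug"},
)
for severity, items in report.items():
    for item in items:
        print(severity, item.message)
```

Returning a plain report rather than patching anything mirrors the "never writes" property described for the command.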
The piece I rely on most is backprop. On a test or build failure, /sdd:build doesn’t just retry. It traces the cause, decides whether the failure class needs a new §V invariant, appends a §B row, and ships a failing test alongside the fix in one commit. Every incident becomes a permanent guard rail. The same bug can’t bite twice, because the invariant catches it on the next plan.
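The bookkeeping half of that loop can be sketched in a few lines of Python. This is an assumption-heavy model, not the plugin's implementation: the in-memory `Spec` class, the `backprop` function, the `B<n> | date | root cause | invariant` row layout, and the sample incident are all invented here. Tracing the cause and writing the failing test are left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Spec:
    """Tiny in-memory stand-in for the §V and §B sections (hypothetical)."""
    invariants: dict[str, str] = field(default_factory=dict)  # "V12" -> rule text
    bugs: list[str] = field(default_factory=list)             # pipe-delimited rows


def backprop(spec: Spec, date: str, root_cause: str,
             new_invariant: str | None = None) -> str | None:
    """Record a failure: optionally add an invariant, always append a bug row.

    Returns the id of the invariant that was added, or None if none was needed.
    """
    invariant_id = None
    if new_invariant is not None:
        # Number the new invariant after the highest existing one.
        next_number = 1 + max((int(key[1:]) for key in spec.invariants), default=0)
        invariant_id = f"V{next_number}"
        spec.invariants[invariant_id] = new_invariant
    bug_id = f"B{len(spec.bugs) + 1}"
    spec.bugs.append(" | ".join([bug_id, date, root_cause, invariant_id or "-"]))
    return invariant_id


# Hypothetical incident, for illustration only.
spec = Spec(invariants={"V12": "∀ req → auth check before handler"})
new_id = backprop(
    spec,
    date="2026-01-01",
    root_cause="handler ran before auth on retry path",
    new_invariant="∀ retry → re-run auth check",
)
print(new_id, spec.bugs[0])
```

Because the new invariant lands in the same structure the next plan reads, a later task that touches the retry path would be planned against it.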
What I gave up
pilot-spec is slower. I am the bottleneck most of the time, because the spec edit, the design loop, and the single-thread build all run through me. There is no parallel work happening in the...