You Can't Subtract the Model

2026-06-23

I wanted to know how much of the writing was me.

A document came out of my pipeline that a careful reader called genuinely good. The pipeline wraps a year of accumulated corrections, a voice doctrine, and a chain of agents around a base model, so the praise was ambiguous. Was the document good because the model is good, or because of everything I had built on top of it? I could not tell by looking, so I ran the experiment that would tell me: hold the model fixed, vary only the scaffolding, and measure.

One task: the opening of a field manual for a particular role. Four versions. The bare model with a thin prompt. The bare model with a rich, specific prompt. That prompt plus my writing doctrine. The doctrine plus the full pipeline, where a draft is handed to an adversarial critic and revised. Five judges scored each version out of ten, blind to which was which, told to hunt the tells of machine prose. The numbers came back as a clean decomposition.
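A blind scoring round of this shape can be sketched in a few lines. This is a minimal illustration, not the harness used for the experiment: `blind_scores`, the `versions` mapping, and the judge callables are all hypothetical names, and each judge is assumed to be a function that takes only the text and returns a score out of ten.

```python
import random
from statistics import mean


def blind_scores(versions, judges, seed=0):
    """Average each version's score across judges who never see the labels.

    versions: dict mapping a label (hypothetical, e.g. "bare") to its text.
    judges:   list of callables, text -> numeric score (assumed interface).
    """
    rng = random.Random(seed)
    labels = list(versions)
    collected = {label: [] for label in labels}
    for judge in judges:
        order = labels[:]
        rng.shuffle(order)  # each judge reads the versions in a shuffled order
        for label in order:
            # The judge receives the text only; the label stays with the harness.
            collected[label].append(judge(versions[label]))
    return {label: mean(scores) for label, scores in collected.items()}
```

The label is kept on the harness side so a judge cannot score the name of a condition instead of the prose; shuffling the reading order per judge is an extra precaution assumed here, not something the note describes.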

The floor is the first surprise. The bare model with a thin prompt scored 3.8, and the judges flagged nineteen separate tells. The kind of bad matters: fluent, confident, structurally sound, and forgettable. It opened with a chain of negations — you are not support, you are not implementation, you are not the person who waits. That is the model's first reach when asked to sound profound. Prose like that fails worse than incompetence, because incompetence at least announces itself.

Each layer of scaffolding walked the score up a rung: 3.8 bare, 5.8 with the prompt, 7.8 with the doctrine, 8.2 with the full pipeline. The lifts came in uneven sizes, and the sizes are the finding. The rich prompt moved the overall score by two points, almost entirely through relevance: specificity jumped from 5.2 to 7.8. The tells barely moved, nineteen to seventeen. A good prompt changes what the model writes about and leaves how it writes almost untouched. The doctrine moved the score another two points, but bought different goods. Honesty went from 2.8 to 8.8, the largest single move in the table. Every judge, independently, said the high-scoring versions named where the claim breaks while the low-scoring versions sold. And the tell count collapsed, seventeen to six.
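The arithmetic behind those lifts is small enough to check directly. The sketch below uses only the four overall scores reported above; the function name `marginal_lifts` and the layer labels are illustrative, and the lifts are marginal in the order the layers were stacked, so they should not be read as independent contributions.

```python
# Overall scores reported in the note, in the order the layers were added.
scores = [
    ("bare model, thin prompt", 3.8),
    ("rich prompt", 5.8),
    ("doctrine", 7.8),
    ("full pipeline", 8.2),
]


def marginal_lifts(scores):
    """Return (layer, lift) pairs: each layer's score minus the one before it."""
    lifts = []
    for (_, prev), (name, cur) in zip(scores, scores[1:]):
        # Round to one decimal to hide floating-point noise in the subtraction.
        lifts.append((name, round(cur - prev, 1)))
    return lifts


lifts = marginal_lifts(scores)
total = round(scores[-1][1] - scores[0][1], 1)  # lift from floor to full stack

for name, lift in lifts:
    print(f"{name}: +{lift} ({lift / total:.0%} of the total lift)")
```

Run as written, the prompt and the doctrine each account for 2.0 of the 4.4 points gained, and the pipeline for the remaining 0.4.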

That collapse tells you what the doctrine actually is. The doctrine is mostly a list of things to refuse: the negation chains, the manufactured profundity, the reflexive hype. I had assumed those were rare slips. They are the defaults — what the model produces when nothing stops it, which is why a year of corrections reads almost entirely as prohibitions. The doctrine works less like a teacher than an interceptor: it catches what the model writes first and refuses the worst of it.

The pipeline, the most elaborate piece, added the least: four-tenths of a point. The adversarial critic and the revision loop, on a 700-word opening, barely beat one disciplined pass through the doctrine. This is not a verdict against the pipeline. Its value shows up at scale: holding twenty sections consistent, running research broadly, and catching the errors that a single pass misses over a long document. On a short opening there is not enough length for that value to appear. The lesson is narrower and more useful: match the machinery to the job, because the expensive stage earns its cost on long jobs and wastes it on short ones.

You cannot subtract the model. It is the floor and the ceiling at once, the source of all the fluency and structure and recall in every version. Nothing I added created a new capability. The prompt, the doctrine, the pipeline all redirected a capability that was already there. The scaffolding is a steering layer on a fixed engine, and the steering matters most exactly where the engine is weakest, which turned out to be honesty and restraint.

So the repo is a memory. Every time the model's confident default turned out wrong, the correction got written down as a constraint, and the constraints accumulated into a voice. This is the same move good writing makes on itself: notice where the convenient sentence lies, and refuse it. The doctrine works because it encodes, in advance, the thousand small refusals a careful writer performs in the...
