Bigger context windows are the wrong abstraction for coding agents


Large context windows are useful. I use them. They let a model hold more files, more logs, more discussion, and more intermediate state before it has to summarize or forget. But for coding agents, context size is often mistaken for continuity.

Continuity is different. Continuity means the system knows what happened before this prompt. It knows which findings were real, which ones were dismissed, which conventions the team corrected, which files tend to move together, which dead ends should not be explored again, and which assumptions have already been proven wrong.

A larger window can carry more text. It does not decide what is worth remembering. It does not know what to trust. It does not turn a correction from last week into a constraint on today's review.

Context is not memory.

Most coding agents are context-native or tool-native. A context-native agent works by packing the right files into the prompt. A tool-native agent can search, grep, inspect symbols, and call external systems. Both are important, but both still tend to treat every task like a fresh investigation.

Sigilix is pushing toward a different shape: memory-native coding agents. In that model, the agent does not merely fetch context when asked. It works against a persistent repo backing layer that is updated by reviews, dismissals, comments, fixes, issue triage, and agent sessions. Every interaction can leave behind evidence that future interactions can use.

That does not mean keeping everything forever. Most interaction data is noise after a few days. The useful parts are decisions, corrections, task state, conventions, dependency relationships, and proof. The memory layer has to be selective, or it becomes another pile of context to drown in.

Diagram 01: More context carries text. Memory carries constraints.

Context-window agent: diff, nearby files, search hits, chat history.

Model reconstructs the repo. It must infer relevance, trust, conventions, and prior corrections from raw context in the current run.

Memory-native agent: code graph, trust ledger, team conventions, evidence receipts.

Model reasons over prepared substrate. Prior decisions constrain the task before the model starts spending tokens on the current problem.
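As a rough illustration of what "selective" could mean for a memory layer, here is a minimal sketch in Python. It is not Sigilix's implementation. The record kinds mirror the article's list of useful parts, while the `MemoryRecord` type, the `prune` function, the three-day window, and the sample records are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Kinds the article names as worth keeping (the identifiers are hypothetical).
DURABLE_KINDS = {
    "decision", "correction", "task_state",
    "convention", "dependency", "proof",
}

# Assumed cutoff for everything else ("noise after a few days").
NOISE_TTL = timedelta(days=3)


@dataclass(frozen=True)
class MemoryRecord:
    kind: str             # e.g. "correction" or "chat_message"
    summary: str          # short human-readable description
    created_at: datetime  # when the record was written


def prune(records, now):
    """Keep durable kinds indefinitely; drop other records older than NOISE_TTL."""
    return [
        r for r in records
        if r.kind in DURABLE_KINDS or now - r.created_at <= NOISE_TTL
    ]


# Illustrative data only.
now = datetime(2025, 1, 10)
records = [
    MemoryRecord("correction", "Team rejected the proposed wrapper", datetime(2025, 1, 1)),
    MemoryRecord("chat_message", "ok, looking now", datetime(2025, 1, 1)),
    MemoryRecord("chat_message", "rerunning tests", datetime(2025, 1, 9)),
]
kept = prune(records, now)
print([r.kind for r in kept])  # ['correction', 'chat_message']
```

The nine-day-old correction survives because its kind is durable, the equally old chat message is dropped, and the recent chat message is kept until it ages out. A real layer would likely need richer rules than a single cutoff, but the shape is the same: retention depends on what a record is, not only on how recent it is.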

The difference is not the amount of text the model can see. It is whether prior decisions, corrections, and proof become part of the next task.

Retrieval is not enough either.

The obvious objection is that good retrieval should solve this. Index the repo, build a graph, find the relevant files, and put them in front of the model. That is a real improvement over pasting a diff into a chat window. It is also not the same thing as memory.

Retrieval answers the question: what text might be relevant right now? Memory answers a different question: what has this repo already taught us that should constrain the answer? Those are not interchangeable.

A search result can show the model the current implementation. It will not, by itself, tell the model that this team already rejected a proposed pattern three reviews ago, or that a finding which looks suspicious was previously proven to be a false positive, or that a weird local convention exists because production depends on it.

This is why a coding agent can have great retrieval and still feel forgetful. It can find the right file and still ask the same question again. It can read the same helper and still propose the same wrong abstraction. It can inspect the same diff and still fail to carry forward the human correction that made the last review useful.

The loop changes when memory is native.

In a context-first loop, the agent starts with the prompt, gathers files, reasons, and emits an answer. If the answer is wrong, the user corrects it. In many products, that correction is just part of the chat history. It might survive for the session. It might be summarized. It usually does not become a durable constraint on the next agent run.

In a memory-native loop, the correction is not just conversation. It is a signal. A dismissal, an accepted fix, a review reply, a merged PR, a triaged issue, and a failed hypothesis can all update the backing layer. The next time an agent touches the same surface, it should not begin from zero. It should inherit the repo's learned shape: what matters, what was already checked, what the team prefers, and which claims need proof before anyone should believe them.

That changes the model's job. The model is still reasoning, but it is no longer responsible for reconstructing all of the institutional memory from raw text every time. It can spend its capacity on the current problem because the system around it is carrying the durable parts.

Diagram 02: A correction becomes architecture, not chat history.

Work happens. PRs, issues, Slack threads, reviews, and agent sessions.

Humans correct it. Dismissals, accepted fixes, and replies separate signal from noise.

Proof is attached. Durable...
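To make the memory-native loop concrete, the following Python sketch shows a correction being recorded as a signal and then inherited by a later run that touches the same surface. It is a toy under stated assumptions, not a description of any real product: `Signal`, `BackingLayer`, `build_prompt`, the file path, and the note text are all hypothetical, and the "persistent" layer is just an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Signal:
    kind: str     # e.g. "dismissal", "accepted_fix", "review_reply"
    surface: str  # file or symbol the signal is about (hypothetical key)
    note: str     # what the human decided or corrected


@dataclass
class BackingLayer:
    """Toy stand-in for a persistent repo backing layer (in memory here)."""
    by_surface: dict = field(default_factory=dict)

    def record(self, signal):
        # A correction is stored against the surface it concerns,
        # instead of living only in one session's chat history.
        self.by_surface.setdefault(signal.surface, []).append(signal)

    def constraints_for(self, surfaces):
        """Return prior signals for the surfaces a new task touches."""
        found = []
        for surface in surfaces:
            found.extend(self.by_surface.get(surface, []))
        return found


def build_prompt(task, retrieved_text, constraints):
    """Retrieval supplies current text; memory supplies prior constraints."""
    lines = [f"Task: {task}", "Constraints inherited from earlier work:"]
    if constraints:
        lines += [f"- [{c.kind}] {c.surface}: {c.note}" for c in constraints]
    else:
        lines.append("- (none recorded)")
    lines += ["Retrieved code:", retrieved_text]
    return "\n".join(lines)


# Run 1: a human dismisses a finding (illustrative data).
layer = BackingLayer()
layer.record(Signal("dismissal", "src/payments.py",
                    "Flagged pattern was checked before and is a false positive"))

# Run 2: a later task on the same surface starts with that constraint attached.
prompt = build_prompt(
    "Review changes to src/payments.py",
    "<current file contents>",
    layer.constraints_for(["src/payments.py"]),
)
print(prompt)
```

The design point is the split in `build_prompt`: the retrieved text answers "what might be relevant right now", while the constraints answer "what has this repo already taught us". Swapping the dictionary for durable storage would not change that division of labor.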
