Why MCP Is the Wrong Abstraction for Memory | Tenure Mobile VSCode VSCodium Teams Overview Shared Memory AI Governance EU AI Act Compliance Docs GitHub Writing Benchmark Paper Install Free<br>Writing › MCP and memory<br>Architecture Why MCP is the wrong abstraction for memory
MCP treats memory as a tool call. That's the wrong model.<br>Memory isn't something you invoke. It's something that should already be there.
Tenure research · ~8 min read
TL;DR<br>MCP is a tool protocol. Memory is not a tool.<br>Tool calls are explicit, discrete, and optional. Memory is implicit, continuous, and always relevant.<br>When memory requires an invocation, it becomes a second-class input; loaded too late, shaped by the wrong query, and absent when the model doesn't know to ask.<br>The correct layer for memory is the proxy: present before the first token, invisible to both the model and the user, and structurally separate from the AI client.
The premise MCP is a good protocol for the wrong job
The Model Context Protocol is well-designed for what it was built to do: give a model structured,<br>on-demand access to external systems. Fetch a file. Query a database. Create a calendar event.<br>These are discrete actions that should happen when explicitly requested, and MCP handles them well.
Memory is not that. Memory is not a resource you fetch on request. It is context that should<br>be present before the conversation starts, without the model or the user having to think about it.<br>When a developer tells their AI assistant that the auth service uses Redis for sessions, they are not<br>creating a tool result. They are establishing a belief that should shape every future response<br>about that system, across every future session, without any explicit retrieval step.
The distinction matters more than it appears. A tool call is optional input. Memory is<br>structural context. Treating one as the other produces a system that technically works but<br>behaviorally fails in ways that are hard to articulate and harder to fix.
A tool call is optional input. Memory is structural context.<br>The difference is not cosmetic. It determines when the information arrives,<br>how it is weighted, and whether the model knows to use it.
The retrieval problem Tool-based memory retrieves at the wrong time
When memory is implemented as an MCP tool, retrieval happens inside the conversation turn.<br>The model receives a user message, decides (or is instructed) to call the memory tool,<br>gets results back, and then formulates a response incorporating those results.
This creates three concrete failure modes that do not exist when memory is injected at the proxy layer.
01<br>Late arrival
The model has already begun processing the user's message before memory is consulted.<br>In practice, the system prompt and user message establish a reasoning trajectory.<br>Memory retrieved mid-turn is appended context, not foundational context.<br>It does not shape the initial interpretation of the query.
02<br>Query dependency
The model must construct a query to retrieve relevant memory. That query is derived<br>from the current message, which means memory retrieval is only as good as the<br>model's ability to predict what context it needs before it has that context.<br>This is circular in ways that matter at the edges.
03<br>Discretionary skipping
Tool calls are optional. A model that has been given a memory tool will call it<br>inconsistently. Faster responses, shorter system prompts, ambiguous instructions,<br>and temperature variation all affect whether the tool is invoked. Memory that is<br>sometimes consulted is not memory the system can be said to have.
The architecture argument Memory belongs at the proxy layer, not the tool layer
An AI client sends a request to a provider. That request contains a system prompt and<br>a conversation history. Before any of that reaches the model, it passes through the network.
A proxy that sits in that path can intercept the outbound request, extract beliefs from<br>the conversation history, retrieve relevant context from a local store, and inject that<br>context into the system prompt before the request continues. From the model's perspective,<br>the context was always there. From the user's perspective, nothing happened.<br>There was no tool call. No retrieval step. No prompt modification visible to either party.
This is not a minor architectural variation on MCP-based memory. It is a different model<br>of what memory is. The proxy treats memory as ambient context. MCP treats memory as a<br>resource. These produce different systems with different failure modes, different<br>precision properties, and different relationships to the conversation.
Without a memory proxy<br>User message<br>Model<br>MCP memory tool call<br>Model resumes
Memory arrives after initial processing. Query is model-generated. Invocation is discretionary.<br>With a memory proxy<br>User message<br>Proxy injects context<br>Model sees enriched prompt
Memory is structural. Arrives before first token. No tool call. No discretion.
The precision argument Retrieval precision...