Why MCP is the wrong abstraction for memory

Why MCP Is the Wrong Abstraction for Memory | Tenure Mobile VSCode VSCodium Teams Overview Shared Memory AI Governance EU AI Act Compliance Docs GitHub Writing Benchmark Paper Install Free Writing › MCP and memory Architecture Why MCP is the wrong abstraction for memory

MCP treats memory as a tool call. That's the wrong model. Memory isn't something you invoke. It's something that should already be there.

Tenure research · ~8 min read

TL;DR MCP is a tool protocol. Memory is not a tool. Tool calls are explicit, discrete, and optional. Memory is implicit, continuous, and always relevant. When memory requires an invocation, it becomes a second-class input; loaded too late, shaped by the wrong query, and absent when the model doesn't know to ask. The correct layer for memory is the proxy: present before the first token, invisible to both the model and the user, and structurally separate from the AI client.

The premise MCP is a good protocol for the wrong job

The Model Context Protocol is well-designed for what it was built to do: give a model structured, on-demand access to external systems. Fetch a file. Query a database. Create a calendar event. These are discrete actions that should happen when explicitly requested, and MCP handles them well.

Memory is not that. Memory is not a resource you fetch on request. It is context that should be present before the conversation starts, without the model or the user having to think about it. When a developer tells their AI assistant that the auth service uses Redis for sessions, they are not creating a tool result. They are establishing a belief that should shape every future response about that system, across every future session, without any explicit retrieval step.

The distinction matters more than it appears. A tool call is optional input. Memory is structural context. Treating one as the other produces a system that technically works but behaviorally fails in ways that are hard to articulate and harder to fix.

A tool call is optional input. Memory is structural context. The difference is not cosmetic. It determines when the information arrives, how it is weighted, and whether the model knows to use it.

The retrieval problem Tool-based memory retrieves at the wrong time

When memory is implemented as an MCP tool, retrieval happens inside the conversation turn. The model receives a user message, decides (or is instructed) to call the memory tool, gets results back, and then formulates a response incorporating those results.

This creates three concrete failure modes that do not exist when memory is injected at the proxy layer.

01 Late arrival

The model has already begun processing the user's message before memory is consulted. In practice, the system prompt and user message establish a reasoning trajectory. Memory retrieved mid-turn is appended context, not foundational context. It does not shape the initial interpretation of the query.

02 Query dependency

The model must construct a query to retrieve relevant memory. That query is derived from the current message, which means memory retrieval is only as good as the model's ability to predict what context it needs before it has that context. This is circular in ways that matter at the edges.

03 Discretionary skipping

Tool calls are optional. A model that has been given a memory tool will call it inconsistently. Faster responses, shorter system prompts, ambiguous instructions, and temperature variation all affect whether the tool is invoked. Memory that is sometimes consulted is not memory the system can be said to have.

The architecture argument Memory belongs at the proxy layer, not the tool layer

An AI client sends a request to a provider. That request contains a system prompt and a conversation history. Before any of that reaches the model, it passes through the network.

A proxy that sits in that path can intercept the outbound request, extract beliefs from the conversation history, retrieve relevant context from a local store, and inject that context into the system prompt before the request continues. From the model's perspective, the context was always there. From the user's perspective, nothing happened. There was no tool call. No retrieval step. No prompt modification visible to either party.

This is not a minor architectural variation on MCP-based memory. It is a different model of what memory is. The proxy treats memory as ambient context. MCP treats memory as a resource. These produce different systems with different failure modes, different precision properties, and different relationships to the conversation.

Without a memory proxy User message Model MCP memory tool call Model resumes

Memory arrives after initial processing. Query is model-generated. Invocation is discretionary. With a memory proxy User message Proxy injects context Model sees enriched prompt

Memory is structural. Arrives before first token. No tool call. No discretion.

The precision argument Retrieval precision...

Why MCP is the wrong abstraction for memory

Related Articles

The Newest Instagram "Exploit" Is the Goofiest I've Seen

Apple WWDC 2026 Livestream

Claude Fable 5

It's Not Just X. It's Y

Show HN: GoPeek – open links in live mini browser windows without new tabs