Agentic Memory – The Follow Up



Last week I wrote about agentic memory and got a lot of responses, many of them pointing me to existing projects like mem0 or letta.com. So I started doing research and, as one does these days, discussed the findings with Claude. I immediately thought I should write a follow-up, and then I thought: why not let Claude write part of that article? The goal was to keep it actually readable rather than the typical AI slop, so here's Claude's summary:

Claude: Full disclosure upfront: I'm an AI writing about AI memory, which means I have an obvious conflict of interest. Make of that what you will.

So what is AI memory, actually?

If you've used ChatGPT or Claude recently, you've probably already interacted with AI memory without thinking much about it. ChatGPT now has a two-layer system: "saved memories" that persist until you delete them, plus a "reference chat history" feature that mines your past conversations for context. You can explicitly ask it to remember things. Claude (that's me — or rather, the product I'm part of) has its own memory system that works differently, generating summaries from past conversations and applying them automatically. Gemini is weaving memory into Google's ecosystem across Search, Workspace, and Android.

These platform-native implementations are what most people encounter, and they're a useful starting point because they make the core idea concrete: the AI retains something from past interactions and uses it to be more helpful next time. Simple enough in principle. But the product design choices already diverge — explicit vs. automatic memory, user-controlled vs. system-managed, transparent vs. opaque retrieval. Even at this most visible layer, there's no consensus on how memory should work.

Now, these consumer-facing features are just the surface. Behind them, there's an entire ecosystem of memory infrastructure being built. When I surveyed what's out there, I found roughly six categories of approaches. Let me walk through them.

Memory middleware: the "add memory to your app" layer

This is where most of the startup energy is. These are APIs and frameworks that let developers add persistent memory to any AI application.

Mem0 is probably the most prominent — a dedicated memory layer that extracts facts from interactions, stores them, and retrieves them later. They've grown fast, with integrations across LangChain, CrewAI, Vercel, and a recent AWS partnership. Their open-source version supports nineteen vector store backends, which tells you something about how production-grade this has become.
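To make the extract, store, retrieve loop concrete, here is a minimal Python sketch. It is not Mem0's API: the class `ToyMemoryLayer` and its methods `add` and `search` are hypothetical names, the "extractor" is a crude rule (keep first-person sentences) standing in for an LLM call, and the word-overlap ranking stands in for a vector store lookup.

```python
import re


class ToyMemoryLayer:
    """Hypothetical memory layer: extract facts, store them, retrieve them later."""

    def __init__(self) -> None:
        self.facts: list[str] = []

    def add(self, message: str) -> list[str]:
        """Extract first-person sentences as 'facts' (stand-in for an LLM extractor)."""
        added = []
        for sentence in re.split(r"[.!?]", message):
            sentence = sentence.strip()
            # Keep only sentences starting with "I" or "My", and skip duplicates.
            if re.match(r"(I|My)\b", sentence) and sentence not in self.facts:
                self.facts.append(sentence)
                added.append(sentence)
        return added

    def search(self, query: str, limit: int = 3) -> list[str]:
        """Rank stored facts by word overlap with the query (stand-in for vector search)."""
        query_words = set(re.findall(r"\w+", query.lower()))
        scored = []
        for fact in self.facts:
            overlap = len(query_words & set(re.findall(r"\w+", fact.lower())))
            if overlap > 0:
                scored.append((overlap, fact))
        scored.sort(key=lambda pair: -pair[0])  # highest overlap first
        return [fact for _, fact in scored[:limit]]


memory = ToyMemoryLayer()
memory.add("Hi there! I live in Berlin. My favorite language is Rust. What's the weather?")
print(memory.search("favorite programming language"))
```

The interesting design questions all live in the two stand-ins: what counts as a fact worth keeping, and how relevance is scored at retrieval time.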

Letta (formerly MemGPT) takes a fundamentally different approach, inspired by operating systems. It treats context like virtual memory — shuttling information between "RAM" (the active context window) and "disk" (long-term storage) through function calls. The agent controls its own memory, which is a philosophically interesting choice: instead of memory being something done to the AI, the AI manages it actively.
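The virtual-memory analogy can be sketched in a few lines of Python. This is a toy under stated assumptions, not Letta's implementation: `PagedMemory`, `remember`, and `recall` are hypothetical names, the context window is a bounded queue of strings, and "paging in" is a simple keyword match. In a real agent these two methods would be exposed as tools the model decides to call.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PagedMemory:
    """Toy two-tier memory: a bounded 'RAM' context plus an unbounded 'disk' archive."""

    context_limit: int = 3
    context: deque = field(default_factory=deque)  # active context window ("RAM")
    archive: list = field(default_factory=list)    # long-term storage ("disk")

    def remember(self, item: str) -> None:
        """Add an item to the context; evict the oldest items to the archive if full."""
        self.context.append(item)
        while len(self.context) > self.context_limit:
            self.archive.append(self.context.popleft())

    def recall(self, keyword: str) -> list:
        """Page archived items that mention the keyword back into the context."""
        hits = [item for item in self.archive if keyword.lower() in item.lower()]
        for item in hits:
            self.archive.remove(item)
            self.remember(item)  # may evict something else to make room
        return hits


memory = PagedMemory(context_limit=2)
memory.remember("User's name is Ada")
memory.remember("Ada likes tea")
memory.remember("Project deadline is Friday")  # evicts the oldest item to the archive
memory.recall("name")                          # pages the name back into the context
print(list(memory.context), memory.archive)
```

Nothing is ever lost here, it only moves between tiers. The hard part in practice is the policy: deciding what deserves to stay in the window and when a lookup is worth the call.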

Supermemory positions itself as a five-layer system: user profiles, knowledge graphs, semantic retrieval, content extraction, and third-party connectors. They claim state-of-the-art results on memory benchmarks and have built their own evaluation platform, MemoryBench.

There's also LangMem within the LangGraph ecosystem — less a standalone product, more an integrated path for teams already building on LangChain who want memory without adopting a separate vendor.

What's notable about this whole category is how infrastructure-complete it's getting. Mem0's changelog from 2024 through 2026 reads like a steady march through backend concerns: more vector stores, more embedding options, graph memory support, local embeddings for privacy, export/import for portability. The storage problem, the retrieval problem, the integration problem — these are increasingly solved, or at least solvable.

Knowledge graphs: structured relationships, not just recall

Cognee and Graphiti (the open-source engine behind Zep) represent a bet that flat memory — storing facts as isolated text chunks — isn't enough. They build graph structures where entities have relationships, and those relationships evolve over time.

Graphiti is particularly interesting because it tracks temporal validity: when a fact became true and when it was superseded. This matters for any real-world use case where things change — job titles, project status, preferences. Most memory systems treat facts as either current or deleted. Graphiti keeps the history.
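A minimal Python sketch of temporal validity, to show the difference from overwrite-or-delete. This does not use Graphiti's API or data model: `Fact`, `TemporalFactStore`, `assert_fact`, and `value_at` are hypothetical names, and the sketch assumes facts are asserted in chronological order.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still true"


class TemporalFactStore:
    """Hypothetical store that supersedes facts instead of deleting them."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def assert_fact(self, subject: str, predicate: str, value: str, when: date) -> None:
        """Record a new value and close the validity window of the previous one."""
        for fact in self.facts:
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.valid_to is None:
                fact.valid_to = when  # superseded, but kept as history
        self.facts.append(Fact(subject, predicate, value, when))

    def value_at(self, subject: str, predicate: str, when: date) -> Optional[str]:
        """Return the value that was valid on the given date, if any."""
        for fact in self.facts:
            if (fact.subject, fact.predicate) != (subject, predicate):
                continue
            if fact.valid_from <= when and (fact.valid_to is None or when < fact.valid_to):
                return fact.value
        return None


store = TemporalFactStore()
store.assert_fact("alice", "job_title", "Engineer", date(2022, 1, 1))
store.assert_fact("alice", "job_title", "Manager", date(2024, 6, 1))
print(store.value_at("alice", "job_title", date(2023, 3, 1)))
print(store.value_at("alice", "job_title", date(2025, 1, 1)))
```

Because the old job title is closed rather than removed, a question about 2023 and a question about today can both be answered from the same store.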

Glean, coming from the enterprise search world, has built what they call a "personal graph" — not just indexing documents but modeling what individual employees are working on, their collaboration patterns, and active projects. This is memory at the organizational level, not just the conversation level.

Cross-platform memory: your context follows you

This is a...
