agent memory: an anatomy
every agent memory library uses the same words: episodic, semantic, sometimes procedural. they’re cognitive science’s vocabulary, lifted into the API. the engineering often isn’t lifted with them. a library can have a procedural field that uses the same storage and retrieval as semantic: a label, not a separate system. the deeper slip is the word memory itself: most of what these libraries build is narrower than that, and the narrower term sharpens the problem.
the terminology comes from a 1972 chapter by Endel Tulving.1 he argued that what people had been treating as one thing — memory — was at least two: memory for events (what happened, where, when), and memory for facts (the capital of France, water’s boiling point). he called them episodic and semantic. they behave differently and they fail differently.
most of what these libraries call “memory” is narrower than the word suggests: not a full cognitive memory system, but autobiographical content about the user held on the user’s behalf — where they live, what they’re working on, what they’ve decided.
the anatomy of an agent memory system
an agent memory library is built from a small number of components. you can read any library’s docs by knowing the parts.
the extractor. the thing that reads conversation transcripts and decides what to keep. usually an LLM call, sometimes with a strict prompt or a typed output schema. it produces statements — short, abstracted facts about the user, the world, or the task.
the most consequential choice an extractor makes is timing. extract eagerly, after every message, and you spend tokens on small talk that goes nowhere. extract lazily, at the end of a session, and the context you needed to resolve a pronoun is already gone. neither timing is wrong; each loses something the other keeps. the question worth asking of any library is what gets thrown away: coreference cues (which “he” refers to which person), temporal anchors (“yesterday,” “next week”), and disambiguating local context are common casualties. extraction is, in cognitive terms, a compression from situated event to decontextualized fact: “user mentioned over coffee on Tuesday that they prefer TypeScript” becomes “user prefers TypeScript.” how aggressively a library compresses is one of its central design decisions.
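a minimal sketch of the two timings, to make the cost side concrete. everything here is hypothetical: `Statement`, `CountingExtractor`, `run_eager` and `run_lazy` are illustrative names, not any library’s API, and the “extractor” is a keyword match standing in for an LLM call. it does no real abstraction; it only counts how often it is called and how many words it reads.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Statement:
    text: str              # the kept fact (a real extractor would abstract it)
    observed_at: datetime  # when the extractor ran
    source_index: int      # pointer back to the message it came from


class CountingExtractor:
    """Stand-in for an LLM call: keeps messages containing 'prefer'
    and tracks how much work it was asked to do."""

    def __init__(self, now: datetime):
        self.now = now
        self.calls = 0
        self.words_read = 0

    def __call__(self, transcript: List[str]) -> List[Statement]:
        self.calls += 1
        self.words_read += sum(len(m.split()) for m in transcript)
        return [
            Statement(text=m.strip(), observed_at=self.now, source_index=i)
            for i, m in enumerate(transcript)
            if "prefer" in m.lower()
        ]


def run_eager(messages: List[str], extractor) -> List[Statement]:
    """Extract after every message, passing the transcript so far."""
    seen = {}
    for end in range(1, len(messages) + 1):
        for s in extractor(messages[:end]):
            seen[s.source_index] = s  # de-duplicate by source message
    return list(seen.values())


def run_lazy(messages: List[str], extractor) -> List[Statement]:
    """Extract once, at the end of the session."""
    return extractor(messages)
```

on the same transcript both drivers can return the same statements while the eager one makes one call per message and re-reads earlier text each time. what the sketch cannot show is the other half of the tradeoff: what a real extractor loses when it runs late.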
the store. the database. one or more of: a vector index (entries indexed by semantic similarity), a relational table (entries indexed by columns you can filter on), a knowledge graph (entries connected by typed edges). each statement carries metadata — a timestamp, sometimes a confidence score, sometimes a source pointer back to the original conversation.
the hardest question a store answers isn’t where to put things. it’s what to do when a new statement contradicts an old one. the user lived in Paris until April, then moved to Amsterdam — and the store now has both, each presenting as current. the choice is whether to
overwrite (one truth, no history)
append (both, leave it to retrieval to sort out)
keep both with the old marked as superseded.
a store that can’t answer “what did I believe last month?” isn’t a memory system. it’s a snapshot with a timestamp on it.
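the third option can be sketched in a few lines. this is an illustration under assumptions, not a description of how any particular library does it: `Fact`, `SupersedingStore`, `assert_fact` and `believed_on` are made-up names, each attribute is assumed to hold one value at a time, and the dates in the usage note are invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    believed_from: date                    # when the store first recorded it
    superseded_on: Optional[date] = None   # None means still believed


class SupersedingStore:
    """Keeps every fact; a contradicting fact closes the old one
    instead of overwriting it."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def assert_fact(self, subject: str, attribute: str, value: str, on: date) -> None:
        for f in self.facts:
            same_slot = (f.subject, f.attribute) == (subject, attribute)
            if same_slot and f.superseded_on is None:
                if f.value == value:
                    return               # already believed; nothing to record
                f.superseded_on = on     # contradiction: close the old belief
        self.facts.append(Fact(subject, attribute, value, believed_from=on))

    def believed_on(self, subject: str, attribute: str, day: date) -> Optional[str]:
        """What did the store believe about this attribute on `day`?"""
        for f in self.facts:
            if (f.subject, f.attribute) != (subject, attribute):
                continue
            still_open = f.superseded_on is None or day < f.superseded_on
            if f.believed_from <= day and still_open:
                return f.value
        return None
```

record Paris on a hypothetical 2025-01-10 and Amsterdam on 2025-04-02, and `believed_on` returns Paris for a day in March and Amsterdam for a day in May. both rows are still in the store; only one presents as current.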
the retriever. at query time, this component turns the current question into a search and returns the statements most likely to be relevant. vector similarity is the baseline. keyword search on top of that is common. a reranker is the standard third layer. structurally this is RAG; the corpus is the user’s accumulated statements rather than a document library. some libraries also run a time filter (don’t return statements known to be out of date) and a presupposition check — detect when the question itself assumes a stale fact and block it from being pulled into context.
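a toy version of the first two layers plus the time filter, to show how the pieces compose. assumptions are loud here: word-count vectors stand in for real embeddings, the reranker and the presupposition check are left out, and `Entry`, `retrieve`, `alpha` and the `superseded` flag are hypothetical names chosen for the sketch.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    text: str
    superseded: bool = False  # set by the store when a newer fact replaces it


def _vec(text: str) -> Counter:
    # bag-of-words counts; a real system would use an embedding model
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def keyword_overlap(query: str, text: str) -> float:
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q) if q else 0.0


def retrieve(query: str, entries: List[Entry], k: int = 3, alpha: float = 0.5) -> List[Entry]:
    # time filter: drop statements known to be out of date
    live = [e for e in entries if not e.superseded]
    qv = _vec(query)
    scored = [
        (alpha * cosine(qv, _vec(e.text)) + (1 - alpha) * keyword_overlap(query, e.text), e)
        for e in live
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [e for score, e in scored[:k] if score > 0]
```

the filter runs before scoring on purpose: a superseded statement about Paris can be the closest match to a question about where the user lives, so ranking alone will not keep it out of context.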
every difference between agent memory libraries lives in one of these three parts. you can describe any library in terms of them without yet knowing what it’s for.
the kinds of memory
cognitive science’s canonical taxonomy consists of four kinds: episodic, semantic, procedural, and working. working memory in agents is the context window — a different machine from the one this post is about, worth setting aside.2 that leaves three. add prospective — it isn’t in the canonical taxonomy, but it names a gap the field hasn’t filled.
episodic memory. specific events tied to a time and place. “I had coffee with Aleksandra last Tuesday at the place on Mostowa.” the memory is dated, situated, and personal. you experienced it. recall feels like re-experiencing: you can place yourself back in the scene.
agent memory libraries handle this with a table of timestamped statements. “user mentioned they live in Berlin (2026-03-14).” each entry is a single event the system observed. some libraries keep the raw conversation episode alongside the extracted facts.
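what such a table might look like, sketched with Python’s built-in `sqlite3`. the schema, the column names, and the raw transcript line are all assumptions made for illustration; only the Berlin statement and its date come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE episodes (
        id INTEGER PRIMARY KEY,
        raw_transcript TEXT NOT NULL          -- the conversation as it happened
    );
    CREATE TABLE statements (
        id INTEGER PRIMARY KEY,
        text TEXT NOT NULL,                   -- the extracted fact
        observed_on TEXT NOT NULL,            -- ISO date, so string order is date order
        episode_id INTEGER REFERENCES episodes(id)
    );
    """
)

# hypothetical raw episode, kept alongside the statement extracted from it
cur = conn.execute(
    "INSERT INTO episodes (raw_transcript) VALUES (?)",
    ("user: I live in Berlin now.",),
)
conn.execute(
    "INSERT INTO statements (text, observed_on, episode_id) VALUES (?, ?, ?)",
    ("user mentioned they live in Berlin", "2026-03-14", cur.lastrowid),
)


def statements_between(conn, start: str, end: str):
    """Return (text, observed_on, raw_transcript) rows observed in [start, end]."""
    return conn.execute(
        """
        SELECT s.text, s.observed_on, e.raw_transcript
        FROM statements AS s
        LEFT JOIN episodes AS e ON e.id = s.episode_id
        WHERE s.observed_on BETWEEN ? AND ?
        ORDER BY s.observed_on
        """,
        (start, end),
    ).fetchall()
```

the join is the point: a statement with its episode attached can be traced back to what was said, while a statement alone is the compressed fact and nothing else.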
semantic memory. facts about the world that aren’t tied to any specific event. Berlin is the capital of Germany. the boiling point of water is...