AI Agent Craves Curation. Here's the Fademem Memory Architecture

Your AI Agent Craves Curation. Here’s the FADEMEM Memory Architecture That Delivers It.

Vektor Memory | Jun, 2026 | 8 min read

You have explained your tech stack to your coding agent four times this month. You mentioned your preferred approach to a problem in January, and your agent has no idea it ever happened.

You corrected a decision last week and the old version is still surfacing. You set up context at the start of every session because there is nowhere for it to go at the end.

This is not a problem with any one model: GPT-4, Claude, and Gemini all share the same limitation. The model itself is stateless. Even with their built-in memory features, every session starts from zero unless you have the infrastructure to persist what matters and surface it at the right moment. That memory infrastructure is what most developers do not have.

VEKTOR Slipstream v1.6.3 is a local-first memory SDK for AI agents. This release adds the layer most memory systems skip: curation. It does not just store what you tell it; it manages what should still be there months later.

What you actually get

Before the architecture, here is what changes for you as a developer embedding this SDK.

Every AI memory system forces decisions you didn’t realise you were making. Where does your agent’s context actually live: on your machine or on someone else’s server? Are you paying per token every time your agent processes a memory, or does that happen locally? When you connect your GitHub, your calendar, your files, where does all that data go, and who can see it? Most memory systems answer all four questions for you, quietly, in their terms of service.

VEKTOR’s answer to all four is the same: your machine, your data, your rules. Memory lives in a single SQLite file you own. Embeddings run locally on CPU, with no API calls, no per-token cost, and no data leaving the process. MCP connectors spawn as local stdio processes; nothing is routed through an external service. There is no telemetry, no cloud sync, no account required. If you want to understand exactly what your agent knows about you, you open the database with any SQLite browser and read it. That is what local-first actually means.

Your agent stops asking you to repeat yourself. Decisions, preferences, project context, and personal facts persist across sessions and surface when relevant without being re-explained. A context you registered in January is still there in June, if it is still relevant. If it is not, it has faded and stopped competing with what is actually current.

Your agent stops surfacing contradictions. When you update a fact, the old version does not linger as an equally valid memory. The conflict resolver determines which one wins based on source trust and recency, and the loser is quietly retired rather than deleted: preserved for audit but excluded from recall.

Your agent’s memory stays a manageable size. Without active management, memory graphs grow indefinitely. Every new project adds nodes that never leave. v1.6.3 introduces per-source budgets, automatic decay, and cold storage, so the graph reflects what is currently relevant rather than everything that has ever been stored.

You do not need a cloud backend. One SQLite file. Runs on a laptop. No API calls to a cloud-hosted memory service, no extra costs for connectors. No data leaving your machine.
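To make the "retired rather than deleted" behaviour concrete, here is a minimal sketch of how a conflict between two memories might be settled. It is not VEKTOR's resolver: the field names (`sourceTrust`, `updatedAt`, `status`, `supersededBy`) and the rule "trust first, recency as the tie-breaker" are assumptions made for illustration. The article only says that source trust and recency decide the winner.

```javascript
// Hypothetical sketch, not VEKTOR's API: pick a winner between two
// conflicting memories and retire (never delete) the loser.
// Assumed rule: higher source trust wins; recency breaks ties.
function resolveConflict(a, b) {
  let winner;
  if (a.sourceTrust !== b.sourceTrust) {
    winner = a.sourceTrust > b.sourceTrust ? a : b;
  } else {
    winner = a.updatedAt >= b.updatedAt ? a : b;
  }
  const loser = winner === a ? b : a;
  return {
    winner,
    // The loser is kept for audit, flagged so recall can skip it.
    retired: { ...loser, status: "retired", supersededBy: winner.id },
  };
}

// Recall only sees memories that have not been retired.
function recallable(memories) {
  return memories.filter((m) => m.status !== "retired");
}

// Usage: same source trust, so the more recent statement wins.
const oldFact = { id: 1, text: "We deploy on Heroku", sourceTrust: 0.9, updatedAt: 100 };
const newFact = { id: 2, text: "We deploy on Fly.io", sourceTrust: 0.9, updatedAt: 200 };
const result = resolveConflict(oldFact, newFact);
console.log(recallable([result.winner, result.retired]).map((m) => m.text));
```

The loser stays in the store with a pointer to what replaced it, which is what keeps an audit trail possible while leaving recall clean.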

The architecture: what is new in v1.6.3

Decay: memory that fades when it should

The new vektor-decay.js implementation uses the FadeMem architecture from a February 2026 paper (https://arxiv.org/abs/2601.18642) by researchers at Alibaba and Peking University. To our knowledge, VEKTOR is one of the first production SDK implementations of this research.

The core idea: memories age differently depending on whether you use them. Every memory is classified into either the Long-term Memory Layer (LML: high importance, frequently recalled) or the Short-term Memory Layer (SML: lower importance, infrequently accessed). LML memories decay slowly, with roughly an 11-day half-life at default settings. SML memories decay four times faster.
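The two decay rates can be pictured with a small sketch. It assumes plain exponential decay with a per-tier half-life, which is a simplification and not necessarily the decay function used by the paper or by vektor-decay.js. The 11-day figure and the 4x ratio come from the text above; the function and field names are hypothetical.

```javascript
// Hypothetical sketch: plain exponential decay with a per-tier half-life.
// Only the 11-day LML half-life and the 4x SML rate come from the article.
const DAY_MS = 24 * 60 * 60 * 1000;

const HALF_LIFE_DAYS = {
  LML: 11,     // long-term layer: roughly an 11-day half-life
  SML: 11 / 4, // short-term layer: decays four times faster
};

// Fraction of the original strength that remains after elapsedMs.
function retention(tier, elapsedMs) {
  const halfLife = HALF_LIFE_DAYS[tier];
  if (halfLife === undefined) throw new Error(`unknown tier: ${tier}`);
  const days = Math.max(0, elapsedMs) / DAY_MS;
  return Math.pow(0.5, days / halfLife);
}

// Current strength of a memory, given when it was last recalled.
// `strength`, `tier` and `lastAccessedAt` are assumed field names.
function currentStrength(memory, now = Date.now()) {
  return memory.strength * retention(memory.tier, now - memory.lastAccessedAt);
}

// Usage: compare both tiers after the same 11 days without recall.
console.log(retention("LML", 11 * DAY_MS)); // half the strength remains
console.log(retention("SML", 11 * DAY_MS)); // four half-lives have passed
```

Under this simplified model, an SML memory left untouched for 11 days has gone through four half-lives while an LML memory has gone through one, which is the gap that lets stale short-term items drop out of recall first.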

Tier assignment is not driven only by the importance you set when you stored the memory. Importance is recalculated as a weighted function of semantic relevance to your current goals, access frequency, and position in the causal graph. A memory you actually revisit weekly climbs. One you flagged as important and never touched again gradually drifts down.

The FadeMem paper reports a 45% storage reduction versus append-only systems at equivalent recall quality. Their ablation shows that removing the dual-layer architecture alone drops multi-hop reasoning F1 by 33.9%. Removing conflict resolution drops it by 22.4%. These are the components now live in VEKTOR’s REM...
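As a rough illustration of the importance recalculation described above, the sketch below combines the three signals into one score and maps it to a tier. The weights, the 0.6 threshold, and all names are invented for the example; the source does not state the actual values or formula.

```javascript
// Hypothetical sketch: importance as a weighted sum of three signals,
// each expected in the range [0, 1]. Weights and threshold are assumptions.
const IMPORTANCE_WEIGHTS = { relevance: 0.5, frequency: 0.3, graph: 0.2 };
const LML_THRESHOLD = 0.6;

function clamp01(x) {
  return Math.min(1, Math.max(0, x));
}

// relevance: semantic relevance to current goals
// frequency: normalised access frequency
// graph:     normalised position (centrality) in the causal graph
function importance(signals) {
  return (
    IMPORTANCE_WEIGHTS.relevance * clamp01(signals.relevance) +
    IMPORTANCE_WEIGHTS.frequency * clamp01(signals.frequency) +
    IMPORTANCE_WEIGHTS.graph * clamp01(signals.graph)
  );
}

// Memories at or above the threshold sit in the long-term layer.
function assignTier(signals) {
  return importance(signals) >= LML_THRESHOLD ? "LML" : "SML";
}

// Usage: a memory revisited weekly vs. one flagged once and never touched.
const revisitedWeekly = { relevance: 0.8, frequency: 0.9, graph: 0.5 };
const flaggedThenIgnored = { relevance: 0.6, frequency: 0.0, graph: 0.2 };
console.log(assignTier(revisitedWeekly));    // "LML"
console.log(assignTier(flaggedThenIgnored)); // "SML"
```

Because access frequency feeds the score, re-running the calculation over time is what lets a regularly used memory climb and an ignored one drift down, whatever was set when it was first stored.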
