Designing Memory for zerostack: Plain Files, No Vector Store • Xavier's Data Forge
6 June 2026 / 34 min read<br>Tags: rust, ai agent, memory, context engineering, open source, zerostack
Why I wrote this
Earlier this May, I was learning Rig with the idea of building a minimal coding agent for some of my own work (something small enough that I could understand every layer of it).
Then zerostack showed up on Hacker News, and I noticed two things on the first read: it’s built on Rig, and its design philosophy was the one I had been trying to articulate to myself, only sharper.
I mean, a coding agent that runs in around 16 MB of RAM, file-based context, sandboxed permissions, no daemon. The thing I was about to build, except already built and probably better.
So I tried to use it for real work. That immediately surfaced a small problem: our internal LLM gateway needs custom HTTP headers, and zerostack at the time had no way to set them. I sent a PR to fix it [1], which was merged a few days later. Only then did I join the project’s Matrix chatroom (at that point, it was just the maintainer and me) and ask what was coming next.
I had the privilege of an almost one-on-one conversation, and he said a lot of interesting things about where the project was going. The one sentence I latched onto was this:
“Thank you! For now, the focus is just making it work well (v1.3.x) and adding subagents (v1.4.x); if you want to work on LSP or even better Memory support, it’s a good idea!”
My first instinct had been subagents (it was what I most wanted to use myself), but that was already on his plate. That left LSP and Memory.
The phrase “even better Memory support” was a quiet signal. I picked it.
What follows is a design walkthrough for the memory subsystem I shipped, focused on the reasoning behind each decision. Memory is a layer with many viable shapes; the one I landed on is 797 lines of Rust, and I arrived at it by looking at what other agent harnesses do and asking which pieces actually apply to zerostack’s specific constraints.
Quick note on versions<br>This document reflects the memory subsystem as merged in commit 3005eb6 (2026-05-26, a seven-commit series). I deliberately chose conservative values for two constants: MAX_INJECT_BYTES at 16 KB and search context at ±1 line. The reasoning is in §4 and §6. The maintainer has since raised them to 64 KB and ±3, and hardened the write path (atomic writes, size-capped writes, and an FNV-1a slug hash for cross-Rust-version stability). These are tuning knobs; the algorithms and architecture below are unchanged.
Numerical values in the body (cap sizes, context windows, and so on) refer to commit 3005eb6’s configuration unless otherwise stated.
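A brief aside on the FNV-1a detail in the note above. One plausible reading of "cross-Rust-version stability" is that the algorithm behind the standard library's `DefaultHasher` is explicitly unspecified and may change between releases, so a slug derived from it could point at a different directory after a toolchain upgrade. FNV-1a is fully determined by two published constants. The sketch below shows the 64-bit variant; the `project_slug` helper, its input, and its hex formatting are hypothetical illustrations, not zerostack's actual code.

```rust
/// 64-bit FNV-1a over raw bytes. The offset basis and prime are the
/// published FNV constants, so the result does not depend on the compiler
/// or standard library version.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte); // xor the byte in first...
        hash = hash.wrapping_mul(PRIME); // ...then multiply (the "1a" order)
    }
    hash
}

/// Hypothetical helper: turn a project path into a fixed-width hex slug
/// that could name a per-project memory directory.
fn project_slug(project_path: &str) -> String {
    format!("{:016x}", fnv1a_64(project_path.as_bytes()))
}
```

Calling `project_slug` on the same path always yields the same 16-character hex string, which is the property a on-disk layout needs when files written by one build must be found by the next.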
1. The amnesia tax
A coding agent without memory is amnesiac in the most expensive way: every session it re-asks where the project’s auth lives, re-discovers that docker-compose clashes with the host’s Redis port, and re-derives the team’s naming conventions. The cost is paid either by the user re-explaining things or by the model burning tool calls to rediscover them, every time.
The fix is not to make the model remember (it can’t), but to give the harness a place to write things down between sessions and inject the relevant parts back at the start of the next one.
That place needs to be:
Durable across sessions and crashes
Scoped so one project’s working context doesn’t pollute another
Bounded so it can’t silently consume the context window
Recallable mid-session when something needs to be looked up
Honest about its limits, so the model doesn’t act on stale facts
These constraints sound reasonable in the abstract, but several of them pull against each other the moment you try to implement them. Durable storage fights bounded injection: the more you keep, the harder it is to fit only the relevant parts into each session. Scoped isolation and broad recall are also at odds, since project boundaries help right up until the model needs something it filed under a different project. The third tension is between recall and honesty about limits, where surfacing more makes it harder to keep every item correctly labeled with what it is and how stale it might be. The design that follows is a series of decisions about where to bend on each of these tensions.
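To make the first tension concrete, here is a minimal sketch of what "bounded injection" can look like: durable storage may grow freely, but only a capped prefix is injected, and the cut is announced rather than silent. `MAX_INJECT_BYTES` and its 16 KB value come from the version note earlier; the function names, the prefix-only strategy, and the wording of the truncation notice are assumptions made for illustration.

```rust
/// Cap used in commit 3005eb6 according to the version note (16 KB).
const MAX_INJECT_BYTES: usize = 16 * 1024;

/// Returns the longest prefix of `text` that fits in `max_bytes` and ends
/// on a UTF-8 character boundary, plus a flag saying whether anything was cut.
fn cap_for_injection(text: &str, max_bytes: usize) -> (&str, bool) {
    if text.len() <= max_bytes {
        return (text, false);
    }
    // Walk back until the cut no longer splits a multi-byte character.
    // Index 0 is always a boundary, so this loop terminates.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    (&text[..end], true)
}

/// Hypothetical rendering step: inject the capped text and, if it was cut,
/// append a short notice so the model knows the memory is incomplete.
/// The notice itself is appended after the cap is applied.
fn render_injection(memory: &str) -> String {
    let (kept, truncated) = cap_for_injection(memory, MAX_INJECT_BYTES);
    if truncated {
        format!(
            "{kept}\n[memory truncated: {} of {} bytes shown]",
            kept.len(),
            memory.len()
        )
    } else {
        kept.to_string()
    }
}
```

The explicit notice is one small way to serve the "honest about its limits" requirement at the same time as the "bounded" one: the model sees that something was left out instead of treating the prefix as the whole record.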
2. zerostack’s design philosophy as a filter
Before designing memory, the first job is to be precise about what zerostack actually is, because whether a subsystem belongs in the project depends on whether it agrees with the rest of it. zerostack’s identity is built on a small handful of choices that, taken together, decide what can be added and what can’t:
Small. The codebase is roughly five to ten...