Your agent doesn’t remember your codebase (dupehound) | by Rafael Pinheiro Costa | Jun, 2026 | Medium
Your agent doesn’t remember your codebase (dupehound)
Rafael Pinheiro Costa
7 min read·<br>Just now
Any system that relies on a human to remember what the machine wrote will be limited by the human’s memory, not by the machine’s speed.<br>There are two obvious ways to deal with the volume of code that agents produce. Both are bad.<br>The first is to read everything. Every diff, every helper, every test. This is responsible, virtuous, and impossible. A reviewer reads about 500 lines per hour. An agent writes 1,500 lines in ten minutes. That is three hours of reading for every ten minutes of writing. You can do the math, and the math does not care about your discipline.<br>The second is to trust the agent and merge. This is fast, exciting, and quietly catastrophic, because the problems that agents introduce are not the kind that explode on Friday. They are the kind that rot.<br>In this post I want to talk about the most common form of that rot: your agent has no memory of your codebase, and you have been acting as that memory yourself.<br>The forgetting problem<br>An agent cannot hold your repository in its context window. It sees the files it was pointed at, plus whatever it found while searching. Everything else does not exist.<br>So when it needs to format a date, it writes formatDate. Three weeks later, in another corner of the repo, it needs to format a date again. It does not remember the first one. It writes renderTimestamp. A month later, stringifyDate.<br>Each copy compiles, passes its tests, and ships. Each copy is now aging independently, and the rounding bug you will fix next quarter will be fixed in only one of them.
The agent writes, the codebase grows, and the only deduplication mechanism in the loop is your memory of what the repo contains.<br>This is not a hypothetical. Analyses of millions of commits report that code duplication has roughly doubled since AI assistants went mainstream, while refactoring has collapsed. The agent is not being lazy. It is being exactly as good as a brilliant contractor who starts every single day with amnesia.<br>Notice what the failure mode does to the humans in the loop. When a PR adds calculateOrderAmount, the only way to know that computeInvoiceTotal already exists is to remember it exists. The reviewer becomes a lookup table. You are not reviewing design anymore; you are doing recall against a 300k-line corpus, which is a job description for a machine.<br>My maxim for this one: any system that relies on a human to remember what the machine wrote will be limited by the human’s memory, not by the machine’s speed.<br>What a memory for code actually needs<br>The naive fix is to ask another LLM to watch for duplicates. I tried variations of this and they all fail in the same three ways.<br>First, exhaustiveness. Finding duplicates means comparing every function against every other function. The number of pairs grows quadratically, so in a large repo that is effectively billions of comparisons. A model samples whatever fits in context and gives you confident answers about the part it read. An index checks everything, every time.<br>Second, determinism. If the mechanism is going to block merges, the verdict has to be reproducible: same input, same answer, an algorithm someone can read when they disagree with it. You cannot reject a teammate’s PR on a model’s vibe, and you definitely cannot let the verdict change between reruns.<br>Third, cost. This check has to run on every commit, which means it has to be free and take seconds. An LLM pass over the whole repo, per commit, is neither.<br>So the memory has to be an index, not a model.
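To put rough numbers on the exhaustiveness and cost points, here is a minimal Python sketch. It assumes a hypothetical repo of 100,000 functions and a hypothetical `features_by_function` mapping from function name to a set of opaque integer features. Neither comes from dupehound. It contrasts the number of all-pairs comparisons with the candidate pairs an inverted index surfaces, where only functions that share at least one feature are ever paired.

```python
from collections import defaultdict
from itertools import combinations


def all_pairs(n: int) -> int:
    """Number of unordered function pairs a brute-force comparison must visit."""
    return n * (n - 1) // 2


def candidate_pairs(features_by_function: dict[str, set[int]]) -> set[tuple[str, str]]:
    """Pair up only the functions that share at least one feature."""
    # Inverted index: feature -> names of the functions that contain it.
    index: dict[int, set[str]] = defaultdict(set)
    for name, features in features_by_function.items():
        for feature in features:
            index[feature].add(name)

    pairs: set[tuple[str, str]] = set()
    for names in index.values():
        # Sorting makes each pair canonical, so (a, b) and (b, a) collapse.
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Hypothetical feature sets; the integers stand in for whatever is
# extracted from each function body.
repo = {
    "formatDate": {11, 42, 97},
    "renderTimestamp": {11, 42, 97},
    "computeInvoiceTotal": {8, 70},
    "parsePort": {5, 63},
}

print(all_pairs(100_000))     # 4999950000 pairs for brute force
print(candidate_pairs(repo))  # {('formatDate', 'renderTimestamp')}
```

The index does work proportional to the features it stores plus the candidate pairs it emits, instead of visiting every pair, and it returns the same answer on every run.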
But a naive index of source text is useless here, because the agent does not produce textual copies. It produces the same logic with different names. formatDate and renderTimestamp share almost no tokens and almost all structure.<br>The trick is to fingerprint structure instead of text, and it turns out academia solved this in 2003, for a different adversary: students renaming variables before submitting copied homework. Stanford’s MOSS plagiarism detector is built on an algorithm called winnowing (Schleimer, Wilkerson & Aiken, SIGMOD 2003), and it transfers to our problem almost untouched.<br>An agent renaming identifiers is just a very fast student.<br>Building the memory<br>I packaged this as dupehound, a single-binary CLI in Rust.<br>The pipeline has four stages, and each one earns its place.
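As a rough orientation, here is a toy Python version of the winnowing idea mentioned above. It is a sketch under loud assumptions, not dupehound's implementation: it tokenizes with Python's standard `tokenize` module instead of tree-sitter, the normalization rules (names become `ID`, literals become `LIT`) are my own guess at a reasonable scheme, the k-gram size `k=5` and window size `w=4` are arbitrary, it fingerprints whole snippets with the signature included, and it keeps only the selected hash values without their positions. The three sample functions are hypothetical.

```python
import hashlib
import io
import keyword
import tokenize


def normalize(source: str) -> list[str]:
    """Reduce source to a token stream that survives renaming (assumed rules)."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keywords carry structure; every other name collapses to ID.
            tokens.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            tokens.append("LIT")
        elif tok.type == tokenize.OP:
            tokens.append(tok.string)
        # Comments, newlines, and indentation tokens are dropped.
    return tokens


def kgram_hashes(tokens: list[str], k: int) -> list[int]:
    """Hash every run of k consecutive tokens."""
    hashes = []
    for i in range(len(tokens) - k + 1):
        gram = " ".join(tokens[i:i + k]).encode()
        hashes.append(int.from_bytes(hashlib.sha1(gram).digest()[:8], "big"))
    return hashes


def winnow(hashes: list[int], w: int) -> set[int]:
    """Keep the minimum hash from each window of w consecutive hashes."""
    if len(hashes) < w:
        return {min(hashes)} if hashes else set()
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def fingerprint(source: str, k: int = 5, w: int = 4) -> set[int]:
    return winnow(kgram_hashes(normalize(source), k), w)


def jaccard(a: set[int], b: set[int]) -> float:
    """Shared fingerprints divided by all fingerprints."""
    return len(a & b) / len(a | b) if a | b else 1.0


FORMAT_DATE = """
def formatDate(date):
    month = str(date.month).zfill(2)
    day = str(date.day).zfill(2)
    return str(date.year) + "-" + month + "-" + day
"""

RENDER_TIMESTAMP = """
def renderTimestamp(ts):
    mm = str(ts.month).zfill(2)
    dd = str(ts.day).zfill(2)
    return str(ts.year) + "/" + mm + "/" + dd
"""

PARSE_PORT = """
def parsePort(text):
    value = int(text)
    if value < 1 or value > 65535:
        raise ValueError(text)
    return value
"""

same = jaccard(fingerprint(FORMAT_DATE), fingerprint(RENDER_TIMESTAMP))
other = jaccard(fingerprint(FORMAT_DATE), fingerprint(PARSE_PORT))
print(same)   # 1.0: the renamed copy normalizes to the same token stream
print(other)  # below 1.0: the structure differs
```

Because every name collapses to the same placeholder before hashing, renaming variables or functions cannot change the fingerprint set, which is the property that makes the renamed copy visible.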
The pipeline: discover files, fingerprint every function, match through an inverted index, report.<br>1. Compare function bodies, not files. Every file goes through tree-sitter, and the unit of comparison is the function body. Imports, signatures, and license headers can never participate in a match, which kills the most embarrassing class of false positives before it exists.<br>2. Normalize away what renames change. Identifiers become...