Slop-Mop: Harm Reduction for Addicted Agents

William's Substack


Reward functions, addiction, and not knowing what the f$&k is real

William Martin May 18, 2026

slop-mop is a gate system for coding agents. It catches the shortcuts they reach for (fake tests, magic numbers, duplicated blocks, bloated files, dangling imports, commit-hook bypasses) and redirects them into concrete cleanup work instead of scolding them.

Coding agents optimize for apparent completion. Addicts optimize for relief. Both can route around internal rules. slop-mop works because it moves the rule outside the loop.

Fair warning up front: this started as a release post and became something closer to an origin story. The personal stuff is load-bearing: it informed most of the slop-mop design choices. If that is not what you came for, slop-mop is here: https://github.com/ScienceIsNeato/slop-mop/.

It’s 5am. Four crushed Mountain Dew Zero cans on the desk, three personal projects in the digital flotilla, two agents obeying reward functions in the room, one addict at the keyboard.

My flotilla is a line of personal project barges moving through canal locks, with agents doing the steaming ahead and me and slop-mop as the lockkeepers. Every so often a boat noses up against a closed gate and stalls, and somebody has to crank the wheel and change the water levels before it can keep going. slop-mop became the automated lockkeeper I kept trying to be by hand.

Figure 1: evidence of my addiction to code quality tooling

Here is what it looks like in practice. An agent writes assert True is True to pass a coverage gate. Technically a valid test. Covers nothing real. slop-mop’s Deceptiveness gate blocks the commit and issues a sidequest: write tests that exercise the specific uncovered lines. The agent needs to commit the work to close the task, so it complies. Coverage climbs. Commit goes through. The agent fixed it without me, freeing me up to man another lock.

the addict

My name’s Will, and I am (clinically, unambiguously, no winking) an addict. Not the kind that makes a charming dinner-party admission. The kind where, on one random Thursday, I had thirty-seven drinks and wasn’t hospitalized or even completely useless the next day. Took a leave of absence shortly after, checked myself into rehab, checked myself out four days later, went on the worst bender of my life, then let a friend drop me at a cabin in Idaho seven hours from the nearest liquor store. I am fairly certain the cabin was the thing that finally took, as that was a couple of years ago, and I haven’t had a drink since.

But I’m still an addict. The bottle is gone but the architecture remains: same circuits, repurposed, pointed somewhere new. One thing I’m addicted to now is technical successes, and I treat language models the way a lab rat treats a cocaine lever.

For two decades prior, I shipped software while soaking my brain. The drinking may have been off the clock, but the soaking seeps. I was, somehow, fine at it despite the daily deficit. My degree was in electrical engineering, not computer science, and the difference mattered: we were trained less in memorization and more in the thinking of tradesmen.

My favorite illustration: a couple of days before a final in a junior-year course designed to thin the herd, a student asked if it would be cheating to pre-load formulas into the calculator before walking in. The professor, without looking up, said: “If you pre-program the formulas, you won’t even be tested on your ability to memorize them and you’ll be far less likely to make a sign error or typo. Of course, you’ll still have to use your brain to know when and how to use them. That’s not cheating, that’s just being a good engineer.”

So long as the tool is dependable, the shortcut can be the skill. You don’t need to know all the shit, just enough to pick the right tool and validate its output. Sometimes that’s not just good enough; sometimes it’s better.

By the time ChatGPT rolled out, the substance sandbags were getting heavier. I started piping everything technical through the LLM layer. The work was landing better, which made it easier to drink more, until it wasn’t.

When the streaming layoffs started, I got caught in a round during my leave. I started doing freelance AI training to keep the lights on. It was a good fit: I got paid to study the failure mode I’d already been chasing. The model wants to close the ticket. The reward function often cannot distinguish between “looks done” and “is done.” That gap makes room for hitchhiker solutions: patches, magic numbers, duplication, rot.

the slope

There’s a saying I like better than the polished ones about willpower and discipline: someone...
