What a Regex Can't Do: A Bayesian Governor for OpenClaw's Tool Calls

Guy Freeman

In the last post I built a governance layer for a coding agent’s tool calls: a body that hooks the agent’s tool_call event, extracts a few features, and dispatches ask, proceed, or block; and a brain, a Julia daemon that holds a belief and maximises expected utility. The commitment that held it together was that the brain is opaque to the body. The wire carries observations and named actions and nothing else, so the brain can change how it reasons without the body ever knowing.

This is the cash-out of that commitment, and then an argument I care about more than the engineering. (If you just want to install the thing, the short version is here.)

The cash-out first. The Pass-1 brain promised that the next pass would replace the global Beta with a structure-learning posterior over the features. What shipped is a structure-BMA: a posterior that learns which features matter and how they interact by averaging over the possible dependency graphs, rather than committing to one. The role matches the promise, and the discipline paid in both directions: the brain swapped its posterior without the body knowing, and the body itself moved house (Pass 1’s body was an extension for pi, the agent I was using then; this pass reships it as an OpenClaw plugin) without the brain noticing. The wire schema never moved.

Then the argument. The standing objection to all of this, which I have put to myself more often than anyone has put it to me, is that you did not need Bayesian decision theory for any of it, and a regex would do. What follows is the most honest answer I can give, which includes conceding the large part of it that is correct.

The brain learned to see

The Pass-1 brain had one number: P(approve), a single Beta updated by every yes and no. It could learn that the agent’s calls are generally fine or generally not.
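A minimal sketch of that single number, in Python for illustration only (the actual brain is a Julia daemon). The class name and the uniform Beta(1, 1) prior are assumptions; the post does not state the prior.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """One global belief about P(approve), as a Beta(alpha, beta) posterior."""

    alpha: float = 1.0  # assumed prior pseudo-count of approvals
    beta: float = 1.0   # assumed prior pseudo-count of denials

    def update(self, approved: bool) -> None:
        # Conjugate update: each yes or no increments exactly one counter.
        if approved:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of P(approve).
        return self.alpha / (self.alpha + self.beta)


belief = BetaBelief()
for approved in [True, True, True, False]:
    belief.update(approved)

print(belief.mean)  # one number for every call, whatever the tool or arguments
```

Every answer lands in the same two counters, so the posterior mean is a blend over all calls.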
It could not learn that a repeated call is waste while a novel call of the same tool is fine, because one global number cannot hold a different belief per context.

Pass 2 conditions on context. Give it the tool, the working directory, whether this exact call has been seen before this session, and it learns P(approve | context), with the structure of that conditioning itself inferred from the data. A re-run of a build command and a first read of a new file are now different cells with different beliefs.

I wanted to know whether this catches waste on real usage rather than on a demo I had built to be caught. So I replayed thousands of real frontier-model sessions (the public OpenClaw trajectory corpora) through the actual daemon brain, train and test split, posterior frozen before the test arm. The first result was a negative one, and the gate that produced it was the most useful thing I did all month. With the obvious features (tool, parent, repetition-bucket) the brain caught nothing. Those features capture tool-level repetition; real waste is argument-level, the same call run again, which a repetition-bucket cannot isolate. An earlier number that had looked good turned out to be a corpus artefact.

The fix was a feature that measures the thing itself: has this exact call run already this session. With it the brain blocks the repeated-identical-call loops on held-out sessions at precision 1.0 and recall 1.0, blocking 0.7% of calls. A static “block all repeats” rule reaches comparable recall only by blocking three-quarters of everything.

I will be exact about what 1.0 and 1.0 mean and do not. They are measured against the exact-repeat definition of waste. The right feature made the task learnable, which was the point, rather more than it uncovered something subtle; and whether blocking every re-execution is the correct policy is a question only live data settles. The detection is real, and it generalises across held-out real-model sessions.
That is all it is.

The part that did not need the machinery

Here the objection lands, and it lands correctly. Detecting that an exact call has run before is what a hash set is for. The model averaging, the structure learning, the expected-utility maximisation: none of it is necessary to catch an exact-repeat loop. If waste detection were the whole product, the objection would win outright, and a decision-theoretic brain to match Set.has() would be a cannon levelled at a fly.

That bothered me enough to change what the project is about. The agent’s tool calls do not only waste money; some of them are unsafe, and most of them are in service of a task that has real value to me. Waste is one term in my utility, not the whole of it. The brain should be maximising my expected utility (task value, less risk, less cost) and not policing a single degenerate failure mode. So I went looking for the terms where the machinery is not a cannon for a fly.

Safety: the ingredient that discriminates

Safety is where the choice of feature turns out to be everything, and where a regex’s ceiling is low.

I...
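To make the utility framing concrete, with safety entering as the risk term, here is a rough Python sketch of choosing among ask, proceed, and block by expected utility. It is not the daemon's actual model. Every name and number is hypothetical, and it assumes that asking always yields the right answer from the user at a fixed interruption cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utility:
    task_value: float = 1.0   # hypothetical value of a call that should run
    risk_cost: float = 2.0    # hypothetical loss when a bad call runs
    call_cost: float = 0.01   # hypothetical cost of executing any call
    ask_cost: float = 0.05    # hypothetical cost of interrupting the user


def expected_utilities(p_ok: float, u: Utility) -> dict[str, float]:
    """Expected utility of each action, given P(call is fine) = p_ok."""
    return {
        # Run it: gain task value if fine, pay the risk if not, always pay cost.
        "proceed": p_ok * u.task_value - (1 - p_ok) * u.risk_cost - u.call_cost,
        # Ask: assumed-perfect user lets good calls run and stops bad ones.
        "ask": p_ok * (u.task_value - u.call_cost) - u.ask_cost,
        # Block: nothing runs, nothing is gained or lost (baseline of zero).
        "block": 0.0,
    }


def decide(p_ok: float, u: Utility = Utility()) -> str:
    """Pick the action with the highest expected utility."""
    eu = expected_utilities(p_ok, u)
    return max(eu, key=eu.get)


for p in (0.99, 0.5, 0.02):
    print(p, decide(p))
```

Under these made-up numbers a confident belief proceeds, an uncertain one asks, and a near-certain bad call is blocked. The thresholds follow from the utility terms and the belief rather than being hand-set.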
