Solving the hallucination problem in agents – with loops and math




What mathematics tells us about loop design in agents

Kaspar von Grünberg May 19, 2026

The #1 objection I hear from AI sceptics about why agent-first will not work in the enterprise is that models still hallucinate and “only predict the best next word.” That is correct, but it also means nothing! Or at least it doesn’t mean you cannot solve for the problem. In this article I argue that, whether human or math, finding the optimal solution means iteratively looping between attempt and failure until you find a stable optimum.

First of all, humans hallucinate too. We don’t take in all the available information and come up with the perfect response every time. We wouldn’t even expect that. Instead, we iterate. We play a ping pong of hypothesis, testing, analysing errors, asking a colleague for a second opinion, and then trying again. We also deliberately design test frameworks to mirror that process, and we spend a decent amount of time reasoning through the optimal shape of the output. So if humans work this way, why would we assume agents are suddenly different?

Second, even in mathematics you would not generally assume that the optimal solution to a complex problem appears on the first try. Optimization is usually iterative. You define what “better” means, generate a candidate, evaluate the error, use that signal to adjust the next candidate, and repeat. Sometimes you follow a gradient. Sometimes you sample broadly. Sometimes you compare competing candidates. But the underlying pattern is the same: progress comes from a loop, not from a single prediction.

The relevant question for enterprise agents is therefore not whether the first model output is perfect. It is whether the surrounding system can evaluate it, constrain it, correct it, and improve it over successive steps.

This is the most underestimated insight in agentic platform design. Agents are not deterministic systems that produce the right answer on the first try. They are probabilistic systems that produce a candidate, get evaluated, adjust, and produce another candidate. The first output is almost never the right one. The tenth might be; the fortieth almost certainly is. And that’s not because the loop runs long enough by chance, but because each iteration is informed by what the platform learned from the last one.

It’s interesting to see how much people who are usually great at system design struggle with this. My feeling is that they are used to designing for deterministic systems and find probabilistic ones unfamiliar. That is strange, because they should already be designing systems for humans, who are in a way probabilistic machines too. So the way to design for agents is to design for failure in advance, and to embrace failure as something helpful that you can use as additional input for the next attempt. Let’s dissect how to do this.

Optimizing for failure

Most enterprises think about AI in terms of model quality: better model, better answer. Yet that is almost certainly the wrong approach. A leading model without the platform substrate to fail and iterate on that failure is worse than a mediocre model with a well-structured failure loop.
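A well-structured failure loop of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `run_loop`, `Attempt`, `make_toy_agent`, and `toy_gate` are hypothetical names, the toy agent stands in for a real model call, and the toy gate stands in for whatever validation a platform actually runs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    number: int
    candidate: str
    score: float
    feedback: List[str] = field(default_factory=list)


def run_loop(
    generate: Callable[[List[str]], str],
    evaluate: Callable[[str], Tuple[float, List[str]]],
    threshold: float,
    max_attempts: int = 10,
) -> Tuple[Optional[str], List[Attempt]]:
    """Generate, evaluate, and retry until a candidate clears the gate."""
    history: List[Attempt] = []
    feedback: List[str] = []
    for number in range(1, max_attempts + 1):
        candidate = generate(feedback)         # informed by the previous rejection
        score, feedback = evaluate(candidate)  # the validation gate
        history.append(Attempt(number, candidate, score, feedback))
        if score >= threshold:                 # absolute bar, not "better than last time"
            return candidate, history
    return None, history                       # budget exhausted: nothing is accepted


# --- Toy stand-ins (assumptions for illustration only) ---
REQUIRED = ["tests", "docs", "changelog"]


def toy_gate(candidate: str) -> Tuple[float, List[str]]:
    """Score a candidate by the share of required items it mentions."""
    missing = [item for item in REQUIRED if item not in candidate]
    score = 1 - len(missing) / len(REQUIRED)
    return score, [f"missing: {item}" for item in missing]


def make_toy_agent() -> Callable[[List[str]], str]:
    """Return a fake agent that fixes one reported problem per attempt."""
    parts = ["code"]

    def generate(feedback: List[str]) -> str:
        if feedback:
            parts.append(feedback[0].split(": ", 1)[1])
        return " + ".join(parts)

    return generate


result, history = run_loop(make_toy_agent(), toy_gate, threshold=1.0)
for attempt in history:
    print(f"attempt {attempt.number}: score={attempt.score:.2f} {attempt.candidate!r}")
print("accepted:", result)
```

The design choice worth noting is the stop condition. The loop accepts a candidate only when it clears an absolute threshold, and it returns `None` when the attempt budget runs out instead of settling for the best candidate so far.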

Look at the chart above: each red dot is an agent attempt. The first attempt lands low because a validation gate rejects it. That rejection routes back to the agent as feedback, and the agent tries again. Attempts two through six all fail to clear the threshold, but each one is informed by what came before. Attempt seven gets close. Attempt eight finally crosses the gate, and the output is accepted.

Notice what happens in the middle of the curve. Attempts four and six look like local maxima simply because they are better than what came before. A naive system would accept them and move on. But the gate threshold is set higher, because the platform knows that a local maximum is not a global maximum. The loop continues until the system genuinely converges on quality, not just on improvement.

In order to pull this off we need to get three things right:

Defining “done”: a crystal clear definition of what “good enough” means in terms of output.

Evaluating “done”: validation gates that check where we currently stand against that “good enough” output. This is essentially a labelling function, and it is also the objective the loop optimizes against on every iteration.

Routing efficiently: routing within the platform so that all of those things flow through in sequence as a genuine loop, not as a gate the agent walks through once.

Defining “done”

The first thing to get right when going “agentic” is to actually jot down what “done” means. Ask ten engineers what a “good PR” looks like and you will get ten different answers. Basic tests...
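One way to jot “done” down so that a gate can check it is as a list of named, executable checks. The sketch below is illustrative only: the `Check` type, the three criteria, the fields of the `pr` dictionary, and the 400-line limit are all hypothetical and are not taken from the article. A real list would come from the team that owns the output.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Check:
    name: str
    passed: Callable[[Dict[str, Any]], bool]
    hint: str  # feedback routed back to the agent when the check fails


# Hypothetical criteria for a pull request (assumptions, not the author's list).
DEFINITION_OF_DONE: List[Check] = [
    Check("tests_pass",
          lambda pr: pr.get("tests_passed") is True,
          "Make the test suite pass."),
    Check("has_description",
          lambda pr: bool(pr.get("description")),
          "Add a description."),
    Check("small_diff",
          lambda pr: pr.get("lines_changed", float("inf")) <= 400,
          "Keep the change under 400 lines."),
]


def evaluate_done(pr: Dict[str, Any]) -> List[str]:
    """Return a hint for every failed check; an empty list means 'done'."""
    return [check.hint for check in DEFINITION_OF_DONE if not check.passed(pr)]


draft = {"tests_passed": True, "description": "", "lines_changed": 120}
print(evaluate_done(draft))
```

Writing the definition as data rather than prose means the same list serves two purposes: it settles what “done” means, and each failed check produces the feedback for the next attempt.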
