From App Factories to a Reasoning Compiler

Sergei Belousov

I didn’t plan to build a compiler. I just wanted to get the most out of the AI agents I had.

What is an AI agent today? It’s actually quite simple. There is a language model — the brain and the center of decision making. And there is a harness around the model: the environment where the model works — the thing that makes the model an agent. Without the harness the model is just a text generator, sometimes quite a smart one.

Most of the resources of the labs around the world go into improving the models, which we use as is — and thank god, it’s not us who pay for their training. The harness gets much less attention from the research community. So I have good news for you: the harness is exactly the place where an indie researcher can make a contribution, without having the resources of the frontier labs.

Wishes, Not Guarantees

Today the harness mostly means two things: MCP and Skills. Skills explain to the agent in free text how to approach a task; MCP gives it the tools for that. The idea is quite elegant: give the model a minimal kit — and let it assemble the solution on its own.

But all this elegance rests on one assumption: that the model will follow the instructions it receives. A skill, however, is just text in the context window, and the model is not obliged to follow it. Skills today are basically prompt engineering on a new turn of the spiral: they raise the odds, but they give no guarantees. The problem is not the quality of the skills. You can write an excellent instruction that covers all the nuances, but if the model can simply ignore it, it is a bad foundation for a reliable system. A good foundation needs something else: a structure that the model doesn’t read but executes, with no way around it.

The Splinter

I got here not by theorizing but by hitting walls, project after project: I built an AI mobile developer, taught an agent to actually test things, made a harness repair itself between tasks, and finally moved everything onto a local 27B model, replacing the LLM orchestrator with a finite state machine. I wanted my agents to solve complex tasks autonomously, for many hours, without my intervention. And ideally — to do all of it with lightweight local models that can run right on my laptop. And in every new project, while raising the bar for autonomy and quality, I had to make the harness a bit less advisory and a bit more structural, until it stopped being instructions and became code. Why orchestration with deterministic state machines beats orchestration through reasoning is a separate post; the one-line version: control flow lives in code, and the model is only responsible for judgment at the leaves.
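To make the one-line version concrete, here is a minimal sketch of control flow in code with judgment only at the leaves. It is not taken from any of the projects above: the state names, the `TRANSITIONS` table, `run_machine`, and the stub judges are all hypothetical, and the judges are plain functions standing in for model calls so the sketch runs on its own.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical leaf: in a real harness this would call a language model.
# It receives the current state and a shared context, and returns a verdict.
Judge = Callable[[str, dict], str]

# Control flow as data: (state, verdict) -> next state.
# The model never sees this table and cannot talk its way around it.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("write", "done"): "test",
    ("test", "pass"): "finish",
    ("test", "fail"): "write",
}


def run_machine(judges: Dict[str, Judge], ctx: dict, max_steps: int = 10) -> List[str]:
    """Run the machine from 'write' and return the list of visited states."""
    state = "write"
    trace: List[str] = []
    for _ in range(max_steps):
        if state == "finish":
            break
        trace.append(state)
        verdict = judges[state](state, ctx)
        # An unknown verdict raises KeyError instead of opening a new path.
        state = TRANSITIONS[(state, verdict)]
    return trace + [state]


def stub_write(state: str, ctx: dict) -> str:
    # Stand-in for "the model writes code": it only counts attempts.
    ctx["attempts"] = ctx.get("attempts", 0) + 1
    return "done"


def stub_test(state: str, ctx: dict) -> str:
    # Stand-in for "the model judges the result": passes on the second attempt.
    return "pass" if ctx["attempts"] >= 2 else "fail"


trace = run_machine({"write": stub_write, "test": stub_test}, {})
print(trace)
```

The leaves can return whatever they like, but the only effect a verdict has is to select a row in the table. A verdict the table does not know is an error, not a detour.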

But even when the harness of my agents had collapsed mostly into code, I was still writing that code by hand. For every new task I decided what the topology of the state machine would be, what happens in each state, what the contracts between the agents are, and so on. The essence of this process is reasoning about the task.

An attentive reader will say: but this is how we solved agentic tasks a few years ago, and then we dropped it in favor of orchestration through reasoning once the models became smart enough. Am I trying to sell an old idea? That reader will be absolutely right: deterministic orchestration gives control back, but it takes away the main achievement of the reasoning era, the model’s ability to derive the solution on its own. To fix this, today we will go up to the level of meta-agents and bring the model back into the loop of orchestration decisions. But we will do it in a clever way: we will build an agentic system that spends reasoning once to build a deterministic machine, which can then run as many times as you want without spending any additional reasoning on orchestration.
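The "spend reasoning once, run many times" idea can be sketched in a few lines. This is only an illustration under assumptions, not the reharness API: `CountingPlanner`, `compile_machine`, the spec format, and the handlers are hypothetical, and the planner returns a hard-coded spec where a real system would derive it by reasoning about the task.

```python
from typing import Callable, Dict


class CountingPlanner:
    """Hypothetical stand-in for the expensive reasoning step."""

    def __init__(self) -> None:
        self.calls = 0

    def plan(self, task_class: str) -> dict:
        self.calls += 1
        # A real planner would derive this topology; here it is hard-coded.
        return {"start": "fetch", "steps": {"fetch": "summarize", "summarize": "end"}}


def compile_machine(planner: CountingPlanner, task_class: str) -> Callable:
    spec = planner.plan(task_class)  # reasoning is spent here, exactly once

    def machine(handlers: Dict[str, Callable[[str], str]], payload: str) -> str:
        # Pure table walk: no planner call happens at run time.
        state = spec["start"]
        while state != "end":
            payload = handlers[state](payload)
            state = spec["steps"][state]
        return payload

    return machine


planner = CountingPlanner()
machine = compile_machine(planner, "summarize-a-document")

# Toy handlers standing in for the work done in each state.
handlers = {"fetch": lambda s: s.strip(), "summarize": lambda s: s[:5]}
results = [machine(handlers, text) for text in ["  hello world ", " goodbye  "]]
print(results, planner.calls)
```

However many inputs go through `machine`, the planner's call counter stays at one. That is the whole trade: orchestration decisions are made at compile time and then frozen.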

This is what reharness, a reasoning compiler, does. It is open source, published on npm, and installs with one command. At the end of the post I will show how to try everything in a couple of minutes, but first let’s see how it works.

A Trick from 1971

Let me defend my right to use the word “compiler” for reharness. For this I will use two ideas.

First, a compiler takes something expensive to interpret and makes it cheap to execute. A JIT analyzes a program as it runs, finds the hot paths, and compiles them into machine code, so it stops re-interpreting them on every pass. reharness does essentially the same. The hot path is the model’s reasoning, which the agent repeats every time it faces the same class of tasks. The machine code is a finite state machine. Here the analogy works at the level of intuition.

To explain the second idea, we need to recall a rather beautiful trick from back in 1971: the first Futamura projection. Let’s start with partial evaluation: if part of a program’s input is fixed, you can hardcode it into the program. For example, for pow(x, n) with n fixed at 3, we can substitute in place: pow(x, 3) = x * x * x. Now take an interpreter...
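The pow example of partial evaluation can be written out as a small sketch. The names `power` and `specialize_power` are made up for illustration, and generating source text and loading it with `exec` is just one simple way to show the idea; it assumes n is a non-negative integer.

```python
def power(x, n):
    """General version: the loop over n is re-executed on every call."""
    result = 1
    for _ in range(n):
        result *= x
    return result


def specialize_power(n):
    """Partially evaluate power for a fixed n by unrolling the loop into source text."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    source = f"def power_{n}(x):\n    return {body}\n"
    namespace = {}
    exec(source, namespace)  # turn the generated text into a real function
    return namespace[f"power_{n}"], source


cube, cube_source = specialize_power(3)
print(cube_source)
print(cube(2), power(2, 3))
```

The specialized function contains no loop and no n: the fixed part of the input has been absorbed into the program text, which is the pow(x, 3) = x * x * x substitution done mechanically.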
