Loop Engineering: When Generation Gets Cheap, Judgment Gets Expensive

The Technical Executive


Agentic loops make code, plans, and PRs abundant. The scarce part is knowing what is right.

Stephane Derosiaux Jun 25, 2026

I’m a CTO, and I prompt a lot. Always a few lines: context, a couple of directions, the goal, the constraints, the data/tools to use. I let Claude/Codex handle the what and the how. I build many POCs, GitHub projects, and business projects at conduktor.io (engineering, marketing, SEO, support tickets, operations), so I’m constantly spawning agents in parallel, not just on code. I’m good at context switching. One trick I use: I often add “I’m going to sleep”, hit enter, and run away. This way, Claude does not need me and does not interrupt itself.

One day, while I was writing some marketing content, a dynamic workflow (the new Claude thing) drained my entire Max quota in a few minutes. It had spawned hundreds of agents:

some went gathering knowledge

some pulled code and state

a whole batch wrote actual programs whose only job was to verify each facet I'd asked about, so the thing could tell me, with proof, whether the code and the claims were 100% correct.

Read that again: a few lines in… hundreds of agents out, with millions and millions of tokens consumed and hours of work done. This is where AI is going: cheap generation, expensive judgment.

I don't prompt anymore

I don’t write step-by-step prompts anymore. I write the root prompt: context, goal, constraints, and how success will be judged. We’ve all been growing with AI:

1/ Prompt engineering: being smart at prompting. That was even a job!

2/ Context engineering: what to retrieve into, or remove from, the context window.

3/ Harness engineering: how to build a run (which tools, which actions, what counts as "done").

4/ Loop engineering (2026!): making it run itself, over and over.

Each stage expands on the previous one. The new stage in 2026 (it seems to be called “loop engineering”) is a bit different. Before, it was still human-centric, with the human steering the agent. The loop is about agents steering agents. We’re no longer inside the loop doing the work. We’re outside it, building the thing that does the work: prompting Claude/Codex to prompt itself recursively.

That workflow that ate my quota? Not my fault? Kind of. I didn’t write the sub-prompts, yet I did define the root conditions that allowed the loop to run that far. I didn't decide all the steps and their depth, or that the verification agent should write a program, test it in Docker, etc. The model did all of that from my five lines. It broke my goal into steps, determined techniques per step, added some checks between them, then ran a top-down pass to fold the results per step and iterate.

I literally wrote such an orchestrator a year ago. I was really proud of it: it handled the wiring, cross-agent communication, prompt generation, etc. It’s totally useless now. The sub-prompts, the glue, the "ok now take this output and feed it there": that part is gone. What's left is the root of it: the context, the goal, what I actually care about, the constraints (time, budget, thoroughness, exactness).

You have to know all of this to get the most out of AI, not just tell it “what” to do (the task itself). If the “what” is all you bring, you lack the intel, the context, the “why”, and an AI or a person who has them can replace you and do the task better, because their judgment will be better.
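To make the "root of it" concrete, here is a minimal sketch of a root prompt as a data structure. The `RootPrompt` class, its field names, and the rendered layout are all hypothetical, chosen only for illustration; they are not a Claude or Codex format.

```python
from dataclasses import dataclass, field


@dataclass
class RootPrompt:
    """Hypothetical container for the four parts of a root prompt."""
    context: str
    goal: str
    # Constraints such as time, budget, thoroughness, exactness.
    constraints: dict[str, str] = field(default_factory=dict)
    # How success will be judged, one criterion per entry.
    success_criteria: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Turn the structured fields into a few lines of plain text."""
        lines = [f"Context: {self.context}", f"Goal: {self.goal}"]
        if self.constraints:
            lines.append("Constraints:")
            lines += [f"- {name}: {value}" for name, value in self.constraints.items()]
        if self.success_criteria:
            lines.append("Success will be judged by:")
            lines += [f"- {criterion}" for criterion in self.success_criteria]
        return "\n".join(lines)


# Example values are made up for the sketch.
prompt = RootPrompt(
    context="Marketing page for a data platform",
    goal="Check every technical claim on the page",
    constraints={"budget": "stop after a fixed token allowance", "time": "one hour"},
    success_criteria=["each claim is backed by a passing check"],
)
print(prompt.render())
```

Writing the constraints down explicitly (budget in particular) is what bounds how far the loop is allowed to run.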

What is an agentic loop?

Compared to the deterministic for/while loops of our favorite programming languages, an agentic loop has a different structure:

Discovery: it finds each turn's work on its own (reads CI, open issues, recent commits, support tickets, changing metrics, etc.).

Handoff: it gives the task to a specialized agent in its own isolated workspace.

Verification: a different agent says "yes" or "no" to the result (LLM as a judge).

Persistence: it writes state somewhere that outlives the conversation.

Scheduling: it kicks itself off again later.
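The five parts above can be sketched as a single turn of a loop. This is a toy under stated assumptions: `Task`, `run_turn`, and the `discover` / `worker` / `judge` callables are hypothetical names standing in for real agents, the isolated workspace is just a temporary directory, and state is a JSON file. Scheduling is left to whatever calls `run_turn` again (cron, a CI schedule, a sleep loop).

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class Task:
    id: str
    description: str


def run_turn(
    discover: Callable[[set[str]], list[Task]],
    worker: Callable[[Task, Path], str],
    judge: Callable[[Task, str], bool],
    state_file: Path,
) -> dict[str, str]:
    """Run one turn of the loop and return the persisted state."""
    # Persistence: reload what earlier turns recorded.
    state = json.loads(state_file.read_text()) if state_file.exists() else {}

    # Discovery: the loop finds its own work, skipping tasks already accepted.
    done = {task_id for task_id, status in state.items() if status == "accepted"}
    for task in discover(done):
        # Handoff: each task runs in its own throwaway workspace.
        with tempfile.TemporaryDirectory(prefix=f"task-{task.id}-") as workspace:
            result = worker(task, Path(workspace))
        # Verification: the judge is a separate callable from the worker.
        state[task.id] = "accepted" if judge(task, result) else "rejected"

    # Persistence: write state so the next turn starts from here.
    state_file.write_text(json.dumps(state, indent=2))
    return state
```

Because rejected tasks are not in `done`, the next scheduled turn picks them up again, which is the "over and over" part.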

All these steps are necessary:

No discovery? Then you pick the work yourself, and as humans we overthink and spend too much time deciding what we should work on. Agents don’t care, they pick what’s available; just give them a place to fetch it from.

No persistence? You want cumulative progress. Agents forget everything the moment the context window clears. Memories help avoid rediscovering the same things, redoing the same work, and creating conflicts.

No scheduling? A loop needs a trigger to start.

No isolated workspace? Run two agents in parallel against the same working directory and you will regret it. One worktree per task. Always. (I find Conductor nice.)

No verification? Don’t even think about removing this.
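"One worktree per task" can be done with plain `git worktree add`. A minimal sketch, assuming a local git repository: the helper names, the `agent/<task_id>` branch naming, and the sibling-directory layout are hypothetical choices for the example, not a convention from any tool.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, task_id: str) -> list[str]:
    """Build the git command that creates a dedicated worktree for one task."""
    # Hypothetical layout: a sibling directory next to the main checkout.
    path = repo.parent / f"{repo.name}-wt-{task_id}"
    # -b creates a new branch so parallel agents never share a branch either.
    return ["git", "-C", str(repo), "worktree", "add", "-b", f"agent/{task_id}", str(path)]


def create_worktree(repo: Path, task_id: str) -> Path:
    """Run the command and return the new working directory for the agent."""
    command = worktree_command(repo, task_id)
    subprocess.run(command, check=True)  # raises CalledProcessError if git fails
    return Path(command[-1])
```

Each agent then gets the path returned by `create_worktree` as its working directory, so two agents never edit the same files on disk.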

Verification is the crux of everything

Ask an agent to score what it just produced and it will praise itself. The context in which the code was written is already stuffed with the...
