Show HN: LoopFlow – plain-English loops that run coding agents until tests pass

LoopFlow — the tutorial

First 5 minutes, in any repo (requires Node 18+ and the Claude Code CLI):

    npx @loop-lang/loop init   # /loopflow skill + AGENTS.md + the loop-first default

    # then just ask; no slash needed for loop-shaped work:
    fix the flaky checkout test - keep at it until the suite passes reliably

init teaches Claude to reach for a .loop on its own when the work is repeatable and verifiable, and to skip the ceremony for one-off edits. /loopflow is the explicit override. Watch it plan → act → observe, reflect on a red test, and stop only when the check is green.

Skeptical? See "Why not just prompt?" below. Full setup is in Getting started.

What is a loop

AI writes the code now, but you are still the conductor. Every coding task is really five decisions:

Decision     | In a .loop               | Question it answers
-------------|--------------------------|----------------------------------------
Objective    | goal:                    | What are we trying to do?
Context      | look at:                 | What may the agent read first?
Actions      | allow… / ask me before…  | What may it do, and what needs a human?
Verification | done when                | How do we know it worked?
Stopping     | when… / after N tries    | When do we stop: done, or thrashing?

These five are the core syntax — the engine. Everything else in the language either composes loops into bigger workflows (pipeline, flow, for each) or configures a run (the rest of the syntax) — you'll meet those after the core.

Here are all five decisions as one real loop — every line is one of the rows above:

    loop "fix the failing test":                        # the work
      goal: the cart total is correct with a coupon     # Objective
      look at: the checkout code, and the last failure  # Context
      allow edits automatically, ask me before pushes   # Actions
      done when "pnpm test cart/coupon" passes          # Verification
      after 6 tries: stop and warn "stuck"              # Stopping

[Diagram: plan (each cycle) → act (make the change) → observe (run done-when) → on ✓ pass, stop (goal met). When it fails → reflect, then plan again.]

Every cycle runs plan → act → observe. The done when check decides: pass → stop; fail → reflect, which feeds the error into the next plan. A thrash guard (after N tries) stops the loop if it gets stuck, so it never spins forever.
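The cycle above can be sketched in a few lines of Python. This only illustrates the control flow and is not LoopFlow's implementation: `run_loop`, its parameters, and the returned dictionary are hypothetical names, and the agent's plan and act steps are stubbed as plain functions.

```python
from typing import Callable, Optional, Tuple


def run_loop(
    plan: Callable[[Optional[str]], str],
    act: Callable[[str], None],
    observe: Callable[[], Tuple[bool, str]],
    max_tries: int,
) -> dict:
    """Run plan -> act -> observe until the check passes or tries run out.

    Hypothetical sketch: it mirrors the cycle described in the text only.
    """
    reflection: Optional[str] = None
    for attempt in range(1, max_tries + 1):
        step = plan(reflection)      # plan, using the last failure if there is one
        act(step)                    # act: make the change
        passed, output = observe()   # observe: run the done-when check
        if passed:
            return {"status": "done", "tries": attempt}
        reflection = output          # reflect: feed the error into the next plan
    # thrash guard: the "after N tries" ceiling was reached
    return {"status": "stuck", "tries": max_tries}


# Simulated check that fails twice and then passes.
results = iter([(False, "expected 90, got 100"), (False, "expected 90, got 95"), (True, "")])
outcome = run_loop(
    plan=lambda last_failure: f"fix based on: {last_failure}",
    act=lambda step: None,
    observe=lambda: next(results),
    max_tries=6,
)
print(outcome)  # {'status': 'done', 'tries': 3}
```

The only way out of the `for` loop is a passing check or the try ceiling, which is the property the thrash guard is meant to give.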

The two ideas to keep

Edit the loop, not the prompt. The control structure is the artifact.

You can't fake done. done when runs a real command — a test, a scanner, a script. The loop stops only when the world agrees.
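One way to read "the world agrees" is that the check is a process exit code rather than the model's own report. The sketch below assumes that reading; `done_when` is a hypothetical helper, not part of LoopFlow, and the stand-in command is the Python interpreter rather than a real test suite.

```python
import subprocess
import sys
from typing import Sequence, Tuple


def done_when(argv: Sequence[str], timeout_s: float = 600.0) -> Tuple[bool, str]:
    """Return (passed, output) for a command given as an argument list.

    "Passed" means exit status 0. The output is kept so that a failure
    can be fed into the next plan.
    """
    try:
        result = subprocess.run(
            list(argv), capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s} seconds"
    except FileNotFoundError as exc:
        return False, f"command not found: {exc}"
    return result.returncode == 0, result.stdout + result.stderr


# Stand-in for a failing test run: the interpreter exits with status 1.
passed, output = done_when([sys.executable, "-c", "raise SystemExit(1)"])
print(passed)  # False
```

In the tutorial's example the argument list would be `["pnpm", "test", "cart/coupon"]`.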

Anatomy: write it in this order

The parser doesn't care what order the lines come in. You should. The standard is one sentence: the finish line first, the safety net last. Four zones, top to bottom — and the file ends up reading in the same order a run degrades: promises at the top, failure handling at the bottom.

1. The contract. goal:, then done when immediately under it. Write the check before any behavior: this is loop engineering's TDD. If you can't write the check, you don't know what you're building yet; that's the signal to stop and think, not to keep typing. Everything below exists to make this line pass.

2. The boundaries. look at: (what it may read), then allow …, ask me before … (what it may do alone), then any human gate. Scope before power. Decide gates now, while you're thinking about risk, not later, while you're thinking about behavior. Capability lines (use skills, remember in) live here too.

3. The engine. each cycle:. Usually the default plan, then act, then observe, written out so a reader sees the shape. also: finishing passes join this zone.

4. The safety net. when it fails: (recovery), when blocked: (the escape hatch), after N tries: (the floor), in escalation order. The last line of a loop is its hard stop: the file literally ends with the guarantee that it can't spin forever.

Why done when second and not last? Because it shapes every other line — the context you scope, the actions you allow, the try ceiling are all sized to the check. Written last, done when describes whatever the loop happened to do; written first, the loop is built to satisfy it. Same reason tests-before-code works.
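Because the parser does not enforce this order, a small check can. The sketch below is a hypothetical helper, not a LoopFlow feature: it recognizes lines only by the prefixes quoted in this tutorial, treats `#` as the start of a comment, and reports any line that appears after a line from a later position in the standard order.

```python
import re
from typing import List

# Line prefixes quoted in the tutorial, ranked in the recommended order.
PATTERNS = [
    (r"goal:", 0),
    (r"done when\b", 1),
    (r"look at:", 2),
    (r"(allow|ask me before)\b", 3),
    (r"each cycle:", 4),
    (r"when it fails:", 5),
    (r"when blocked:", 6),
    (r"after \d+ tries\b", 7),
]


def order_issues(loop_text: str) -> List[str]:
    """List lines that break the contract, boundaries, engine, safety net order."""
    issues = []
    highest = -1
    highest_line = ""
    for raw in loop_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        for pattern, rank in PATTERNS:
            if re.match(pattern, line):
                if rank < highest:
                    issues.append(f"'{line}' comes after '{highest_line}'")
                else:
                    highest, highest_line = rank, line
                break
    return issues
```

A file with done when directly under goal: yields an empty list, while a file that puts done when below the allow line yields one issue naming both lines.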

A healthy loop

A finish line a machine can check, or an explicit "a human reviews before stopping". Never neither.

Scoped context: the three files that matter + the last failure — not "the repo".

Gates on risk, autonomy on the rest. Gating trivia trains you to rubber-stamp.

Recovery and a floor: "reflect, then plan again" is always paired with "after N tries".

Smells: no done when · unbounded look at · a back-edge with no ceiling · a warn message that won't tell future-you what got stuck.
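Two of these smells can be detected mechanically. The sketch below is a hypothetical helper, not a LoopFlow command: it matches only phrases quoted in this tutorial ("done when", "a human reviews before stopping", "plan again", "after N tries"), so real .loop syntax may need different patterns. Judging whether look at is unbounded or a warn message is informative is left to a human reader.

```python
import re
from typing import List


def find_smells(loop_text: str) -> List[str]:
    """Flag a missing finish line and a back-edge without a try ceiling."""
    lines = [raw.split("#", 1)[0].strip() for raw in loop_text.splitlines()]
    has_check = any(line.startswith("done when") for line in lines)
    has_review = any("a human reviews before stopping" in line for line in lines)
    has_back_edge = any("plan again" in line for line in lines)
    has_ceiling = any(re.match(r"after \d+ tries\b", line) for line in lines)

    found = []
    if not (has_check or has_review):
        found.append("no finish line")
    if has_back_edge and not has_ceiling:
        found.append("back-edge with no ceiling")
    return found
```

A loop that contains "when it fails: reflect, then plan again" but no "after N tries" line is reported as a back-edge with no ceiling.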

The same order applies inside every stage of a pipeline — each stage is a loop and reads as one. Full reference tables in the manual.

Prompt vs LoopFlow: why not just prompt?

You could just say "fix the bug." So why write a loop? A prompt fires once and trusts the model's word that it's done. A loop verifies, self-corrects, and stops only when the work is provably finished.

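             | Just prompting                | A loop
-------------|-------------------------------|--------------------------------------
"Done" means | the model says "done"         | a real command passes: done when "…"
On failure   | you notice, re-prompt, repeat | reflects on the failure, re-plans...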
