Build A Harness — Visual Canvas for AI Agent Harnesses
OPEN SOURCE · VISUAL CANVAS · APACHE 2.0
Design, test, and deploy AI agent harnesses.
Pick the nodes your agent needs. Draw the graph. Run on any framework.
A workflow routes prompts. A harness governs what your agent believes, what it's allowed to do, and how it recovers. Build one on a canvas, compile to any framework, trace every decision with Langfuse, and deploy as a REST endpoint, MCP tool, or A2A agent.
LangGraph · CrewAI · Mastra · MS Agent Framework
Diagram: a simple agent loop (input → LLM call → tool call → output) compared with a full Build A Harness graph, an 11-layer architecture across 22 nodes that adds caller state, a world model, reasoning, a 5-tier control layer, planning, execution, 9-layer verification, recovery, memory, optional learning, and an output reviewer pass.
Simple Agent Loop
Input / Caller → LLM Call → Tool Call (↺ loop) → Output
prompt in → answer out · no world model · no control state · no verification
vs
Full Harness · Implemented
Caller State: constraints · clarification · propagation
World Model: beliefs · contradictions · generation_id
Reasoning: evidence · hypotheses (4 sources) · VOI
Control: 5-tier resolver · deadlock detect
Planning: task graph (6-state) · parallel concurrency
Execution: VOI · review gate
Verification: 9 layers
Recovery: 6 strategies
Memory: compression · journal
Learning: experience store · warm start (optional)
Output & Reviewer Pass: contract · 3-lens review
22 nodes · 11 layers · world model + 5-tier control · 9-layer verification
WHY A HARNESS, NOT JUST A WORKFLOW
A harness does what a workflow can't.
A workflow routes prompts from node to node. A harness governs what the AI believes, what it's allowed to do, how it catches its own mistakes, and what it learns for next time. Use three nodes or all eleven layers; the same FlowSpec runs either.
Reasoning, not just prompting
Add a world_model node and your agent tracks typed beliefs, detects contradictions, and evaluates hypotheses from four generation sources before acting — instead of just asking the LLM and hoping.
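To make typed beliefs and contradiction detection concrete, here is a minimal Python sketch. It is not the world_model node's actual implementation: the `Belief` and `WorldModel` classes, their field names, and the rule "same subject and predicate, different value" are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Belief:
    """A typed claim the agent holds. All field names are hypothetical."""
    subject: str
    predicate: str
    value: object
    source: str


@dataclass
class WorldModel:
    beliefs: list[Belief] = field(default_factory=list)

    def add(self, belief: Belief) -> list[tuple[Belief, Belief]]:
        """Store a belief and return the contradictions it introduces.

        Assumed rule: two beliefs contradict when they share a subject and
        predicate but disagree on the value.
        """
        conflicts = [
            (old, belief)
            for old in self.beliefs
            if (old.subject, old.predicate) == (belief.subject, belief.predicate)
            and old.value != belief.value
        ]
        self.beliefs.append(belief)
        return conflicts


model = WorldModel()
first = model.add(Belief("invoice-17", "status", "paid", source="crm_tool"))
second = model.add(Belief("invoice-17", "status", "unpaid", source="email"))
```

In this sketch `first` is empty and `second` holds one conflicting pair, which a harness could surface before the agent acts on either claim.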
Control that holds
Drop in a control_state node and your agent gains a five-tier resolver that governs every action — NORMAL → CAUTIOUS → BLOCKED. Diagnostic health vectors drive it; deadlock detection stops it escalating forever.
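One way to picture a resolver driven by a health vector is the Python sketch below. It models only the three tiers named above and does not guess the other two. The thresholds, the "worst score wins" rule, and the deadlock window are illustrative assumptions, not the control_state node's real logic.

```python
from enum import IntEnum


class ControlState(IntEnum):
    # Only the three tiers named on this page; the other two are not guessed.
    NORMAL = 0
    CAUTIOUS = 1
    BLOCKED = 2


def resolve(health: dict[str, float],
            cautious_below: float = 0.7,
            blocked_below: float = 0.4) -> ControlState:
    """Map a health vector (scores in 0..1) to a state via its worst score.

    Thresholds and the worst-score rule are illustrative assumptions.
    """
    worst = min(health.values())
    if worst < blocked_below:
        return ControlState.BLOCKED
    if worst < cautious_below:
        return ControlState.CAUTIOUS
    return ControlState.NORMAL


def is_deadlocked(history: list[ControlState], window: int = 3) -> bool:
    """Flag a deadlock when the last `window` resolutions were all BLOCKED."""
    recent = history[-window:]
    return len(recent) == window and all(s is ControlState.BLOCKED for s in recent)


history = [
    resolve({"evidence": 0.9, "tool_errors": 0.8}),
    resolve({"evidence": 0.6, "tool_errors": 0.8}),
    resolve({"evidence": 0.3, "tool_errors": 0.8}),
]
```

Here `history` steps through NORMAL, CAUTIOUS, and BLOCKED as the hypothetical `evidence` score drops, and `is_deadlocked` only fires once the state has stayed BLOCKED for the whole window.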
Verification with teeth
A verify_gate runs nine checks before every action. Pair it with reviewer_pass for adversarial review and contract validation before every return. Trust, but verify — and actually enforce it.
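The gate pattern can be sketched in a few lines of Python. The nine checks themselves are not listed here, so the two checks below (`tool_is_allowed`, `args_are_present`), the `Action` shape, and the allowlist are hypothetical stand-ins that only show how a list of checks can veto an action before it runs.

```python
from typing import Callable, Optional

Action = dict  # hypothetical shape: {"tool": str, "args": dict}
Check = Callable[[Action], Optional[str]]  # failure reason, or None to pass

ALLOWED_TOOLS = {"search", "read_file"}  # hypothetical allowlist


def tool_is_allowed(action: Action) -> Optional[str]:
    if action.get("tool") not in ALLOWED_TOOLS:
        return f"tool {action.get('tool')!r} is not allowed"
    return None


def args_are_present(action: Action) -> Optional[str]:
    if not action.get("args"):
        return "action has no arguments"
    return None


def verify_gate(action: Action, checks: list[Check]) -> list[str]:
    """Run every check and collect failure reasons; an empty list means pass."""
    return [reason for check in checks if (reason := check(action)) is not None]


checks = [tool_is_allowed, args_are_present]
ok = verify_gate({"tool": "search", "args": {"q": "harness"}}, checks)
blocked = verify_gate({"tool": "delete_db", "args": {}}, checks)
```

Collecting every failure reason, rather than stopping at the first, gives a recovery step more to work with; that is a design choice of this sketch, not a documented behavior.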
Recovery and learning
Add recovery for six named strategies, typed failure detection, and local vs global replanning. Attach exp_store and your agent reuses successful decompositions across future runs.
27 node types · 14 execution + 13 harness · 4 framework adapters · Langfuse observability
HARNESS NODE LIBRARY
Every layer you need. Use one or all.
27 node types: 14 execution, 13 harness. Use what your agent needs: a minimal harness might be three nodes, while the full stack runs to eleven layers. Both are valid FlowSpec.
Diagram: the same node types compose into different harness graphs — a minimal harness (input → llm_call → verify_gate → recovery → output) and a full 11-layer harness — both valid FlowSpec that compile to LangGraph, CrewAI, Mastra, or Microsoft Agent Framework.
Same bricks. Different graphs. · Compose exactly the harness your agent needs, minimal or full.
Minimal · just verification + recovery
input → llm_call → verify_gate (9 layers) → recovery (6 strategies) → output
or
Full Harness · all 11 layers
caller_state: constraints · escalation
world_model: beliefs · contradictions
reasoning: hypotheses · VOI
control_state: 5-tier resolver
task_graph: 6-state planning
execution
verify_gate
recovery
memory
learning: optional
reviewer_pass: 3-lens · adversarial
output: contract validated
verify_gate and recovery appear in both graphs: same brick, any config · draw on the canvas · FlowSpec compiles to LangGraph, CrewAI, Mastra, or MAF
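As a rough picture of what a portable graph description might look like, the Python sketch below writes the minimal harness as plain data and serializes it to JSON. The `nodes`/`edges` layout and every field name are assumptions made for illustration and are not the actual FlowSpec v0.2.0 schema; only the node type names come from the diagram above.

```python
import json

# Hypothetical layout: "nodes"/"edges" and their fields are illustrative,
# not the actual FlowSpec v0.2.0 schema.
minimal_harness = {
    "nodes": [
        {"id": "in", "type": "input"},
        {"id": "llm", "type": "llm_call"},
        {"id": "verify", "type": "verify_gate"},
        {"id": "recover", "type": "recovery"},
        {"id": "out", "type": "output"},
    ],
    "edges": [
        ["in", "llm"],
        ["llm", "verify"],
        ["verify", "recover"],
        ["recover", "out"],
    ],
}


def dangling_edges(spec: dict) -> list[list[str]]:
    """Return edges whose source or target is not a declared node id."""
    ids = {node["id"] for node in spec["nodes"]}
    return [edge for edge in spec["edges"] if not set(edge) <= ids]


serialized = json.dumps(minimal_harness, indent=2)
```

Keeping the graph as plain JSON-compatible data is what would let one description feed several runtimes; a check like `dangling_edges` is the kind of validation a compiler could run before handing the graph to an adapter.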
Execution layer · 14 node types
✓ Mix and match — any subset of nodes is valid FlowSpec
✓ 4 framework adapters — same spec, any runtime
✓ Langfuse observability — harness traces, all 4 runtimes
✓ HITL pause/resume · REST/MCP/A2A deploy
✓ FlowSpec v0.2.0 — open, portable JSON format
✓ Process concepts — pre-seeded task graph scaffolds
Harness layer · 13 node types
✓ World model · typed beliefs · contradiction detection
✓ 5-tier control state resolver · deadlock detection
✓ Pre-execution review gate · 9-layer verification
✓ 6 named recovery strategies · typed failure library
✓ Experience store — cross-run structural reuse
✓ Adversarial reviewer pass · output contract validation
22 nodes · 11 layers · 379 tests passing
Foundation State Architecture
Evidence & Reasoning
World Model & Contradiction
Diagnostics & Control State
Planning & Task Graph
Execution & Verification
Recovery & Memory
Caller...