Show HN: Build A Harness – open-source, modular harness layer for AI agents

philiparxist · 1 pts · 0 comments

Build A Harness — Visual Canvas for AI Agent Harnesses


OPEN SOURCE · VISUAL CANVAS · APACHE 2.0

Design, test, and deploy AI agent harnesses.

Pick the nodes your agent needs. Draw the graph. Run on any framework.

A workflow routes prompts. A harness governs what your agent believes, what it's allowed to do, and how it recovers. Build one on a canvas, compile to any framework, trace every decision with Langfuse, and deploy as a REST endpoint, MCP tool, or A2A agent.

LangGraph · CrewAI · Mastra · MS Agent Framework


Diagram: a simple agent loop (input → LLM call → tool call → output) compared with a full Build A Harness — an 11-layer architecture across 22 nodes that adds caller state, a world model, reasoning, a 5-tier control layer, planning, execution, 9-layer verification, recovery, memory, optional learning, and an output reviewer pass.

Simple Agent Loop

Input / Caller

LLM Call

Tool Call ↺ loop

Output

prompt in → answer out

no world model · no control state · no verification

vs

Full Harness — Implemented

Caller State: constraints · clarification · propagation

World Model: beliefs · contradictions · generation_id

Reasoning: evidence · hypotheses (4 sources) · VOI

Control: 5-tier resolver · deadlock detection

Planning: task graph (6-state) · parallel concurrency

Execution: VOI · review gate

Verification: 9 layers

Recovery: 6 strategies

Memory: compression · journal

Learning: experience store · warm start (optional)

Output & Reviewer Pass: contract · 3-lens review

22 nodes · 11 layers · world model + 5-tier control · 9-layer verification

WHY A HARNESS, NOT JUST A WORKFLOW

A harness does what a workflow can't.

A workflow routes prompts from node to node. A harness governs what the AI believes, what it's allowed to do, how it catches its own mistakes, and what it learns for next time. Use three nodes or eleven — the same FlowSpec runs either.

Reasoning, not just prompting

Add a world_model node and your agent tracks typed beliefs, detects contradictions, and evaluates hypotheses from four generation sources before acting — instead of just asking the LLM and hoping.
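As a rough illustration of typed beliefs and contradiction detection, the sketch below stores beliefs as subject/attribute/value records and flags a new belief that disagrees with an earlier one. The `Belief` and `WorldModel` classes, their fields, and the example sources are assumptions made for this example; they are not taken from the project's code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Belief:
    """A typed belief: what is claimed, about what, and where it came from."""
    subject: str
    attribute: str
    value: object
    source: str


@dataclass
class WorldModel:
    """Hypothetical belief store, not the project's real API."""
    beliefs: list = field(default_factory=list)

    def add(self, belief):
        """Record a belief and return the earlier beliefs it contradicts."""
        conflicts = [
            b for b in self.beliefs
            if b.subject == belief.subject
            and b.attribute == belief.attribute
            and b.value != belief.value
        ]
        self.beliefs.append(belief)
        return conflicts


wm = WorldModel()
first = wm.add(Belief("order-17", "status", "shipped", source="tool:orders_api"))
conflicts = wm.add(Belief("order-17", "status", "pending", source="llm"))
print([c.source for c in conflicts])
```

A real world model would presumably also resolve the contradiction, for example by preferring tool evidence over model output; this sketch only detects it.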

Control that holds

Drop in a control_state node and your agent gains a five-tier resolver that governs every action — NORMAL → CAUTIOUS → BLOCKED. Diagnostic health vectors drive it; deadlock detection stops it escalating forever.
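One way to picture a tiered resolver driven by health vectors is sketched below. It models only the three tiers the text names (the page says there are five but does not list the others), and the thresholds, metric names, and the "three BLOCKED results in a row" deadlock rule are assumptions for illustration.

```python
from enum import IntEnum


class Tier(IntEnum):
    # Only the tiers named in the text; the remaining two are not specified there.
    NORMAL = 0
    CAUTIOUS = 1
    BLOCKED = 2


def resolve_tier(health, cautious_below=0.7, blocked_below=0.3):
    """Map a health vector (metric name -> score in [0, 1]) to a tier.

    The worst metric decides. Thresholds are illustrative assumptions.
    """
    worst = min(health.values(), default=1.0)
    if worst < blocked_below:
        return Tier.BLOCKED
    if worst < cautious_below:
        return Tier.CAUTIOUS
    return Tier.NORMAL


def is_deadlocked(history, window=3):
    """Treat `window` consecutive BLOCKED resolutions as a deadlock."""
    recent = history[-window:]
    return len(recent) == window and all(t == Tier.BLOCKED for t in recent)


# Hypothetical health readings from four consecutive steps.
readings = [
    {"tool_success": 0.9, "belief_consistency": 0.8},
    {"tool_success": 0.2, "belief_consistency": 0.8},
    {"tool_success": 0.1, "belief_consistency": 0.6},
    {"tool_success": 0.1, "belief_consistency": 0.5},
]
history = [resolve_tier(r) for r in readings]
print([t.name for t in history], is_deadlocked(history))
```

Detecting the deadlock gives the harness a signal to hand off to recovery or a human rather than re-evaluating the same blocked state indefinitely.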

Verification with teeth

A verify_gate runs nine checks before every action. Pair it with reviewer_pass for adversarial review and contract validation before every return. Trust, but verify — and actually enforce it.
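The gate idea can be sketched as a list of check functions that must all pass before an action runs. The page does not name the nine checks, so the two below (`check_tool_allowed`, `check_has_arguments`) and the action shape are hypothetical stand-ins.

```python
from typing import NamedTuple


class CheckResult(NamedTuple):
    name: str
    passed: bool
    detail: str = ""


def check_tool_allowed(action):
    """Hypothetical check: the tool must be on an allow-list."""
    allowed = {"search", "read_file"}
    tool = action.get("tool")
    ok = tool in allowed
    return CheckResult("tool_allowed", ok, "" if ok else f"tool {tool!r} is not allowed")


def check_has_arguments(action):
    """Hypothetical check: arguments must be supplied as a dict."""
    ok = isinstance(action.get("args"), dict)
    return CheckResult("has_arguments", ok, "" if ok else "args must be a dict")


CHECKS = [check_tool_allowed, check_has_arguments]


def verify_gate(action, checks=CHECKS):
    """Run every check; the action may proceed only if none fail.

    Returns (allowed, failures) so the caller can log why it was stopped.
    """
    results = [check(action) for check in checks]
    failures = [r for r in results if not r.passed]
    return (not failures, failures)


ok, failures = verify_gate({"tool": "search", "args": {"q": "harness"}})
blocked_ok, blocked_failures = verify_gate({"tool": "delete_repo", "args": None})
print(ok, [f.name for f in blocked_failures])
```

Running all checks instead of stopping at the first failure is a design choice in this sketch: it gives a trace every reason an action was rejected.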

Recovery and learning

Add recovery for six named strategies, typed failure detection, and local vs global replanning. Attach exp_store and your agent reuses successful decompositions across future runs.
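Typed failures mapped to recovery strategies, with a local or global replanning scope, might look roughly like the sketch below. The page does not list the six strategies or the failure types, so every name here (`FailureType`, `retry_step`, `replan_local`, `replan_global`, the `ctx` keys) is a hypothetical placeholder.

```python
from enum import Enum


class FailureType(Enum):
    # Hypothetical failure types for illustration only.
    TOOL_ERROR = "tool_error"
    VERIFICATION_FAILED = "verification_failed"
    PLAN_INVALID = "plan_invalid"


def retry_step(ctx):
    """Re-run the failed step unchanged."""
    return {"action": "retry", "scope": "local", "step": ctx["step"]}


def replan_local(ctx):
    """Rebuild only the failed step and leave the rest of the plan alone."""
    return {"action": "replan", "scope": "local", "step": ctx["step"]}


def replan_global(ctx):
    """Discard the current plan and plan again from the goal."""
    return {"action": "replan", "scope": "global", "step": None}


STRATEGIES = {
    FailureType.TOOL_ERROR: retry_step,
    FailureType.VERIFICATION_FAILED: replan_local,
    FailureType.PLAN_INVALID: replan_global,
}


def recover(failure, ctx, max_retries=2):
    """Pick a strategy by failure type; escalate repeated tool errors."""
    if failure is FailureType.TOOL_ERROR and ctx.get("attempts", 0) >= max_retries:
        return replan_local(ctx)
    return STRATEGIES[failure](ctx)


print(recover(FailureType.TOOL_ERROR, {"step": "fetch", "attempts": 0}))
print(recover(FailureType.TOOL_ERROR, {"step": "fetch", "attempts": 2}))
print(recover(FailureType.PLAN_INVALID, {"step": "fetch"}))
```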

27 node types · 14 execution + 13 harness · 4 framework adapters · Langfuse observability

HARNESS NODE LIBRARY

Every layer you need. Use one or all.

27 node types — 14 execution, 13 harness. Use what your agent needs: a minimal harness might be three nodes, the full stack runs to eleven layers. Both are valid FlowSpec.

Diagram: the same node types compose into different harness graphs — a minimal harness (input → llm_call → verify_gate → recovery → output) and a full 11-layer harness — both valid FlowSpec that compile to LangGraph, CrewAI, Mastra, or Microsoft Agent Framework.

Same bricks. Different graphs. · Compose exactly the harness your agent needs — minimal or full.

Minimal · just verification + recovery

input

llm_call

verify_gate: 9 layers

recovery: 6 strategies

output

or

Full Harness · all 11 layers

caller_state: constraints · escalation

world_model: beliefs · contradictions

reasoning: hypotheses · VOI

control_state: 5-tier resolver

task_graph: 6-state planning

execution

verify_gate

recovery

memory

learning: optional

reviewer_pass: 3-lens · adversarial

output: contract validated

verify_gate and recovery appear in both graphs — same brick, any config · draw on the canvas · FlowSpec compiles to LangGraph, CrewAI, Mastra, or MAF
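To make the minimal graph concrete, here is a sketch of how it could be written down as data and put into execution order. The page describes FlowSpec only as a portable JSON format, so the keys used here (`flowspec_version`, `nodes`, `edges`, `id`, `type`) are an assumed shape for illustration, not the real schema. The node type names come from the minimal diagram above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed spec shape, for illustration only.
MINIMAL_SPEC = {
    "flowspec_version": "0.2.0",
    "nodes": [
        {"id": "in", "type": "input"},
        {"id": "llm", "type": "llm_call"},
        {"id": "verify", "type": "verify_gate"},
        {"id": "recover", "type": "recovery"},
        {"id": "out", "type": "output"},
    ],
    "edges": [
        ("in", "llm"),
        ("llm", "verify"),
        ("verify", "recover"),
        ("recover", "out"),
    ],
}


def execution_order(spec):
    """Return node types in an order that respects every edge.

    Raises ValueError for edges that reference unknown nodes;
    graphlib raises CycleError (a ValueError subclass) on cycles.
    """
    types = {node["id"]: node["type"] for node in spec["nodes"]}
    predecessors = {node_id: set() for node_id in types}
    for src, dst in spec["edges"]:
        if src not in types or dst not in types:
            raise ValueError(f"edge references unknown node: {src} -> {dst}")
        predecessors[dst].add(src)
    order = TopologicalSorter(predecessors).static_order()
    return [types[node_id] for node_id in order]


order = execution_order(MINIMAL_SPEC)
print(" -> ".join(order))
```

A framework adapter would start from an ordered, validated graph like this and emit the target runtime's own graph definition; that compilation step is not shown here.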

Execution layer · 14 node types

✓ Mix and match — any subset of nodes is valid FlowSpec

✓ 4 framework adapters — same spec, any runtime

✓ Langfuse observability — harness traces, all 4 runtimes

✓ HITL pause/resume · REST/MCP/A2A deploy

✓ FlowSpec v0.2.0 — open, portable JSON format

✓ Process concepts — pre-seeded task graph scaffolds

Harness layer · 13 node types

✓ World model · typed beliefs · contradiction detection

✓ 5-tier control state resolver · deadlock detection

✓ Pre-execution review gate · 9-layer verification

✓ 6 named recovery strategies · typed failure library

✓ Experience store — cross-run structural reuse

✓ Adversarial reviewer pass · output contract validation

22 nodes · 11 layers · 379 tests passing

Foundation State Architecture

Evidence & Reasoning

World Model & Contradiction

Diagnostics & Control State

Planning & Task Graph

Execution & Verification

Recovery & Memory

Caller...
