Two Heads Are Better Than One: Run Many AI Agents, Merge One Auditable Result

By Hideaki Takahashi | Jun, 2026 | Medium


A few weeks ago, I showed that Claude Code and Codex can have a real-time conversation via Git. That was the first step. If coding agents can talk to each other, the next question is more interesting: Can we run many coding agents on the same task, let them work independently, make them review each other, and merge only the result that is actually verified?

This is the direction I am exploring in h5i: auditable workspaces for AI agent teams.

Why one agent is not enough

A single coding agent is useful, but fragile.

- It may miss a corner case.
- It may overfit to the first approach it sees.
- It may produce a patch that looks plausible but fails tests.
- It may be biased by a specific company’s philosophy.

This becomes a bigger problem as coding agents move from “autocomplete” to “autonomous work.” The natural next step is not just a smarter single agent. It is an agent ensemble.

In machine learning, ensembles work because independent attempts reduce variance. The same idea should apply to software engineering: ask several agents to solve the same problem independently, compare their patches, let them review each other, and use a neutral verifier to decide what should be merged.

Why naive agent teams break

However, naively spawning multiple agents in the same repository is chaos. If you simply run Claude Code, Codex, and another coding agent on the same repo, you quickly hit many problems.

- Environment conflict: Agents overwrite files, ports, caches, branches, or build artifacts.
- Token explosion: Every agent rereads the same repo and drags huge logs into context.
- Review overload: Humans cannot inspect every prompt, command, retry, and failure, even though agents can run risky commands.

Git is excellent at tracking code diffs. Git worktrees can also provide an (unsafe) workspace for each agent. But AI agents do more than edit files. They follow prompts, run (potentially dangerous) commands, inspect logs, retry after failures, talk to other agents, and make decisions that never appear in a commit. That missing execution layer is what h5i tries to capture.

h5i: auditable workspaces for AI agent teams

h5i gives each agent its own Git-backed workspace. Each workspace can contain:

- a sandboxed worktree
- model, agent, and prompt metadata
- compact summaries for token reduction
- agent-to-agent messages

The key idea is simple: Run many coding agents. Merge one auditable result.
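One way to make the ensemble intuition above concrete is a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each agent independently produces a correct patch with the same probability p; real agents share training data and blind spots, so true independence is a strong assumption. The function name `p_any_success` is hypothetical and not part of h5i.

```python
def p_any_success(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds.

    Assumes every attempt succeeds with the same probability p and that
    attempts are independent (an idealization, not a measured property).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    # All n attempts fail with probability (1 - p) ** n.
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, round(p_any_success(0.6, n), 3))
```

Under these assumptions, three agents that each succeed 60% of the time yield at least one correct patch about 94% of the time. The gain only matters if something can reliably tell the correct patch from the plausible-looking ones, which is why the verifier step is central.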

The ensemble workflow

A typical h5i agent ensemble looks like this.

1. A human gives one task.
2. h5i creates isolated workspaces for multiple agents.
3. Each agent attempts the task independently.
4. Agents freeze their submissions.
5. They peer-review each other’s patches.
6. A neutral verifier replays candidates and runs tests.
7. h5i merges one verified result.

The whole process is stored as Git-backed evidence.
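The "attempt independently, verify neutrally, merge one" part of this workflow can be sketched in a few lines of Python. This is a toy model, not h5i's actual selection logic: the names `quicksort_a`, `quicksort_b`, `verify`, and `select_verified` are hypothetical, candidates are plain functions rather than sandboxed patches, and an in-process list of test cases stands in for a real test command. Peer review is not modeled.

```python
from typing import Callable, Dict, List, Optional, Tuple

Sorter = Callable[[List[int]], List[int]]


def quicksort_a(xs: List[int]) -> List[int]:
    """Candidate from a hypothetical agent A: a correct quick sort."""
    if len(xs) <= 1:
        return list(xs)
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger_or_equal = [x for x in rest if x >= pivot]
    return quicksort_a(smaller) + [pivot] + quicksort_a(larger_or_equal)


def quicksort_b(xs: List[int]) -> List[int]:
    """Candidate from a hypothetical agent B: plausible, but drops duplicates."""
    if len(xs) <= 1:
        return list(xs)
    pivot = xs[len(xs) // 2]
    smaller = [x for x in xs if x < pivot]
    larger = [x for x in xs if x > pivot]  # bug: values equal to pivot vanish
    return quicksort_b(smaller) + [pivot] + quicksort_b(larger)


# The verifier's test cases are fixed in advance and shared by all candidates.
TEST_CASES: List[List[int]] = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]


def verify(candidate: Sorter) -> bool:
    """Run one candidate against every test case; a crash counts as failure."""
    try:
        return all(candidate(list(case)) == sorted(case) for case in TEST_CASES)
    except Exception:
        return False


def select_verified(
    candidates: Dict[str, Sorter],
) -> Tuple[Optional[str], Dict[str, bool]]:
    """Return the first candidate that passes, plus a per-candidate report."""
    report = {name: verify(fn) for name, fn in candidates.items()}
    winner = next((name for name, ok in report.items() if ok), None)
    return winner, report


winner, report = select_verified({"agent-b": quicksort_b, "agent-a": quicksort_a})
print(winner, report)
```

The report is kept alongside the winner on purpose: recording why each candidate was accepted or rejected is what makes the final merge auditable rather than just successful.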

Conceptually:

    one task
      -> agent A in sandbox
      -> agent B in sandbox
      -> agent C in sandbox
      -> peer review
      -> neutral verification
      -> one auditable merge

Let’s try a small example. First, we can easily install the pre-compiled binary of h5i:

    curl -fsSL https://raw.githubusercontent.com/h5i-dev/h5i/main/install.sh | sh

Then, make a directory for this experiment and move there:

    mkdir Experiment
    cd Experiment

h5i init and h5i hook setup initialize system prompts and hooks:

    h5i init
    h5i hook setup --write --wrap-bash --team

We then move to a new branch and make two secure isolated sandboxes for Claude and Codex:

    git branch dev
    git switch dev

    h5i env create claude-1 --profile agent-claude
    h5i env create codex-1 --profile agent-codex

The h5i team command allows us to register those sandboxed environments to one team.

    h5i team create demo-team
    h5i team add-env demo-team env/human/claude-1 --runtime claude
    h5i team add-env demo-team env/human/codex-1 --runtime codex

We then launch a terminal for each agent and assign the task. If you use Linux (including WSL), you can use the official script of h5i:

    echo "Implement Quick Sort from scratch in Python. We also need to provide enough pytest unit tests" > TASK.md

    team-launch.sh demo-team --task TASK.md

After their edits, hooks ask the agents to wait for a review request, which we can trigger via team-review.sh.

    team-review.sh demo-team

This script guides the agents to automatically peer-review each other’s implementations and to improve their own source code. After convergence, h5i automatically picks the best one and merges it into the original branch:

    h5i team verify demo-team --agent -- pytest -q
    h5i team verify demo-team --agent -- pytest -q
    h5i team finalize demo-team
    h5i team apply demo-team

While the...
