Claude Code v2.1.172: Sub-Agents Can Now Spawn Sub-Agents

sscaryterry1 pts0 comments

Claude Code v2.1.172: Sub-Agents Can Now Spawn Sub-Agents | byteiota

Share

Claude Code v2.1.172 enables sub-agents to spawn sub-agents up to 5 levels deep

By ByteBot

Share

For two years, Anthropic enforced one rule in Claude Code without exception: a sub-agent cannot spawn another sub-agent. Version 2.1.172, released June 10, quietly ended that. The changelog entry is a single line. The implications for how you architect agentic workflows are not.

What Changed in v2.1.172

Sub-agents can now launch their own sub-agents, nesting up to five levels deep. The five-level cap is hard — enforced server-side, with no setting to raise, lower, or disable it. Boris Cherny, who leads Claude Code at Anthropic, described the motivation plainly: "agents kicking off agents as a way to better manage context."

That framing is worth holding onto. The feature is not about running more work in parallel. It is about running the noisy work somewhere the main conversation cannot see it. Before this release, a sub-agent handling log analysis would flood its parent’s context with thousands of raw log tokens. The parent then burned additional tokens re-grounding itself before continuing useful work. Nested sub-agents solve that by isolating the log-reader in its own context frame and returning only a summary upward.

The 5-Frame Stack: A Mental Model That Holds

Think of nested sub-agents as call-stack recursion with a five-frame depth limit. Each frame carries its own system prompt, its own model selection, and its own 200K token context window. The parent reads only what the leaf returns. Everything in between — the searches, the file reads, the intermediate reasoning — costs tokens and then disappears from the parent’s view.

A practical debugging chain looks like this:

main session (Opus) → triage-lead (Opus) → repro-runner (Sonnet) → log-summariser (Haiku)

Layer 1 loads the incident. Layer 2 fans out across four candidate causes. The agent investigating log correlations — a noisy, token-heavy task — spawns a Layer 3 Haiku agent to do the grep work and return a one-line result. The main thread sees four clean verdicts, not 50,000 tokens of raw log output. That is the use case: not more agents, but cleaner results from the agents you already have.

Most useful chains live at depth 2–3. Five levels is the ceiling, not the target.

Token Math: Read This Before You Nest Anything

Nesting multiplies token consumption at roughly 7× per branch per level. That is not a rough estimate — it is the observed overhead from agent orchestration, context initialization, and summary-passing between frames. It compounds fast.

A developer on the community forums described hitting 887,000 tokens per minute before noticing. A financial services team ran a "simple" code quality project with 23 sub-agents and received a $47,000 invoice three days later. The five-level depth cap prevents infinite loops. It does not prevent your billing alert from triggering at 3 AM.

Set a spend limit before nesting anything in production. Claude Code’s billing settings support per-session caps. Use them.

How to Configure Nested Sub-Agents

Sub-agent definitions live in .claude/agents/.md at the project level or ~/.claude/agents/.md for user scope. The Agent() field in tools is new as of this release — it is the allowlist controlling which sub-agent types this agent is permitted to spawn.

name: triage-lead<br>model: claude-opus-4-6<br>tools:<br>- Read<br>- Bash<br>- Agent(repro-runner, log-summariser)<br>You are a debugging triage lead. Load the incident, delegate to repro-runner<br>for reproduction, delegate to log-summariser for log correlation, and return<br>a single verdict: confirmed, unconfirmed, or needs-more-data.

Model tiering across levels is not optional if cost matters. Run Opus at the orchestration layer, Sonnet for mid-level implementation work, and Haiku for leaf tasks like log reads, grep, and test generation. The official sub-agents documentation covers the full configuration spec. In practice, tiered routing costs roughly $0.98 per session versus $2.02 for uniform Opus — a 51% reduction with no quality loss on leaf tasks.

Three Pitfalls to Avoid

Token explosion via circular spawning. The depth cap prevents infinite nesting, not circular logic. One production case: a triage agent spawned a "general researcher" that spawned an "investigation specialist" that spawned a "log reader" that spawned a general researcher again. Four levels deep, circular, zero useful output, substantial cost. Use explicit Agent() allowlists. If the model cannot complete the task with the specified children, fix the task description — do not widen the allowlist.

Silent failures at depth five. When a Level-5 agent attempts to spawn a Level-6, it receives an error. Without explicit handling, that error propagates silently or surfaces as unexpected output at the top level. Test your nesting depth explicitly with a canary chain before deploying complex hierarchies in production.

Nesting...

agents agent claude level code spawn

Related Articles