2026 – agents break containment, what's next?

Free the Claw. Agents break containment, what’s next?


What happens when agents leave the chat box and coding harness and start operating across everything else? Here is a giant red lobster standing outside AI Engineer (@aiDotEngineer) London.

The interesting part of this story is not the giant lobster, but how it got there.

“I saw this viral tweet about how somebody put a claw in Wall Street next to the Wall Street bull. And I was like, ‘that's funny. We should put a claw in front of our conference.’ .. so, I asked Devin to research - ‘Where can I get a lobster in London?’ Devin comes back with phone numbers and email addresses and websites and I just click through.”

You can watch the talk here. [1]

If 2026 is the year agents break containment[2], there is a lot of work to be done. How do we design the loops that make agent autonomy useful, reliable, and aligned with user intent? We’ll unpack what it means for agents to break containment, why OpenClaw became the early proof point, and why ‘verification’ becomes the bottleneck as these systems scale.

Agents break containment

What does it mean for an agent to break containment?

About a year ago, Andrej Karpathy described the progression to Software 3.0: from humans writing explicit code, to training neural nets, to the prompt becoming the program. [3]

Late last year, the prompt really became the program. [4]

“I would say in December is when it really .. flipped. where I went from 80/20, to 20/80, writing code by myself versus just delegating to agents.” - Andrej with Sarah Guo (@saranormous) [5]

A quick succession of new model releases and improvements pushed coding agent capabilities across some threshold - almost overnight, there was a clear jump forward in competence and coherence.

“Agents breaking containment” is what happens when Software 3.0 stops living inside a chat box or IDE, and starts operating across everything else.

Agents with claws.

In many ways, OpenClaw was the first real leap into the uncharted territory of Software 3.0, outside the coding harness. It’s the fastest-growing open-source project in history - from 0 to 346K stars in under five months. [6][7]

It turns out the same breakthroughs in coding agent capability - agents inspecting a code base, finding bugs, and turning specs into PRs with increasing reliability - also work outside code, as long as the agent can ‘close the loop’.

Turning a conference idea into a giant red lobster is one example; a personal AI assistant is another.

Autonomy needs a loop

Up until this point, most people's interactions with LLMs (outside of code) have taken place within the containment chamber - the narrow chatbot. Once an agent leaves the chat box and coding harness, the problem space changes.

OpenClaw was the first real example of this. The crudest explanation for why it got so popular is listed directly on the site - “The AI that ‘actually’ does things”. [6]

I really like Alex Krentsel’s (@AlexKrentsel) “Principles for Autonomous System Design: OpenClaw Deep Dive” [8], so that’s what we’ll use to understand exactly how OpenClaw breaks containment and closes the control loop by observing its actions and their results, then deciding next steps.

“At the end of the day, all systems boil down to LLM calls. The difference is the context provided.” [8]

One way to explain the progressive autonomy of agents is that each new agent system wraps another loop around the LLM call.

“progression over time has been increasing loopiness”. [8]
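To make "wrapping another loop around the LLM call" concrete, here is a minimal sketch. It is not OpenClaw's code and not taken from the talk: `fake_llm`, the `CALL`/`RESULT`/`ANSWER` text protocol, and the `add` tool are all hypothetical stand-ins so the example runs without a model. The point is only the shape: one call, then a tool loop around it, then a verification loop around that.

```python
from typing import Callable

# Hypothetical stand-in for a model call; NOT any real vendor API.
def fake_llm(context: list[str]) -> str:
    """Ask for the 'add' tool until a tool result is in context, then answer."""
    for line in reversed(context):
        if line.startswith("RESULT "):
            return "ANSWER " + line.removeprefix("RESULT ")
    return "CALL add 2 3"

# Hypothetical tool table: name -> callable.
TOOLS: dict[str, Callable[..., int]] = {"add": lambda a, b: a + b}

def single_call(prompt: str) -> str:
    """Loop 0: one LLM call, no feedback."""
    return fake_llm([prompt])

def tool_loop(prompt: str, max_steps: int = 5) -> str:
    """Loop 1: call the model, run the tool it asks for, feed the result back."""
    context = [prompt]
    for _ in range(max_steps):
        reply = fake_llm(context)
        if reply.startswith("ANSWER "):
            return reply.removeprefix("ANSWER ")
        _, name, *args = reply.split()
        result = TOOLS[name](*map(int, args))
        context.append(f"RESULT {result}")
    raise RuntimeError("step budget exhausted")

def verified_loop(prompt: str, check: Callable[[str], bool],
                  max_attempts: int = 3) -> str:
    """Loop 2: wrap the tool loop in a retry loop gated by a verifier."""
    for _ in range(max_attempts):
        answer = tool_loop(prompt)
        if check(answer):
            return answer
    raise RuntimeError("no attempt passed verification")

print(single_call("What is 2 + 3?"))               # CALL add 2 3
print(tool_loop("What is 2 + 3?"))                 # 5
print(verified_loop("What is 2 + 3?", str.isdigit))  # 5
```

Each function adds exactly one loop around the one below it, and the LLM call itself never changes. What changes is the context it sees and who decides when to stop.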

The agent is no longer just calling a fixed set of tools in a narrow chat loop - it needs the ability to navigate ambiguity rather than stopping whenever the path is unclear.

It can operate across sessions, pick up or discover capabilities through skills and extensions, schedule future work, respond to events, and keep checking in with itself (and you) over time. There is a transition from ‘scoped tool use’ to something closer to ‘dynamic tool and workflow discovery’. [9]
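One way to picture the difference between scoped tool use and dynamic discovery is a registry that can grow while the agent is running. The sketch below is an illustration under stated assumptions, not how OpenClaw loads skills: `SkillRegistry`, its method names, and the word-overlap matching are hypothetical, and a real agent would more likely let the model choose from skill descriptions.

```python
from typing import Callable

class SkillRegistry:
    """Hypothetical registry: skills can be added while the agent is running."""

    def __init__(self) -> None:
        self._skills: dict[str, tuple[str, Callable[[str], str]]] = {}

    def register(self, name: str, description: str,
                 fn: Callable[[str], str]) -> None:
        self._skills[name] = (description, fn)

    def discover(self, task: str) -> list[str]:
        """Return names of skills whose description shares a word with the task.

        Crude keyword overlap, used here only to keep the sketch model-free.
        """
        words = set(task.lower().split())
        return sorted(
            name for name, (desc, _) in self._skills.items()
            if words & set(desc.lower().split())
        )

    def run(self, name: str, arg: str) -> str:
        return self._skills[name][1](arg)

registry = SkillRegistry()
registry.register("shout", "convert text to uppercase", str.upper)

# With a fixed tool set, the agent has nothing that fits this task.
before = registry.discover("find a lobster supplier")

# A new skill is installed at runtime; the same task now resolves.
registry.register("supplier_search", "find a supplier for an item",
                  lambda item: f"searching suppliers for {item}")
after = registry.discover("find a lobster supplier")

print(before, after)
```

The agent's options are no longer fixed when the session starts, which is also why checking what it actually did gets harder.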

If you’re interested in understanding how it all works, watch the video - it’s only an hour. Alex has carefully sifted through the codebase, extracted the most important parts - connectors (how humans reach the agent), the coordination layer (routing, sessions, memory), and the agent runtime - and spent time unpacking the key abstractions.
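As a rough mental model of those three layers, here is a toy sketch. Every class and field name is an assumption made for illustration; none of it is taken from the OpenClaw codebase or the talk. A connector hands in a `Message`, a coordinator routes it to a per-user session that doubles as memory, and a runtime does the work (here a stub where the LLM and tool loop would go).

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    channel: str  # which connector the message arrived on, e.g. "chat"
    user: str
    text: str

@dataclass
class Session:
    history: list[str] = field(default_factory=list)  # per-session memory

class AgentRuntime:
    """Hypothetical runtime: a real one would run the LLM and tool loop here."""

    def run(self, session: Session, text: str) -> str:
        session.history.append(text)
        return f"ack #{len(session.history)}: {text}"

class Coordinator:
    """Routes each message to the session for its (channel, user) pair."""

    def __init__(self, runtime: AgentRuntime) -> None:
        self.runtime = runtime
        self.sessions: dict[tuple[str, str], Session] = {}

    def handle(self, msg: Message) -> str:
        session = self.sessions.setdefault((msg.channel, msg.user), Session())
        return self.runtime.run(session, msg.text)

coordinator = Coordinator(AgentRuntime())
print(coordinator.handle(Message("chat", "ana", "find a lobster")))
print(coordinator.handle(Message("chat", "ana", "book delivery")))
print(coordinator.handle(Message("email", "ben", "status?")))
```

Keeping session state in the coordinator rather than the runtime means the same runtime can serve many channels and users, which is one plausible reason to separate the layers.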

Personal agents for everything else

The current proliferation of the ‘personal agent that actually does stuff’ boils down to this idea of enabling a system to ‘close the loop’. OpenClaw was the genesis, but we’ll see many other implementations serving different requirements and scope - Hermes and NanoClaw are two interesting examples. [10][11]

Singapore’s foreign minister recently shared his self-hosted NanoClaw setup. [12]

The labs will push their own internal agentic products with increasing autonomy and loopiness, giving users every excuse to...
