The octopus architecture for AI agents | Geoff Goodman
TorkBot is designed a bit like an octopus. This architecture was born from a series of dead-ends and iterative improvement. When I say octopus, what I mean is that TorkBot has a centralized “brain” directing many semi-autonomous appendages, each with their own brains, reporting back to the central dispatcher.
Static lanes are the long-lived appendages. Curator is one. Plugins can contribute others, like the Google Workspace lane. Lane templates are different. A template is a capability that can be instantiated for a bounded purpose. A sandbox snapshot is different again: it is not a collaborator at all, just a saved filesystem starting point for a future sandbox-backed lane.
Interaction vs capability
Several competing pressures pushed me toward this architecture.
Responsiveness to surface interactions: The agent needs turns that are more or less bounded in complexity and that can avoid I/O entirely. This lets the agent respond quickly even when the underlying work takes quite some time.
Capability: The agent shouldn’t be limited in what it can accomplish just to keep turns efficient. It needs mechanisms to pursue complex tasks through delegation, and to observe and steer those tasks in close to real time.
Continuity: The agent should maintain a continuous perspective and personality. The best continuity comes from a single LLM conversation that is continually curated. In this way, the personality and short-term memory don’t need to be “added in”; instead they’re a side effect of the architecture.
These pressures pushed me into a design with multiple “lanes”, as you can see in the diagram above. The “foreground” lane is the LLM conversation users interact with through surface activity. But here, I have made a bet that is likely controversial: all activity across all surfaces goes through the same foreground conversation. Threads, channels, and even platforms are all collapsed. Right now, that cognitive complexity is perhaps beyond the ability of most models and perhaps even beyond the frontier. But I’m certain that will not be the case for long.
Input multiplexing
That does not mean one model turn per event. Surface messages, system reminders and lane messages accumulate as pending input. They are injected when the target lane can accept a user message: idle, or after a tool batch has flushed.
This is what decouples interactivity from activity volume. Ten things can happen and still become one coherent turn at the right boundary. The catch is that the foreground model has to understand recency, priority and interruption.
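To make the multiplexing idea concrete, here is a minimal TypeScript sketch. Every name in it (`PendingInput`, `LaneState`, `InputMultiplexer`, `drain`) is hypothetical and chosen for illustration; TorkBot's actual types and state names are not shown in this post. It assumes only what is described above: inputs accumulate, and they are injected as one user message when the lane is idle or a tool batch has flushed.

```typescript
// Hypothetical names; a sketch of the idea, not TorkBot's implementation.
interface PendingInput {
  source: "surface" | "reminder" | "lane";
  text: string;
}

// Assumed lane states. Only "idle" and "tools_flushed" can accept input.
type LaneState = "idle" | "model_running" | "tools_running" | "tools_flushed";

class InputMultiplexer {
  private pending: PendingInput[] = [];

  // Events accumulate regardless of what the lane is currently doing.
  push(input: PendingInput): void {
    this.pending.push(input);
  }

  // Returns one combined user message at a safe boundary, or null otherwise.
  drain(state: LaneState): string | null {
    const canAccept = state === "idle" || state === "tools_flushed";
    if (!canAccept || this.pending.length === 0) return null;
    const batch = this.pending;
    this.pending = [];
    // Arrival order is preserved so the model can reason about recency.
    return batch.map((p, i) => `[${i + 1}] (${p.source}) ${p.text}`).join("\n");
  }
}
```

In this sketch, ten `push` calls made while tools are running still produce a single string from the next successful `drain`, which is the "ten things, one turn" property. How priority and interruption are conveyed to the model is left out here.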
Part of my thesis with TorkBot is to bet on emergent behaviour and emergent intelligence. Coming up with systems that split LLM conversations across arbitrary platform-defined boundaries is antithetical to the continuity goal. I want my agent to make links across threads and even across surfaces. I want the agent to be able to trivially pick up on GitHub the work it started in Slack. If model intelligence is not there yet, I bet it soon will be, and the agentic system designed for that world will stand above the competition in intuitiveness and power.
How the octopus works
The octopus idea is doing actual work here. It is the shape of the harness problem.
This is not jumping on the sub-agent bandwagon for the sake of clout. This is a design that emerged and earned its existence. After all, it comes back to context management. Each appendage gets its own context.
The foreground hands off work to other lanes by ‘talking’ to them. Inter-lane communication is just text, betting on the idea that pre- and post-training skew heavily towards prose as the carrier of intent. The foreground picks a lane template (and, for a sandbox lane, a VM snapshot) and passes that lane an initial message describing what it wants. A lane that has already been spawned simply receives a new message.
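The handoff just described can be sketched roughly as follows. This is an illustration under assumptions, not TorkBot's API: `Lane`, `LaneRegistry`, `spawn` and `send` are hypothetical names, and templates and snapshots are reduced to plain string identifiers.

```typescript
// Hypothetical names; TorkBot's real API is not shown in this post.
interface Lane {
  id: string;
  template: string;
  snapshotId: string | null; // only set for sandbox-backed lanes
  inbox: string[];           // inter-lane communication is plain prose
}

class LaneRegistry {
  private lanes = new Map<string, Lane>();
  private nextId = 1;

  // Spawn a lane from a template and hand it its first message.
  spawn(template: string, initialMessage: string, snapshotId: string | null = null): Lane {
    const lane: Lane = {
      id: `lane-${this.nextId++}`,
      template,
      snapshotId,
      inbox: [initialMessage],
    };
    this.lanes.set(lane.id, lane);
    return lane;
  }

  // For a lane that already exists, a handoff is just another message.
  send(laneId: string, message: string): void {
    const lane = this.lanes.get(laneId);
    if (!lane) throw new Error(`unknown lane: ${laneId}`);
    lane.inbox.push(message);
  }
}
```

The point of the sketch is that there is no structured task schema between lanes. The only payload is a string, so whatever the foreground wants has to be expressed as prose.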
Lanes can own the messy work of doing a bunch of tool calls, hitting dead-ends, doing I/O and any number of more complex sandbox-enabled workflows. That mess stays contained in the lane’s context. Lanes communicate with each other via two mechanisms:
Chat, as described above; and
References to virtual filesystem artifacts via the lane’s ./shared folder.
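The second mechanism can be pictured as a lane writing a bulky artifact to its shared folder and sending only a path in chat. The sketch below uses a hypothetical in-memory `SharedFolder` as a stand-in for the virtual filesystem, and it assumes that a `./shared/...` path resolves to the same artifact for every lane; the file name and the `reportBack` helper are invented for illustration.

```typescript
// Hypothetical in-memory stand-in for the virtual filesystem.
class SharedFolder {
  private files = new Map<string, string>();

  // Stores an artifact and returns the reference that travels in chat.
  write(name: string, contents: string): string {
    const path = `./shared/${name}`;
    this.files.set(path, contents);
    return path;
  }

  // Any lane holding the reference can read the artifact back.
  read(path: string): string {
    const contents = this.files.get(path);
    if (contents === undefined) throw new Error(`no artifact at ${path}`);
    return contents;
  }
}

// A worker lane keeps the bulky output in a file and sends only a pointer.
function reportBack(shared: SharedFolder, findings: string): string {
  const path = shared.write("findings.md", findings);
  return `Investigation finished. Full notes are in ${path}.`;
}
```

The receiving lane's context grows by one short sentence rather than by the full artifact, and it can read the file only if it decides the details matter.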
Sandbox note
All lanes get a minimal sandbox-like environment thanks to just-bash. Sandbox lanes are the only lanes that get a full micro VM, thanks to @torkbot/sandbox. Whether in-memory or using a full Linux VM, all lanes share a common directory structure and virtual filesystem which allows them to pass references and share a common understanding.
The foreground conversation can stay continuous across surfaces, which is what I want for personality and cross-thread intuition, without becoming the place where every intermediate artifact goes to die. The...