The Anatomy of an Agent Harness

Learn

DocsCompany

PricingTry LangSmith

Get a demo

Try LangSmith

Get a demo

Agent Architecture

The Anatomy of an Agent Harness

Vivek Trivedy

March 10, 2026

12 min

Go back to blog

Create agents

Key Takeaways Break down complex objectives: Planning tools let agents decompose tasks, track progress, and adapt as they learn Delegate work in parallel: Spawn subagents for independent subtasks, each with isolated context

By Vivek Trivedy TLDR: Agent = Model + Harness. Harness engineering is how we build systems around models to turn them into work engines. The model contains the intelligence and the harness makes that intelligence useful. We define what a harness is and derive the core components today's and tomorrow's agents need. Can Someone Please Define a "Harness"? Agent = Model + Harness If you're not the model, you're the harness. A harness is every piece of code, configuration, and execution logic that isn't the model itself. A raw model is not an agent. But it becomes one when a harness gives it things like state, tool execution, feedback loops, and enforceable constraints. Concretely, a harness includes things like: System Prompts Tools, Skills, MCPs + and their descriptions Bundled Infrastructure (filesystem, sandbox, browser) Orchestration Logic (subagent spawning, handoffs, model routing) Hooks/Middleware for deterministic execution (compaction, continuation, lint checks) There are many messy ways to split the boundaries of an agent system between the model and the harness. But in my opinion, this is the cleanest definition because it forces us to think about designing systems around model intelligence. The rest of this post walks through core harness components and derives why each piece exists working backwards from the core primitive of a model.

Why Do We Need Harnesses. From a Model's Perspective There are things we want an agent to do that a model cannot do out of the box. This is where a harness comes in. Models (mostly) take in data like text, images, audio, video and they output text. That's it. Out of the box they cannot: Maintain durable state across interactions Execute code Access realtime knowledge Setup environments and install packages to complete work These are all harness level features . The structure of LLMs requires some sort of machinery that wraps them to do useful work.For example, to get a product UX like "chatting", we wrap the model in a while loop to track previous messages and append new user messages. Everyone reading this has already used this kind of harness. The main idea is that we want to convert a desired agent behavior into an actual feature in the harness. Working Backwards from Desired Agent Behavior to Harness Engineering Harness Engineering helps humans inject useful priors to guide agent behavior. And as models have gotten more capable, harnesses have been used to surgically extend and correct models to complete previously impossible tasks. We won’t go over an exhaustive list of every harness feature. The goal is to derive a set of features from the starting point of helping models do useful work. We’ll follow a pattern like this: Behavior we want (or want to fix) → Harness Design to help the model achieve this.

Filesystems for Durable Storage and Context Management We want agents to have durable storage to interface with real data, offload information that doesn't fit in context, and persist work across sessions. Models can only directly operate on knowledge within their context window. Before filesystems, users had to copy/paste content directly to the model, that’s clunky UX and doesn't work for autonomous agents. The world was already using filesystems to do work so models were naturally trained on billions of tokens of how to use them. The natural solution became: Harnesses ship with filesystem abstractions and tools for fs-ops. The filesystem is arguably the most foundational harness primitive because of what it unlocks: Agents get a workspace to read data, code, and documentation. Work can be incrementally added and offloaded instead of holding everything in context. Agents can store intermediate outputs and maintain state that outlasts a single session. The filesystem is a natural collaboration surface. Multiple agents and humans can coordinate through shared files. Architectures like Agent Teams rely on this. Git adds versioning to the filesystem so agents can track work, rollback errors, and branch experiments. We revisit the filesystem more below, because it turns out to be a key harness primitive for other features we need. Bash + Code as a General Purpose Tool We want agents to autonomously solve problems without humans needing to pre-design every tool. The main agent execution pattern today is a ReAct loop, where a model reasons, takes an action via a tool call, observes the result, and repeats in a while loop. But harnesses can only execute the tools they have logic for. Instead of...

The Anatomy of an Agent Harness

Related Articles

Elevated error rates on requests to multiple models

Donald Trump and sons to be 'forever' exempt from tax audits

PopuLoRA: Co-Evolving LLM Populations for Reasoning Self- Play

Old Reddit Is Down

The ultimate female fantasy – A feminist critique of Beauty and the Beast