Context Sculpting

Context Sculpting | Perception Theory

Context Sculpting

05 Jun, 2026

A few months ago, I was reading “The Anatomy of an Agent Harness” by Viv (@Vtrivedy10). It’s a deep dive on what a harness is, why it’s important, and which components make up a harness.

In some sense, being a software developer has always involved staying up to date on the latest developments in your field, and this was a very good overview of this “harness” concept that has emerged over the past year or so.

At some point while I was reading the article, a vision popped into my head: what if instead of treating the context window as an immutable, append-only conversation log, we let the model inspect and modify the context window itself?

It dawned on me how ingrained the append-only conversation history view of the context window had become since the release of ChatGPT. Just about every developer now has the mental model of a system prompt, a user message, and then an ever-growing list of user messages, assistant messages, tool calls, and tool call results.

But what if we dropped that assumption? What if we made the context window mutable, by the model itself?

My head was kind of spinning as I thought more about this. What if the model could identify when it was going down a wrong path, or if it was spinning its wheels and getting stuck? What if it could manage the context window itself, “auto-compacting” earlier parts of the history to prevent, or at least delay, exhausting the context window? What else could we do, or really could the model do, with this approach?

But the more I thought about it, the more unsure I became. Could a model really identify problems in the response it had just generated a few moments ago?

Initially, I was thinking that giving the model full edit control of the context window could lead to more efficient results, by avoiding going down wrong paths or getting stuck in a loop, or editing out earlier sections that were no longer relevant. But I started to feel uneasy about the idea, like maybe to make it work in practice, I would just be wrapping the existing context window with even more context explaining to the model what it could do. I wondered if the models would need to be trained on this kind of idea before being able to put it into practice effectively.

I started exploring the idea in a session with Claude, and we settled on the approach of having a larger model observe and edit the context window of a smaller, weaker model. Claude came up with a few suggestions for what to call this idea, and “context sculpting” was the one I settled on. It captured a certain quality of the idea that I liked.

Quick aside: a week or two after I was playing around with this idea, I saw Anthropic post about the “advisor strategy”, where you “[p]air Opus as an advisor with Sonnet or Haiku as an executor, and get near Opus-level intelligence in your agents at a fraction of the cost.” So maybe my idea of context sculpting wasn’t completely crazy.

Source: https://x.com/claudeai/status/2042308622181339453

Another quick aside: I was thinking a lot about recursive language models (RLMs) around the time I read Viv’s article, so that probably had some effect on the context sculpting idea as well.

I decided I wanted to find out if this idea was feasible. I must admit, initially I was harboring delusions of grandeur, that I could design a rigorous experiment, write up the results in a paper, get accolades from AI researchers on X, etc. I quickly scaled back my ambitions and shifted to what might be better described as a small “engineering case study”. Personally, I like to think of it as “vibe research”.

I worked with Claude on a plan, and this was the core question my “vibe research” aimed to address:

What happens if you give one model permission to rewrite another model's working context between turns?

I suggested we build a custom harness using the Pi agent harness framework because I like its minimalistic and highly extensible approach, and Claude agreed because Pi has all the hooks needed to implement this idea.

We called the more capable model the “outer agent”, and the smaller model the “inner agent”. After settling on what felt like a solid plan with Claude, I switched over to Codex for the actual implementation and experiment runs.

Codex also took the liberty of drafting a blog post. To be clear, I did not just copy/paste the model’s blog post here - these are my own words and thoughts - but I thought it’d also be helpful to intersperse some of Codex’s writing here. This is the section on project architecture, for example:

The harness is a simple two-layer loop:

An inner agent works on the actual task.

After every completed inner turn, an outer agent inspects the full inner context.

The outer agent chooses one of four actions:<br>pass_through

rewrite_context

rollback

terminate

At a high level, the loop looks like this:

while...

Context Sculpting

Related Articles

The Newest Instagram "Exploit" Is the Goofiest I've Seen

It's Not Just X. It's Y

Amazon, Facebook, FBI have access to a private intelligence-sharing network

Show HN: GoPeek – open links in live mini browser windows without new tabs

Agent Memory: An Anatomy