I built a (small) long running agent

I built a (very small) long running agent - by Jason Ganz

The Analytics Engineering Roundup

Jason Ganz May 31, 2026

Last week, Tristan wrote about his experiment with agent swarms. I wanted to follow that up with an experiment in another emerging agent modality: long running agents.

Why? We like to make sure our beloved readers of the Roundup are kept as up to date as possible about where the underlying technologies that power data teams are going. And, if you’ll forgive me for saying so, we do a pretty good job at it. Regular readers of this newsletter learned about the rise of agents for useful data work in 2023. It took until about last November to get to a world where the agents we described there became common in production, helped along the way by improvements in the underlying models, improvements in the harnesses running them, and new open standards like MCP and agent skills powering these agents. It’s pretty incredible to see the world we saw a blurry picture of in 2023 become a reality, and today we’re casting that net out again.

So now that we’re in a world where agents can reliably perform tasks with data, what do we do with that? There are a number of patterns. Some are already common today:

Set up systems to perform one-off tasks on some trigger. The best example of this is the increasingly common analytics agent Slack channel, where a business user comes with a question, kicks off an analytics agent and receives an answer.

Augment and accelerate human work via paired development with an agent. This is what happens when you are building a dbt project alongside an agentic harness such as Claude Code or the dbt Developer Agent.

Then there are some that we are seeing glimmers of, but that aren’t as widespread just yet:

Multiagent systems that monitor complex workflows and take targeted action when warranted. Tristan wrote about this last week in his post “I built a (very small) agent swarm”.

Long running agents that are able to handle substantially more complex goals than current agents. An example of this would be “migrate my entire dbt project to Iceberg” or “refactor my dbt project from Star Schema to Data Vault”.

Anyone who’s familiar with current agentic systems would rightly be very uncomfortable trying to do this. But is it possible, or on a near trajectory to becoming possible?

We are in the early innings of long running agents

I’ve been obsessed with long running agents ever since Cursor put out a blog post showing how they had built a web browser from scratch earlier this year. This post blew my mind because it completely upended my upper bound for the level of complexity an agent could handle - this is not a single task, this is something that would take teams of dedicated software engineers weeks or months to complete. This was followed shortly afterwards by Anthropic’s clean room implementation of a C compiler.

These are research projects and come with a ton of caveats. First, these are both extremely verifiable tasks, so the agent could always track its success against an external oracle. Second, the end results of these reimplementations, while impressive, aren’t something you’d actually use. Still, these projects make it clear that for some subset of highly verifiable projects, agents can do much, much more than we think they can today.

A popular variant of the long running agent is the “Ralph Loop”, where you give an agent a goal and it keeps going until it has verified that it’s done it. This allows an agent to continue iterating on a problem for much longer than it previously would have. Recently the major harnesses have been implementing this with the inclusion of the goal command (ex 1, ex 2).

So I figured it was time to bring long running agents to the data world and see what they can do. And that’s how I ended up building Tinyberg.

The simplest variant of a long running agent
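If you haven’t seen a goal-driven loop before, here is a rough sketch of the general shape: run one agent turn, check the result against a verifier, feed any failures back in, and stop when the verifier is satisfied or a turn budget runs out. This is not how any particular harness implements its goal command. The names run_goal_loop, agent_step and verify are hypothetical, and the toy agent and toy oracle below are stand-ins for a real model call and a real reference implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    done: bool            # True if the verifier accepted the work
    iterations: int       # number of agent turns actually taken
    feedback: List[str]   # verifier failures collected from rejected turns


def run_goal_loop(
    agent_step: Callable[[str, List[str]], str],
    verify: Callable[[str], List[str]],
    goal: str,
    max_iterations: int = 10,
) -> LoopResult:
    """Repeat agent turns until verify() reports no failures or the budget is spent."""
    feedback: List[str] = []
    for i in range(1, max_iterations + 1):
        output = agent_step(goal, feedback)  # hypothetical: one agent turn
        failures = verify(output)            # hypothetical: check against the oracle
        if not failures:
            return LoopResult(True, i, feedback)
        feedback.extend(failures)            # the next turn sees what went wrong
    return LoopResult(False, max_iterations, feedback)


def toy_agent(goal: str, feedback: List[str]) -> str:
    # Stand-in for a real agent call: each turn gets one character closer.
    return "x" * (len(feedback) + 1)


def toy_verify(output: str) -> List[str]:
    # Stand-in oracle: accepts only the reference answer "xxx".
    return [] if output == "xxx" else [f"expected 'xxx', got {output!r}"]


result = run_goal_loop(toy_agent, toy_verify, goal="produce xxx")
print(result.done, result.iterations)  # prints: True 3
```

The turn budget matters as much as the verifier: without max_iterations, an agent that never satisfies the oracle would simply run forever.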

I settled on a project to build a clean room, read-only implementation of an Iceberg table inspector. The actual implementation was done in as simple a format as possible - I used the goal functionality in Claude Code to ask it to build out an implementation of the Iceberg spec using no external references other than an oracle (aka a verifier), which would test the output of this system against the Iceberg implementation.

Here’s the exact workflow I ran:

Determine what to build and create a spec: I collaborated with GPT 5.5 on the initial project scope and idea, and we co-wrote a simple spec as our end goal.

Build Tinyberg: a deliberately small, read-only Apache Iceberg table inspection and scan-planning library.

The goal is to implement enough of the Iceberg table metadata model to understand a local Iceberg table directory without relying on DuckDB, Spark, PyIceberg, or any existing Iceberg reader in the implementation.

Tinyberg should be able to:

1. Load a local Iceberg table from disk - Locate and parse the table...
