Build Agents, Not Pipelines

alexsanjoseph1 pts0 comments

Build agents, not pipelinesThere are only two ways to use LLMs in a computer program: as part of a pipeline, or as an agent. In other words, either you express the control flow of the program in code, or you give a LLM tools and allow it to manage the control flow itself1.

Here’s how you might structure a trivial “summarize a bunch of information and email it to me” program as a pipeline:

context = gather_context(various, data, sources)<br>llm_response = llm_summarize(context)<br>summary = parse(llm_response)<br>email_me(summary, my_email)

And here’s how you’d do it as an agent:

read_data_tool = build_read_data_tool(various, data, sources)<br>email_tool = build_email_tool(my_email)<br>run_agent(tools: [read_data_tool, email_tool])

It’s like the difference between a library and a framework. When you use a library, you define the structure of the program yourself, and call out to various library helpers along the way. When you use a framework, the main structure of the program lives in the framework, and it calls your code at various points. There are tradeoffs involved in both approaches. Frameworks let you get started more quickly and typically give you features “for free”, but can be difficult when you want to do something that isn’t part of the framework’s design. Libraries give you a lot more control, but require you to write (and maintain) more boilerplate code.

In the trivial case, the distinction between a pipeline and an agent melts away. If you only have a few paragraphs of possible context for the problem, an agent with a gather_context and an email_me tool will perform exactly the same steps as a pipeline that calls a reasoning model with the context injected into the prompt (i.e. the agent will reproduce the trivial control flow of your pipeline). But when you have more context than will fit into a single prompt, or you want to take an action and then react to the result, the choice between pipelines and agents becomes very significant.

Predictability, flexibility and intelligence

Pipelines are more predictable, but agents are more flexible . When you give a problem to an agent, work stops when the LLM thinks it’s done. Depending on the perceived difficulty of the problem, this can take anywhere from a few LLM turns to hundreds (and thus cost anywhere from a few cents to many dollars). If you’re building something intended to run at scale, this unpredictability can be a nightmare. Any subtle change to the user data could cause the LLM to take twice as long on each task, which would double your latency2 and cost.

Pipelines are only immune to this problem if they don’t use reasoning models, or don’t allow the model to “think out loud” in its output tokens (for instance, by using structured output). However, individual LLMs offer much tighter control over model reasoning than over how long an agentic loop will take. In all frontier model APIs, you can explicitly set the level of reasoning you want. That doesn’t give you total control, but it does cap “take longer” at maybe ten or twenty percent (instead of with agents, where it can be 2x or more).

Why use agents, then? Agents are smarter . If you’re happy to accept the unpredictability, an agentic system can handle much more difficult tasks, by virtue of being able to loop for longer, and to gather more information after thinking about the problem. There’s a reason that the most successful AI products (coding agents like Claude Code, Codex, Cursor, and Copilot3) are agents: coding is a hard enough task that you simply cannot build a functional coding agent with pipelines.

Context-gathering

The context-gathering stage is far more delicate for pipelines than for agents . If an agent is trying to solve a problem and realizes it needs more data, it can simply go and get it. But for a pipeline, all the required data has to be present in the context already, because the LLM only gets to run once.

Much of the work involved in building pipelines is in getting context-gathering right. Agents are much easier. For instance, with a coding agent, you can basically just provide a “grep” and “read file” tool and let the agent figure out what chunks of code are relevant to the current file. In a pipeline, you have to figure that out yourself: good luck, it’s an unsolved technical problem! Typically you’ll end up doing some set of clever tricks, like walking the AST to identify which parts of code “contribute” to the current file, or indexing the whole codebase with semantic embeddings and doing some kind of nearest-neighbor search to build the context (called RAG, or “retrieval-augmented generation”). Neither of these will work as well as using an agent.

In 2023 and 2024, many people believed that RAG would solve context-gathering. Every LLM would have a fully-indexed context base that would magically surface the precise information the LLM needed at any given moment. This did not happen. Instead, we went backwards, getting our agents to do plain-text search and...

agents context agent pipelines pipeline problem

Related Articles