
Building Agents in Go Without a Framework

A production agent is a long-running, concurrent, I/O-bound process that spends most of its time waiting on a model, a tool, or a human. That shape fits Go's runtime. This post explains why, surveys the Go framework options, and shows how to build an agent without one.

Daniel Chalef

18 Jun 2026 • 11 min read

Key takeaways

A production agent is a long-running, concurrent, I/O-bound process. That shape fits Go's runtime: goroutines start at about 2KB each, channels carry coordination and streaming, and context.Context cancels a run across every library at once.

Go ships as one static binary, which removes interpreter pinning and virtual-environment reconstruction from deployment. The difference shows most at the edge and in customer-managed environments.

The Go framework field is real but young. ADK Go reached 1.0 in November 2025, alongside Genkit Go (Firebase) and Eino (ByteDance). Most teams skip frameworks, because the agent harness (the loop that drives the model) is about forty lines of Go.

The major model vendors support Go. OpenAI, Anthropic, and Google ship native Go SDKs, and the official openai-go client also talks to any OpenAI-compatible endpoint (vLLM, Ollama, OpenRouter) through a base URL.

Components attach to the loop one at a time: the official MCP Go SDK (v1.6.0) for tools, Zep's Go SDK for agent memory with sub-200ms retrieval, and Hatchet or Temporal for durable execution.

Go fits the agent runtime: the loop that calls models, runs tools, and stays alive for minutes or hours gets concurrency, cancellation, and single-binary deployment without a framework.

Most writing about AI agents assumes Python or TypeScript: the frameworks, the tutorials, and the example repositories. In production, a growing number of teams write the agent itself in Go and put it behind a React / TypeScript front end.

Zep is one of them. Much of Zep is written in Go. The choice is pragmatic rather than dogmatic: our custom inference servers are written in Rust, and Graphiti, our open source graph framework, is written in Python. Each component runs in the language that fits its workload. For the agent runtime, that language is Go.

The shape of a production agent run

Once an agent has real users, its runtime has a consistent profile. Runs are long, lasting seconds to hours rather than the milliseconds of a web request. They are expensive: the agent drives work that would otherwise need a human operator, such as development environments, browser sessions, and document processing, so an abandoned run is wasted spend. They spend most of their wall-clock time waiting on a model, a tool, or a human. They run concurrently, many at once, each in a different state.

This is a different workload from the request-response service most backend tooling targets: many concurrent, long-lived, I/O-bound processes.

How Go's model maps to that workload

Go was built for concurrent network services, and the agent workload is a concurrent network service with longer-lived units of work.

Goroutines are cheap. Each starts with about 2KB of stack, and the scheduler runs them across all CPU cores. Runs are I/O-bound, so one goroutine per run costs little and your own runtime is rarely the bottleneck. The ceiling is usually upstream provider rate limits and the memory each run holds (conversation history, open connections), not goroutine count. A CPU-bound moment, such as deserializing a large tool result, does not stall the whole process the way it can in a single-threaded runtime such as Node or Python.

Figure: Inside one process: each run is a goroutine costing about 2KB, mostly parked on I/O. The scheduler multiplexes the few runnable ones onto a handful of cores, so your runtime is rarely the bottleneck.

Channels carry coordination. An agent often needs to stream partial output to a user while waiting on the next model call, or pass control between sub-agents. Channels model this directly, and a run can be written as a stateless step that takes messages in and returns messages out, so any worker can pick up the next step.

context.Context cancels work across the whole call tree. When a user stops a run that has already cost ten dollars, you cancel one context and the in-flight request and every downstream tool stop. You still pay for the tokens already generated, but the expensive downstream work halts. The cancellation convention is more uniform in Go than in Python or Node, and goleak catches the libraries that ignore it.

One static binary is the deployment artifact. There is no...
