We built SmithDB, the data layer for agent observability
Ankush Gola
May 13, 2026
11 min
Key Takeaways

- Agent traces have outgrown traditional observability stores. Modern agent traces contain hundreds of nested spans, multi-modal content, and spans that stay open for hours, creating data volumes and query patterns that general-purpose databases were never designed to handle.
- SmithDB delivers industry-leading performance across every key observability workload. With P50 latencies of 92ms for trace tree loads, 400ms for full-text search, and 82ms for run filtering, it makes core LangSmith experiences up to 15x faster than before.
- A portable, scalable architecture built for enterprise needs. Backed by object storage with stateless ingestion and query services, SmithDB scales by adding compute rather than managing local disks, making it straightforward to deploy in self-hosted and multi-cloud environments.
We’re launching SmithDB, our purpose-built distributed database for agent observability that now backs core LangSmith workloads.

SmithDB gives LangSmith industry-leading performance across key observability workloads, the portability to run wherever customers need their data to live, and the flexibility to support agent-native query patterns that traditional observability stores were not designed for.
Agents present a new data problem

In agent observability, traces serve as the agent’s core behavioral record.

When LangSmith first launched in 2023, AI applications were relatively simple: teams were building RAG pipelines, prompt chains, and very early agents.

Since then, agents have become more widespread and longer-running, LLM context windows have grown dramatically, and workloads increasingly include multi-modal content such as images and audio.

As a result, the trace data associated with modern agents has exploded in both volume (number of traces) and size (individual payload size). A modern agent trace can have hundreds of deeply nested spans.
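To make "deeply nested spans" concrete, here is a minimal sketch of a trace stored as flat span records that point at their parents. The field names (`id`, `parent_id`, `name`) and the sample spans are invented for illustration and are not LangSmith's actual schema.

```python
from collections import defaultdict


def build_children(spans):
    """Index span ids by parent id so the tree can be walked top-down."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span["id"])
    return children


def max_depth(children, root_id):
    """Return the nesting depth of the trace, counting the root as depth 1."""
    deepest = 0
    stack = [(root_id, 1)]  # iterative walk, so very deep traces do not hit recursion limits
    while stack:
        span_id, depth = stack.pop()
        deepest = max(deepest, depth)
        for child_id in children.get(span_id, []):
            stack.append((child_id, depth + 1))
    return deepest


# Hypothetical five-span trace: an agent that calls an LLM and a tool,
# where the tool itself runs a sub-agent with its own LLM call.
spans = [
    {"id": "root", "parent_id": None, "name": "agent"},
    {"id": "s1", "parent_id": "root", "name": "llm_call"},
    {"id": "s2", "parent_id": "root", "name": "tool_call"},
    {"id": "s3", "parent_id": "s2", "name": "sub_agent"},
    {"id": "s4", "parent_id": "s3", "name": "llm_call"},
]

children = build_children(spans)
root_id = children[None][0]
trace_depth = max_depth(children, root_id)  # 4 for the sample above
```

Real traces follow the same shape at a much larger scale, which is why loading a full trace tree quickly is treated as its own workload.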
In addition to being large and nested, agent traces also arrive in pieces: a start event for an agent span can arrive minutes or even hours before its end event.

The query patterns needed to analyze this data have also become increasingly complex. Agent observability needs to support:

- Random access: instantly load an individual run or trace
- Interactive filtering: slice large trace datasets by metadata, feedback, latency, errors, tags, and time
- Full-text search: find phrases and patterns inside agent run inputs and outputs
- JSON filtering: query arbitrary user-defined metadata and structured tool outputs
- Tree-aware queries: filter based on root runs, child runs, or any node in a trace
- Thread reconstruction: rebuild long-running conversations across many agent traces instantly
- Aggregations: compute cost, latency, token usage, and evaluator scores for different filters

Supporting all of these at low latency, over large agent traces, with self-hosting and multi-cloud requirements, requires a fundamentally new architecture.

That is the motivation behind SmithDB.

Introducing SmithDB

SmithDB is LangSmith’s data layer, purpose-built for agent observability and evaluation workloads.

It is built in Rust and leverages the Apache DataFusion query engine and the Vortex file toolkit, with heavy customizations for LangSmith’s unique workloads.

At a high level, SmithDB is made of three components:

- Object storage for durable trace data
- A small Postgres metastore for segment metadata
- Stateless ingestion, query, and compaction services
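The three-component split above can be illustrated with a toy model. This is only a sketch under stated assumptions, not SmithDB's implementation: a Python dict stands in for object storage, a list of records stands in for the Postgres metastore, and the class name `ToyStore`, the `SegmentMeta` fields, and the JSON segment format are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SegmentMeta:
    """One metastore row describing an immutable segment (hypothetical schema)."""
    key: str
    min_ts: int
    max_ts: int
    num_runs: int


@dataclass
class ToyStore:
    objects: dict = field(default_factory=dict)    # stands in for object storage
    metastore: list = field(default_factory=list)  # stands in for the Postgres metastore
    next_id: int = 0

    def _write_segment(self, runs):
        """Persist runs as one segment object and register its metadata."""
        key = f"segments/{self.next_id:06d}.json"
        self.next_id += 1
        self.objects[key] = json.dumps(runs)
        timestamps = [r["ts"] for r in runs]
        self.metastore.append(
            SegmentMeta(key, min(timestamps), max(timestamps), len(runs))
        )
        return key

    def ingest(self, runs):
        """Ingestion keeps no state of its own: everything durable lands in objects + metastore."""
        return self._write_segment(sorted(runs, key=lambda r: r["ts"]))

    def query(self, start_ts, end_ts):
        """Use segment metadata to skip segments outside the window, then scan the rest."""
        hits = []
        for meta in self.metastore:
            if meta.max_ts < start_ts or meta.min_ts > end_ts:
                continue  # pruned without reading the segment object
            for run in json.loads(self.objects[meta.key]):
                if start_ts <= run["ts"] <= end_ts:
                    hits.append(run)
        return sorted(hits, key=lambda r: r["ts"])

    def compact(self):
        """Merge all small segments into one larger segment and drop the old objects."""
        if len(self.metastore) < 2:
            return
        old = self.metastore
        merged = [r for m in old for r in json.loads(self.objects[m.key])]
        self.metastore = []
        self._write_segment(sorted(merged, key=lambda r: r["ts"]))
        for m in old:
            del self.objects[m.key]


store = ToyStore()
store.ingest([{"id": "a", "ts": 10}, {"id": "b", "ts": 20}])
store.ingest([{"id": "c", "ts": 105}])
before = store.query(15, 110)  # runs "b" and "c"
store.compact()
after = store.query(15, 110)   # same result, now served from a single segment
```

Because the services in this sketch hold nothing beyond what is in the two stores, any number of ingest or query workers could run against the same data, which is the property the stateless design is meant to provide.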
Performance

Performance is not just a nice-to-have for observability. For both humans and agents, slow observability tools become a bottleneck in the agent development loop. SmithDB delivers leading performance across the key workloads that matter for agent observability and makes core LangSmith experiences up to 15x faster than before.
Workload            SmithDB latency
Trace tree load     P50 92ms / P99 595ms
Single run load     P50 71ms / P99 358ms
Runs filtering      P50 82ms / P99 434ms
Trace ingestion     P50 630ms / P99 1.47s
Full-text search    P50 400ms / P99 870ms
Threads filtering   P50 131ms / P95 268ms
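For readers less familiar with the P50/P99 notation in the table, the sketch below computes percentiles from raw latency samples using the nearest-rank definition. The post does not say which percentile method was used for the published figures, and the sample latencies here are made up purely for illustration.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the smallest sample such that at least p percent of samples are <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (not measurements from SmithDB).
latencies_ms = [70, 75, 80, 82, 85, 90, 95, 120, 300, 600]

p50 = percentile_nearest_rank(latencies_ms, 50)  # 85: half of requests finish within 85ms
p99 = percentile_nearest_rank(latencies_ms, 99)  # 600: dominated by the slowest request
```

P50 describes the typical request, while P99 captures tail latency, which is why a gap between the two numbers is expected.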
Portability

Because SmithDB is backed by object storage, there are no local disks to manage. Query and ingestion services are stateless. The system scales by adding compute, while durable data lives in object storage.

This makes SmithDB much easier to deploy in self-hosted and multi-cloud environments than traditional database clusters that require local disks and complex sharding.

SmithDB is now serving production traffic

Today:

- 100% of US Cloud ingestion goes to SmithDB
- 100% of tracing UI query traffic goes to SmithDB, including threads
- All major filters are backed by SmithDB, including metadata, feedback, text search, tree filters, and trace filters

Product...