SmithDB, the data layer for agent observability

We built SmithDB, the data layer for agent observability


Ankush Gola

May 13, 2026

11 min


Key Takeaways

- Agent traces have outgrown traditional observability stores: modern agent traces contain hundreds of nested spans, multi-modal content, and spans that stay open for hours, creating data volumes and query patterns that general-purpose databases were never designed to handle.
- SmithDB delivers industry-leading performance across every key observability workload: with P50 latencies of 92ms for trace tree loads, 400ms for full-text search, and 82ms for run filtering, it makes core LangSmith experiences up to 15x faster than before.
- A portable, scalable architecture built for enterprise needs: backed by object storage with stateless ingestion and query services, SmithDB scales by adding compute rather than managing local disks, making it straightforward to deploy in self-hosted and multi-cloud environments.

We’re launching SmithDB, our purpose-built distributed database for agent observability that now backs core LangSmith workloads. SmithDB gives LangSmith industry-leading performance across key observability workloads, the portability to run wherever customers need their data to live, and the flexibility to support agent-native query patterns that traditional observability stores were not designed for.

Agents present a new data problem

In agent observability, traces serve as the agent’s core behavioral record. When LangSmith first launched in 2023, AI applications were relatively simple: teams were building RAG pipelines, prompt chains, and very early agents. Since then, agents have become more common and longer running, LLM context windows have grown dramatically, and workloads increasingly contain multi-modal content, such as images and audio. As a result, the trace data associated with modern agents has exploded in both volume (number of traces) and size (individual payload size). A modern agent trace can have hundreds of deeply nested spans.

In addition to being large and nested, agent traces also arrive in pieces: a start event for an agent span can arrive minutes or even hours before its end event. The query patterns needed to analyze this data have also grown more complex. Agent observability needs to support:

- Random access: instantly load an individual run or trace
- Interactive filtering: slice large trace datasets by metadata, feedback, latency, errors, tags, and time
- Full-text search: find phrases and patterns inside agent run inputs and outputs
- JSON filtering: query arbitrary user-defined metadata and structured tool outputs
- Tree-aware queries: filter based on root runs, child runs, or any node in a trace
- Thread reconstruction: rebuild long-running conversations across many agent traces instantly
- Aggregations: compute cost, latency, token usage, and evaluator scores for different filters

Supporting all of these, at low latency, over large agent traces, with self-hosting and multi-cloud requirements, requires a fundamentally new architecture. That is the motivation behind SmithDB.

Introducing SmithDB

SmithDB is LangSmith’s data layer, purpose-built for agent observability and evaluation workloads. It is built in Rust and leverages the Apache DataFusion query engine and the Vortex file toolkit, with heavy customizations for LangSmith’s unique workloads. At a high level, SmithDB is made of three components:

- Object storage for durable trace data
- A small Postgres metastore for segment metadata
- Stateless ingestion, query, and compaction services
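To make the three-part layout more concrete, here is a minimal Python sketch of how a stateless query step might consult a metastore to decide which objects to read. This is an illustration under stated assumptions, not SmithDB's actual implementation (which is written in Rust). The `Segment` fields, the `Metastore` class, the `plan_query` function, the object keys, and the idea of pruning segments by time range are all hypothetical; the post only says that the metastore holds segment metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical metastore row describing one file in object storage."""
    object_key: str  # assumed: location of the segment file in object storage
    min_ts: int      # assumed: earliest run timestamp in the segment (epoch seconds)
    max_ts: int      # assumed: latest run timestamp in the segment (epoch seconds)


class Metastore:
    """In-memory stand-in for the small Postgres metastore (metadata only)."""

    def __init__(self) -> None:
        self._segments: list[Segment] = []

    def register(self, segment: Segment) -> None:
        # An ingestion or compaction service would call this after writing a file.
        self._segments.append(segment)

    def segments_overlapping(self, start: int, end: int) -> list[Segment]:
        # Keep segments whose [min_ts, max_ts] range intersects [start, end].
        return [s for s in self._segments if s.min_ts <= end and s.max_ts >= start]


def plan_query(metastore: Metastore, start: int, end: int) -> list[str]:
    """Stateless planning step: list the object keys to fetch for a time window."""
    return sorted(s.object_key for s in metastore.segments_overlapping(start, end))


metastore = Metastore()
metastore.register(Segment("traces/seg-001", min_ts=0, max_ts=99))
metastore.register(Segment("traces/seg-002", min_ts=100, max_ts=199))
metastore.register(Segment("traces/seg-003", min_ts=200, max_ts=299))

# Only the two segments that overlap the window need to be read.
print(plan_query(metastore, start=150, end=220))
# → ['traces/seg-002', 'traces/seg-003']
```

Because the planner keeps no state of its own in this sketch, any number of identical query workers could run it against the same metastore and object store, which is the property that lets a design like this scale by adding compute.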

Performance

Performance is not just a nice-to-have for observability. For both humans and agents, slow observability tools become a bottleneck in the agent development loop. SmithDB delivers leading performance across the key workloads that matter for agent observability and makes core LangSmith experiences up to 12x faster than before.

Workload | SmithDB latency
Trace tree load | P50 92ms / P99 595ms
Single run load | P50 71ms / P99 358ms
Runs filtering | P50 82ms / P99 434ms
Trace ingestion | P50 630ms / P99 1.47s
Full-text search | P50 400ms / P99 870ms
Threads filtering | P50 131ms / P95 268ms
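For readers less familiar with the notation, P50 is the median latency and P99 is the latency that 99% of requests fall at or under. The sketch below computes both with the nearest-rank method. The `percentile` helper and the sample latencies are illustrative assumptions, not SmithDB measurements, and the post does not say how the figures above were computed.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (made up for illustration).
latencies_ms = [70, 75, 80, 82, 85, 90, 95, 120, 300, 600]

print(percentile(latencies_ms, 50))  # → 85
print(percentile(latencies_ms, 99))  # → 600
```

In this made-up sample the median sits near the fast requests while P99 is set by the single slowest one, which is why tail percentiles are reported alongside P50.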

Portability

Because SmithDB is backed by object storage, there are no local disks to manage. Query and ingestion services are stateless. The system scales by adding compute, while durable data lives in object storage. This makes SmithDB much easier to deploy in self-hosted and multi-cloud environments than traditional database clusters that require local disks and complex sharding.

SmithDB is now serving production traffic

Today:

- 100% of US Cloud ingestion goes to SmithDB
- 100% of tracing UI query traffic goes to SmithDB, including threads
- All major filters are backed by SmithDB, including metadata, feedback, text search, tree filters, and trace filters

Product...
