The future of observability won't be one proprietary AI agent


The future of observability won’t be one proprietary AI agent. It will be thousands built by teams. | ClickHouse



Mike Shi<br>Jun 22, 2026 · 11 minutes read

The common vendor bet in observability right now is convergence.

One SRE agent, built into one platform, trained around one vendor’s view of how incidents should be investigated. It understands your telemetry, answers your questions, explains what went wrong, and eventually helps fix production before anyone opens a dashboard.

That version of the future will be useful, but it will also be too narrow.

Observability does not work like a generic support queue. Debugging is shaped by the systems a team owns, the way those systems fail, the data they trust, the runbooks they follow, the tools they use, and the operational scars they have accumulated over time. A database team, frontend team, payments team, and infrastructure team do not investigate production the same way.

In this post, we explore why we think the future of observability will not be one proprietary AI agent, but thousands built by teams like yours.

AI agents are becoming the new interface for observability

I hope most people would agree that predicting a shift to AI agents in observability is no longer particularly bold.

Today, most investigations begin with a human opening dashboards, searching logs, inspecting traces, and manually gathering context. Increasingly, that work is being delegated to agents. Models are already capable of querying telemetry, summarizing findings, identifying patterns, and generating plausible hypotheses about what is happening within a system.

As models continue to improve, the interface to observability will increasingly become human → agent → data instead of human → dashboard → data. Engineers will still decide what action to take, but much of the mechanical work of an investigation will happen before they ever touch a chart or write a query.

"We actually see a trend shift in our Slack incident channels. Engineers previously used to share links to logs or metrics. Now teams are sharing snippets of AI investigations and diving deep into it."

Anil K, DoorDash

AI changes the shape of observability workloads

Most discussions about observability agents focus on what the agents can do. Far less attention is given to what happens underneath when they become a normal part of the workflow.

A human investigator is relatively constrained. They open a handful of dashboards, run some queries, inspect a trace or two, and gradually narrow the search space. Even experienced engineers can only evaluate so many possibilities at once.

An agent has no such limitation. While an engineer may compare two time windows, an agent can compare twenty. While a human might manually investigate a few likely causes, an agent can pursue dozens of hypotheses simultaneously, continuously gathering evidence and eliminating dead ends as it goes.

The practical consequence is that investigations become broader and place greater demands on the underlying systems. Agents can examine more historical data and explore far more potential explanations before converging on an answer. The result is more queries, each of which needs a low-latency response.

"Agents could brute force it — make 10 queries instead of what, if I query, I would make two dashboard clicks. Which means our API layer or storage have to be robust to take that non-linear pattern of queries."

Anil K, DoorDash
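To make the change in workload shape concrete, here is a minimal sketch of the fan-out described above, assuming an agent issues one query per (hypothesis, time window) pair. The function names, the stub query, the concurrency cap, and the specific counts (3 causes over 2 windows for a human, 24 hypotheses over 20 windows for an agent) are illustrative assumptions, not measurements or a real API.

```python
import asyncio
from itertools import product


async def run_query(hypothesis: str, window: str) -> dict:
    """Hypothetical stand-in for a telemetry query; a real one would call a query API."""
    await asyncio.sleep(0)  # yield to the event loop, simulating I/O
    return {"hypothesis": hypothesis, "window": window}


async def investigate(hypotheses, windows, max_concurrency=8):
    """Run one query per (hypothesis, window) pair with bounded concurrency."""
    limit = asyncio.Semaphore(max_concurrency)

    async def bounded(h, w):
        async with limit:
            return await run_query(h, w)

    pairs = product(hypotheses, windows)
    return await asyncio.gather(*(bounded(h, w) for h, w in pairs))


# Assumed human-scale investigation: a few likely causes, two time windows.
human = asyncio.run(investigate(["deploy", "database", "network"], ["now", "1h ago"]))

# Assumed agent-scale investigation: dozens of hypotheses, twenty time windows.
agent = asyncio.run(
    investigate([f"hypothesis-{i}" for i in range(24)], [f"window-{i}" for i in range(20)])
)

print(len(human), len(agent))
```

Under these assumptions the query count grows with the product of hypotheses and windows, which is why the backing store sees a very different load even when the incident itself is the same.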

It also changes the requirements around the underlying data. Agents can only reason over the context they are given. If historical data has been discarded, important context may be missing. If telemetry has been heavily sampled, critical evidence may simply not exist in the dataset. Unlike experienced engineers, agents cannot compensate for these gaps with intuition or institutional knowledge. Their conclusions are constrained by the completeness and fidelity of the data available to them.
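The effect of sampling on rare evidence can be estimated with a short calculation. This sketch assumes uniform, independent sampling in which each event is kept with probability `sample_rate`; real samplers (for example, tail-based ones) behave differently, and the function name and example numbers are hypothetical.

```python
def survival_probability(sample_rate: float, occurrences: int) -> float:
    """Probability that at least one of `occurrences` events is retained.

    Assumes each event is kept independently with probability `sample_rate`.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    return 1.0 - (1.0 - sample_rate) ** occurrences


# Illustrative case: a failure that left 5 traces, sampled at 1%.
p = survival_probability(0.01, 5)
print(f"{p:.4f}")
```

With a 1% sample rate and five occurrences, the chance that any trace of the failure survives is under 5%, so an agent querying that dataset would most likely find nothing to reason over.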

Most observability platforms were designed around human investigators and the workloads they generate. The next generation will increasingly need to support investigations carried out on behalf of humans.

An industry betting on the great SRE agent

Many companies are building toward a simple vision of the future: the universal SRE agent.

This is a compelling idea. An observability vendor provides the platform, the data, and the agent. Engineers ask questions in natural language instead of learning dashboards, writing queries, or navigating telemetry. Over time, the agent becomes increasingly capable,...
