Context engineering for analytics agents: six months of building and rebuilding

Analytics agents need context. Great. How should you structure it? Lessons from six months of building, testing, and rebuilding it.

Aloÿs Augustin and Matthieu Blandineau June 29, 2026

Analytics agents need context. Great. How should you structure it?

Over the past six months, we have built and rebuilt the context behind our analytics agents several times. We started with a structured entity graph, machine-friendly definitions, and a layer that checked each query plan against the model and refused anything it could not ground, before generating SQL.

We liked it. Then we removed most of it.

We only kept the part that mattered: when the agent cannot ground an answer, it refuses instead of guessing. The problem was not a lack of context. It was deciding where each fact belonged, how the agent would find it, and how that fact could change without leaving stale copies behind.

The rule that matters most here is more about maintenance than about pure context quality: one fact, one authoritative authored home. The same fact can appear in generated views or be referenced elsewhere, but there should be only one place where it is edited. This is what keeps maintenance manageable.

Our current context structure: a navigation tree

(root) Company data                      global context
├── customers                            domain
│   ├── CUSTOMERS                        table
│   │   ├── CUSTOMER_ID                  column
│   │   └── COUNTRY                      column
│   ├── CUSTOMER_ACTIVITY                table
│   ├── active_customers                 metric
│   └── defining-active-customers        additional context document
└── revenue                              domain
    ├── INVOICES                         table
    ├── PAYMENTS                         table
    ├── net_revenue                      metric
    └── measuring-revenue                additional context document

This is the navigation tree, not the complete storage format. Every element has an address, its path from the root, and any document can link to any other element by that address. So the tree is really a directed graph: a document under customer activity can point straight at a revenue metric, not only at its own parent.
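To make the address idea concrete, here is a minimal sketch of a tree whose nodes are keyed by their path from the root and can also carry cross-links. The `Node` class, the `/` separator, and the example link from `defining-active-customers` to `net_revenue` are assumptions made for illustration, not the actual storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    path: str   # address: path from the root; "" stands for the root itself
    kind: str   # e.g. "domain", "table", "metric", "document"
    links: list = field(default_factory=list)  # addresses this node points at


def build_index(nodes):
    """Map each address to its node; reject duplicates and dangling links."""
    index = {}
    for node in nodes:
        if node.path in index:
            raise ValueError(f"duplicate address: {node.path}")
        index[node.path] = node
    for node in nodes:
        for target in node.links:
            if target not in index:
                raise ValueError(f"{node.path} links to unknown address {target}")
    return index


def parent(path):
    """Parent address; top-level nodes have the root ("") as parent."""
    return path.rpartition("/")[0]


def neighbors(index, path):
    """Outgoing edges of a node: its tree children plus its explicit links."""
    children = sorted(p for p in index if p and parent(p) == path)
    return children + list(index[path].links)


# Hypothetical subset of the tree shown above.
index = build_index([
    Node("", "global context"),
    Node("customers", "domain"),
    Node("customers/CUSTOMER_ACTIVITY", "table"),
    Node("customers/defining-active-customers", "document",
         links=["revenue/net_revenue"]),
    Node("revenue", "domain"),
    Node("revenue/net_revenue", "metric"),
])
```

With this shape, `neighbors(index, "customers/defining-active-customers")` reaches the revenue metric directly, which is what turns the tree into a directed graph.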

The agent follows those links the way you would follow references in a wiki. Tables and metrics are assigned to domain paths, columns sit inside tables, and joins are structured links with column pairs, conditions, and cardinality. Each domain can also carry markdown for the business context that does not fit into those fields.
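A structured join of this kind could look roughly like the sketch below. The field names, the cardinality vocabulary, and the `STATUS` condition are hypothetical; the only thing taken from the text is that a join records column pairs, conditions, and cardinality.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Join:
    left_table: str
    right_table: str
    column_pairs: Tuple[Tuple[str, str], ...]  # (left column, right column)
    cardinality: str                           # assumed vocabulary, see below
    condition: Optional[str] = None            # extra predicate, if any

    def on_clause(self) -> str:
        """Render the join predicate from the column pairs and condition."""
        parts = [
            f"{self.left_table}.{left} = {self.right_table}.{right}"
            for left, right in self.column_pairs
        ]
        if self.condition:
            parts.append(f"({self.condition})")
        return " AND ".join(parts)

    def fans_out(self) -> bool:
        """True when joining from the left table can multiply its rows."""
        return self.cardinality in ("one_to_many", "many_to_many")


# Hypothetical join; the column and status names are illustrative only.
customers_to_invoices = Join(
    left_table="CUSTOMERS",
    right_table="INVOICES",
    column_pairs=(("CUSTOMER_ID", "CUSTOMER_ID"),),
    cardinality="one_to_many",
    condition="INVOICES.STATUS <> 'void'",
)
```

Keeping cardinality as data lets a consumer check `fans_out()` before aggregating, instead of relying on prose to warn about row duplication.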

A synonym for COUNTRY goes on the column. The grain of CUSTOMERS goes on the table. The formula for active_customers goes on the metric. A rule spanning invoices and payments goes in measuring-revenue. Only rules that apply everywhere sit at the root.

For everything else, we move up this ladder until we find the lowest level that fully owns the fact:

column synonym
└── column description or note
    └── table description
        └── domain context
            └── root context

Markdown carries the connective tissue: how several tables work together, which path answers a question, or how to disambiguate two definitions. It should not repeat schema or joins the agent already receives in structured form.
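One way to approximate the placement ladder is a lowest-common-ancestor lookup over addresses: given every element a fact touches, find the deepest path that contains them all. This is a sketch under the assumption that addresses are `/`-separated paths; `lowest_owner` is a hypothetical helper, and deciding whether a level "fully owns" a fact still takes human judgment.

```python
def lowest_owner(paths):
    """Deepest address that is an ancestor of, or equal to, every given path."""
    if not paths:
        raise ValueError("a fact must touch at least one element")
    split = [p.split("/") for p in paths]
    common = []
    # Walk the paths segment by segment and stop at the first disagreement.
    for segments in zip(*split):
        if len(set(segments)) != 1:
            break
        common.append(segments[0])
    return "/".join(common)  # "" means the root context


# A synonym touches one column, so it stays on that column.
column_fact = lowest_owner(["customers/CUSTOMERS/COUNTRY"])

# A rule spanning invoices and payments belongs at the revenue domain level.
revenue_rule = lowest_owner(["revenue/INVOICES", "revenue/PAYMENTS"])

# A rule spanning both domains can only live at the root.
global_rule = lowest_owner(["customers/CUSTOMER_ACTIVITY", "revenue/net_revenue"])
```

The result names the level, not the file: a rule resolved to `revenue` would be written in that domain's context document, such as `measuring-revenue` in the tree above.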

We first used flat domains that mostly mirrored warehouse tables. It looked clean until a rule spanned three tables. Nested domains, including tableless ones, gave those rules an address. The join between customer activity and revenue stays in structured metadata; the domain context explains when crossing it is correct for the business question.

Let the agent navigate the same structure

At query time, the agent starts with a small map of the root domains and a single tool to explore the context. It requests a path and receives that node’s context, children, table summaries, and metrics, then asks for detailed schemas and joins only when needed. This is model-directed routing over a deterministic graph, not a search or grep tool that hands back a pile of more or less relevant matches for the agent to sort through.
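The exploration loop described above can be sketched as a single function over an in-memory store. Everything here is an assumption for illustration: the `STORE` layout, the `detail` flag used to request schemas and joins, and the column names and summaries.

```python
# Hypothetical context store keyed by address; "" is the root.
STORE = {
    "": {
        "context": "Rules that apply to every question.",
        "children": ["customers", "revenue"],
    },
    "customers": {
        "context": "How customers and their activity are defined.",
        "children": [],
        "tables": {
            "CUSTOMERS": {
                "summary": "One row per customer.",
                "columns": ["CUSTOMER_ID", "COUNTRY"],
                "joins": [],
            },
        },
        "metrics": ["active_customers"],
    },
    "revenue": {
        "context": "How invoices and payments combine into revenue.",
        "children": [],
        "tables": {
            "INVOICES": {
                "summary": "One row per invoice.",
                "columns": ["INVOICE_ID", "CUSTOMER_ID", "AMOUNT"],
                "joins": ["INVOICES.INVOICE_ID = PAYMENTS.INVOICE_ID"],
            },
        },
        "metrics": ["net_revenue"],
    },
}


def explore(path="", detail=False):
    """Return one node's context; schemas and joins only when detail=True."""
    node = STORE.get(path)
    if node is None:
        # Deterministic miss: report the unknown address, never fuzzy-match.
        return {"error": f"unknown path: {path!r}"}
    tables = node.get("tables", {})
    result = {
        "path": path,
        "context": node["context"],
        "children": node["children"],
        "tables": {name: t["summary"] for name, t in tables.items()},
        "metrics": node.get("metrics", []),
    }
    if detail:
        result["schemas"] = {name: t["columns"] for name, t in tables.items()}
        result["joins"] = [j for t in tables.values() for j in t["joins"]]
    return result
```

A typical walk would call `explore()` for the root map, then `explore("revenue")`, and only then `explore("revenue", detail=True)` once the agent has decided that branch is relevant.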

That makes placement matter. A table in the wrong domain is difficult to discover; an unassigned one is effectively invisible. Put everything at the root, and every question receives almost everything, which is how you get context rot: the more loosely relevant material sits in the window, the worse the model reasons over what actually matters.
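Because an unassigned table is effectively invisible, a simple check can flag the gap before the agent runs into it. The sketch below assumes a list of warehouse tables and a mapping from domain path to assigned tables; both inputs and the `REFUNDS` table are hypothetical.

```python
def unassigned_tables(warehouse_tables, domain_assignments):
    """Tables that exist in the warehouse but sit under no domain path."""
    assigned = {t for tables in domain_assignments.values() for t in tables}
    return sorted(set(warehouse_tables) - assigned)


# Hypothetical inventory: REFUNDS was added to the warehouse but never placed.
warehouse = ["CUSTOMERS", "CUSTOMER_ACTIVITY", "INVOICES", "PAYMENTS", "REFUNDS"]
assignments = {
    "customers": ["CUSTOMERS", "CUSTOMER_ACTIVITY"],
    "revenue": ["INVOICES", "PAYMENTS"],
}

missing = unassigned_tables(warehouse, assignments)
```

This only catches tables with no home at all. A table placed in the wrong domain still needs a human review, since the check cannot tell which branch a question should reach it from.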

In one internal test, constraining the agent to explore the relevant branches instead of giving it a full search tool reduced context-token usage by 45%.

Walking the graph also gives a guarantee that free exploration does not. To reach a low-level domain, the agent has to pass through the path that leads to it, usually its parent branch, sometimes a link from another document. Whatever it pulls arrives...
