Beyond the Semantic Layer: Building a Context Layer for the Agentic Era


Simon Späti, Data Engineer & Technical Author

June 4, 2026 · Last reviewed June 4, 2026 · 16 min read
Topics: Context layer, Semantic layer, AI data agent

At a glance

A context layer puts your warehouse schema, joins, metric definitions, and business knowledge in one reviewable place so data agents query governed context instead of guessing field names. A look at how it works, and at ktx, the open-source context layer.


Metrics, schema, dashboard logic, and domain knowledge in one place. Your team reviews. Agents query.

Writing SQL was never the hard part. Making it accurate and trustworthy against your warehouse always was. Point an AI agent like Claude or Codex at your data stack and ask a real analytics question, and the answer is usually mediocre: the agent can scrape some context from your git repos or whatever metadata it can find, but it doesn't know your joins, your metric definitions, or the business rules that give a number its actual meaning.

So how do we make data agents reliable and accurate for database queries? Everyone talks about harnesses, evals, and context layers, but the real challenge is bringing them together with data engineering and the context you already have: database schemas, a semantic layer, and metric definitions, plus the business knowledge that normally never reaches the agent.

That's the question this post tackles: how agents can work with the data stack and analytics, and how a context layer fits in. We also take an inside look at ktx, a new context layer driven by agentic workers that reads not only from the usual sources but also from less obvious ones (Markdown, Notion, etc.).

The idea is to pull two kinds of knowledge into one reviewable place: the hard semantics (your warehouse schema, joins, and metric definitions as YAML and SQL) and the soft semantics (the business context living in docs, wikis, and Notion that agents usually never see). Both are committed to git and reviewed like code, so a human stays in the loop while agents get a warm start instead of a cold database connection. The payoff: more accurate answers with fewer (and cheaper) queries against the warehouse.

A context layer sits between your data sources and your agent. It ingests the warehouse, BI tools, Notion, and docs into hard semantics (YAML and SQL) and soft semantics (Markdown), all reviewed in git, then serves governed, accurate SQL to the agent.

[Figure: three stages, left to right. Sources: warehouse (schema, metrics), BI tools (dashboards, joins), Notion + wiki (soft semantics), docs + Markdown (notes, context). Context layer (auto-built, reviewed in git like code): hard semantics (YAML + SQL the warehouse runs) and soft semantics (Markdown the team reads). AI agent: accurate, governed SQL.]

A context layer ingests your warehouse, BI tools, and docs into hard and soft semantics, then serves governed SQL to your agent.
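To make the split between hard and soft semantics concrete, here is a minimal Python sketch of the idea. It is not ktx's actual format or API: the class names (`Metric`, `Note`, `ContextLayer`), the `context_for` method, the table and column names, and the naive keyword matching are all assumptions made up for illustration. It only shows how reviewed metric SQL and Markdown notes could sit in one structure and be handed to an agent together.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metric:
    """Hard semantics: a reviewed metric definition backed by SQL (hypothetical shape)."""
    name: str
    sql: str    # governed SQL expression, e.g. an aggregate
    table: str  # table the expression runs against


@dataclass(frozen=True)
class Note:
    """Soft semantics: business knowledge kept as Markdown (hypothetical shape)."""
    title: str
    markdown: str
    tags: tuple[str, ...] = ()


@dataclass
class ContextLayer:
    """Holds both kinds of knowledge in one place that could be committed to git."""
    metrics: dict[str, Metric] = field(default_factory=dict)
    notes: list[Note] = field(default_factory=list)

    def add_metric(self, metric: Metric) -> None:
        self.metrics[metric.name] = metric

    def add_note(self, note: Note) -> None:
        self.notes.append(note)

    def context_for(self, question: str) -> str:
        """Collect metrics and notes whose name or tags appear in the question.

        Plain substring matching is an assumption for brevity; a real system
        would likely use something more robust.
        """
        q = question.lower()
        lines = []
        for m in self.metrics.values():
            if m.name.replace("_", " ") in q:
                lines.append(f"metric {m.name} = {m.sql} FROM {m.table}")
        for n in self.notes:
            if any(tag.lower() in q for tag in n.tags):
                lines.append(f"note {n.title}: {n.markdown}")
        return "\n".join(lines)


if __name__ == "__main__":
    layer = ContextLayer()
    layer.add_metric(
        Metric("net_revenue", "SUM(amount) - SUM(refund_amount)", "analytics.orders")
    )
    layer.add_note(
        Note("Revenue rules", "Exclude test accounts from revenue.", tags=("revenue",))
    )
    # The agent receives governed SQL plus the business rule, not a cold connection.
    print(layer.context_for("What was net revenue last month?"))
```

The point of the design is that the agent never has to guess a field name: the metric SQL comes from a reviewed definition, and the business rule travels alongside it.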

Modeling with Analytics AI Agents, with a Context Layer

Every business or data analyst gets mediocre results when prompting Claude or Codex on their data stack. The agent might figure out some context by reading the git repos or other metadata it can find. Still, the hard part is teaching it the internals and business context that are unique to each business domain and company. So, how do we make AI agents for data and analytics reliable and accurate for database queries? Do we need to manually copy and paste Google Docs and Markdown files into the prompts, or can a context layer provide a more reliable, safe, and governed way to ask an AI assistant for analytics?

These days, it's much easier to ingest almost any number of new data sources with custom-built ETL data pipelines, whereas before we had to make hard decisions about what to include and what to leave out. An AI agent, such as Claude or Codex, can merge multiple data pipelines that access the source database via CLIs and destinations via MCP, API, or CLI. But we still need an API and a process for updating source data continuously, not just once. And we need to test and verify that the data is correct, potentially more than ever.

The challenge remains modeling the data in a way that represents what the business actually is, and making sure data flows fast but is also correct. Any AI assistant is only as good as the context we give it, and as how easily it can read and express context, metrics, models, and knowledge. So, does the context layer solve these problems?

The context layer primarily supports the accuracy of SQL queries, continuous updates to the business context, and governance. Additionally, with a newer context layer, we can include more relevant business insights that are stored internally, usually in...
