Where AI Agents Belong in Data Engineering: The Correctness Layer | Altimate AI
With models changing constantly and new, better ones arriving every few months, it helps not to rely on any one of them too heavily. The better your tooling, the less dependent you become on any single model. That's also why the deterministic harness matters: a correctness layer that lets you reproduce outputs and trace lineage regardless of which model you're running underneath. This is especially true when maintaining or extending a project, where verification is the real job.
The danger isn't only a crash or an error message, but a wrong number that breaks nothing. A query can look clean and run without errors, yet still introduce duplicated rows.
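A minimal sketch of such a silent failure, using Python's built-in sqlite3 module. The tables `orders` and `payments` and all values in them are hypothetical, invented for illustration. The join runs cleanly, but an order with two payments is counted twice, so the revenue figure is inflated without any error being raised.

```python
import sqlite3

# Hypothetical tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE payments (payment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    -- Order 1 was paid in two installments, so it has two payment rows.
    INSERT INTO payments VALUES (10, 1), (11, 1), (12, 2);
""")

# Correct revenue: sum over the orders table only.
true_revenue = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# A clean-looking query that joins before aggregating. The one-to-many join
# repeats order 1 once per payment, so its amount is summed twice.
joined_revenue = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders AS o
    JOIN payments AS p ON p.order_id = o.order_id
""").fetchone()[0]

print(true_revenue)    # 150.0
print(joined_revenue)  # 250.0
```

Both queries succeed, and nothing in the second one signals that the total is wrong. Only a check against the source table, or against the expected grain of the join, reveals the difference.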
In this article, we go through the three levels of AI agents in data engineering, how to structure projects so the AI delivers its best outcomes, and how dedicated agents with a deterministic core help us build higher-quality pipelines that we can actually trust. We also look at a practical example of how this works with a blast radius analysis.
The Three Levels of AI Agents in Data Engineering
Why should we use agents for data engineering? And at what levels can agents help us productively? LLMs will always have some error rate, as humans do, so we need a way to be more confident in the code they produce.
Chat-phase, Autonomous and Dedicated Tooling
Agents can help us at different levels, and each level gives a different degree of confidence in the result.
The initial chat phase: development where we prompt Claude or ChatGPT. The model tries to understand the context based on what it has access to. This consumes a fair number of tokens, as it needs to scan everything from scratch.
The autonomous approach, where Claude Code or Codex also has access to the tools humans have, mostly the CLI in the terminal. This makes it possible to query Postgres with psql, or read from S3 or Parquet with DuckDB, to verify queries and data. The outcome is of much higher quality.
Dedicated agents for the task at hand. For data, for example, the tools know dbt or can transpile SQL code deterministically, meaning not from training data alone but with an actual tool that does it much faster and more reliably. They offer built-in checks and features a "general" agent can't provide.
Showcasing the three levels of AI agents in data engineering
Ideally, we'd want to always use the dedicated tools, but there isn't always one.
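To illustrate the kind of fixed check meant here, the following is a minimal sketch in Python using the built-in sqlite3 module. The helper `duplicate_keys` and the `customer_orders` table are hypothetical, made up for this example, and do not represent the API of any particular agent or tool. The idea is that a fixed query, rather than the model's judgment, decides whether a declared key is still unique after an agent has changed a model.

```python
import sqlite3


def duplicate_keys(conn, table, key):
    """Return (key_value, row_count) pairs for key values that occur more than once.

    Hypothetical helper for illustration. `table` and `key` are interpolated
    into the SQL, so they must come from trusted project config, not user input.
    """
    query = (
        f'SELECT "{key}", COUNT(*) FROM "{table}" '
        f'GROUP BY "{key}" HAVING COUNT(*) > 1 ORDER BY "{key}"'
    )
    return conn.execute(query).fetchall()


# Hypothetical table with one duplicated order_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_orders (order_id INTEGER, customer TEXT);
    INSERT INTO customer_orders VALUES (1, 'a'), (2, 'b'), (2, 'b'), (3, 'c');
""")

dupes = duplicate_keys(conn, "customer_orders", "order_id")
print(dupes)  # [(2, 2)]
```

An empty list means the key is unique. Because the check is plain SQL, it gives the same answer regardless of which model produced the code under test.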
Where in the DE Lifecycle Each Level Actually Helps
BI dashboards, plumbing the data pipelines, creating source ingestions, or maintenance? For data engineering, the question is not only whether dedicated agent tooling exists, but also in which part of the data engineering lifecycle AI agents can help data engineers and analysts, and potentially even domain experts, the most.
Is it ingestion, ETL, understanding the business in great detail, or just visualizing the result? Should it cover maintenance, such as overnight ETL errors, or the full data lifecycle?
In general, and before we go into more detail later, agents can help across the full cycle, but it always depends on who you are and what role you play. Building from scratch with no knowledge or seniority is dangerous. Why? Because someone without that experience can't verify whether the produced code is correct. That's okay for a side project or a proof of concept, but not for actual production.
What's the Engineering Discipline for Working with AI?
There's also a less technical part: guiding the agents in the right direction. Especially if we want to use them safely in large projects or organizations, we can't just let them run without guidance.
For that we need:
a clear project structure in which the agents can flourish. The more structure is given, the fewer tokens are spent on working it out, and the more aligned the work will be across the project. (This is another reason a deterministic workflow such as uv init is best: the result is always the same.)
build with clear instructions (agentic skills, superpowers, etc.) on how the tools are used (basically providing CLIs and API documentation). This is the bulk of the work anyway. That's the data architecture, the brainstorming with fellow humans before you build something, instead of missing a key insight in the beginning and then letting the agent run down the wrong path. Also, be realistic: prompting "be correct" or "use state-of-the-art" won't make the output more correct or more state-of-the-art than what the model was trained on. So for a fairly new architecture, you must provide the relevant links and hints.
set up the project in a modular fashion, so that a small change by an agent cannot break the whole project and you don't end up in dependency hell, with everything depending on everything else.
use a declarative approach, with descriptive configuration that says the what and not the how, so that you can collaborate on these configs with the agents,...