ETLs in the Era of AI and Sandboxes

Agentic Airbyte | Visual Execution Tutorial

Flow Board: intent becomes evidence (step 1: plan)

Agentic Airbyte execution flow: a goal enters the AI harness, the harness calls Crabbox, Crabbox injects a credential profile into a worker, Airbyte moves data from source to target, and evidence returns to the harness.

Diagram nodes:

Goal: policy

AI harness: writes spec, chooses next action

Crabbox: lease + run, collect artifacts

Profile: scoped env

Worker: repo + env, Airbyte runs

Source: API / DB

Evidence: logs / JUnit / metrics / config

Target: warehouse

AI Harness: Reads the goal and writes a bounded spec. -> calls Crabbox

Crabbox: Leases a worker and injects the named profile. -> runs command

Worker + Airbyte: Reads the source and writes the target. The agent never sees rows. -> returns evidence

Evidence: Logs, metrics, JUnit, redacted config.

control = intent + command

credentials = profile -> env

data = source -> target

evidence = artifacts -> decision

Steps: 1 Plan, 2 Lease, 3 Inject, 4 Move, 5 Prove, 6 Repair

Owner: agent. The agent compresses intent into a job spec.

It reads the goal, repo state, schemas, previous evidence, and policy. It emits a bounded run. It does not move data.

Input: goal + policy + repo state

Output: job JSON + Crabbox command
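A minimal sketch of what the Plan step's output could look like, in Python. The field names follow the vocabulary used on this page (refs, profile, validation, retry, artifacts), but the schema Crabbox actually accepts is not shown here, so the `JobSpec` class, the `build_job_spec` helper, and every example value are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical bounded job spec; not the real Crabbox schema."""
    goal: str
    source_ref: str                      # a reference to the source, never its rows
    target_ref: str                      # a reference to the target
    profile: str                         # profile *name* only, never secret values
    validation: list[str] = field(default_factory=list)
    max_retries: int = 1
    artifact_globs: list[str] = field(default_factory=list)


def build_job_spec(goal: str) -> JobSpec:
    """Compress a goal into one bounded run (all values are illustrative)."""
    return JobSpec(
        goal=goal,
        source_ref="source.crm.accounts",
        target_ref="warehouse.accounts",
        profile="crm-readonly",
        validation=["row_count", "schema_drift", "freshness"],
        max_retries=2,
        artifact_globs=["logs/*.log", "reports/*.xml"],
    )


def to_json(spec: JobSpec) -> str:
    """Serialize the spec so it can be handed to the runner as a file."""
    return json.dumps(asdict(spec), indent=2, sort_keys=True)
```

The spec names a profile and references tables; it carries no credentials and no data, which is what keeps the run bounded.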

Run replay: one job, traced from request to repair.


00:00 Request: Sync CRM accounts into the warehouse before 08:00.

00:02 Spec: Agent writes refs, profile, validation, retry, artifacts.

00:05 Lease: Crabbox finds a warm worker and returns a run id.

00:31 Sync: Airbyte reads source.crm.accounts and writes target rows.

02:14 Validate: Worker emits counts, schema drift check, JUnit, config.

02:18 Decision: Agent sees one failing freshness check and emits a repair job.
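The last two rows of the replay can be sketched in Python: read a JUnit-style report, find the failing check, and emit a repair job. The sample XML, the check names, and the shape of the repair job are all assumptions made for this sketch; the source does not show the real report or job format.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical JUnit-style report a worker might emit after validation.
JUNIT_XML = """<testsuite name="accounts-sync" tests="3" failures="1">
  <testcase name="row_count_matches"/>
  <testcase name="schema_drift"/>
  <testcase name="freshness">
    <failure message="newest row is older than the allowed window"/>
  </testcase>
</testsuite>"""


def failing_checks(junit_xml: str) -> list[str]:
    """Return the names of test cases that contain a failure or error element."""
    root = ET.fromstring(junit_xml)
    failed = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(case.get("name", "unnamed"))
    return failed


def next_job(previous_job: dict, junit_xml: str) -> Optional[dict]:
    """Emit a repair job when checks failed, or None when the run is clean."""
    failed = failing_checks(junit_xml)
    if not failed:
        return None
    return {
        "kind": "repair",
        "repairs": previous_job["id"],
        "profile": previous_job["profile"],  # still a name, not a secret
        "failed_checks": failed,
    }
```

The agent decides from the report alone; it never needs to open the synced rows to learn that freshness failed.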

Phase: request. A human or a schedule gives intent.

The input is a business goal, not a connector config dump. The agent still has to choose the run shape.

Owner: human / schedule

Input: goal + deadline + policy

Output: bounded intent for the harness

Evidence: none yet

Mental model: everything is easier when each box owns one question.

Read the system as four contracts. Each box gets a narrow input, owns one decision, and emits a narrow output.

agent / driver: What should run next?

Turns intent and previous evidence into the next bounded job spec.

Reads: goal, repo, evidence. Writes: job spec, next action.

crabbox / runner: Where can it run safely?

Leases the sandbox, hydrates code, injects scoped env, captures outputs.

Reads: pool, command, profile. Writes: run id, artifacts.

airbyte / mover: How do rows move?

Runs the connector in the worker. Source and target rows never enter the model context.

Reads: source, target, env. Writes: target rows, sync status.

evidence / judge: What happened?

Converts logs and checks into finish, retry, repair, or alert.

Reads: logs, JUnit, metrics. Writes: decision, repair input.
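The judge contract can be sketched as one small function in Python. The four outcomes (finish, retry, repair, alert) come from this page; the `Evidence` fields and the rules that map evidence to an outcome are assumptions chosen for illustration, not a policy the source specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FINISH = "finish"
    RETRY = "retry"
    REPAIR = "repair"
    ALERT = "alert"


@dataclass(frozen=True)
class Evidence:
    """Hypothetical summary of a run's artifacts."""
    exit_code: int                  # exit status of the worker command
    failed_checks: tuple[str, ...]  # names of failing validation checks
    attempt: int                    # 1-based attempt number
    max_attempts: int               # retry budget from the job spec


def decide(ev: Evidence) -> Action:
    """Map evidence to one of four actions (illustrative rules only)."""
    if ev.exit_code == 0 and not ev.failed_checks:
        return Action.FINISH        # run completed and every check passed
    if ev.exit_code == 0:
        return Action.REPAIR        # run completed but validation failed
    if ev.attempt < ev.max_attempts:
        return Action.RETRY         # run crashed and retry budget remains
    return Action.ALERT             # run crashed and the budget is spent
```

Because the function only reads a narrow evidence record, the same inputs always produce the same decision, which makes a repair or retry reproducible.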

Owner: agent. The agent is the driver, not the pipe.

It can inspect repo state and evidence, then write the next bounded job. It should never carry production rows or raw secrets.

Can do: choose refs, profile, validation, retry, next action

Must not do: move rows, store secrets, sanitize artifacts

Intent -> Spec: Goal becomes refs, profile, retry policy, validation, artifacts.

Spec -> Run: Spec becomes a sandboxed command with a durable run id.

Profile -> Env: Profile name becomes scoped variables inside the worker only.

Source -> Target: Connector moves rows directly. The prompt never becomes the data plane.

Worker -> Evidence: Execution becomes logs, JUnit, metrics, counts, redacted config.

Evidence -> Action: Signals become finish, retry, repair, or alert.
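The Profile -> Env boundary in the list above can be sketched in Python with a child process standing in for the worker. The `PROFILES` table, the variable names, and the placeholder values are hypothetical; how Crabbox actually stores and injects profiles is not described in the source. The sketch assumes a Unix-like host where the interpreter can start with only `PATH` set.

```python
import os
import subprocess
import sys

# Hypothetical profile store. A real system would keep this outside the agent's reach.
PROFILES = {
    "crm-readonly": {"CRM_API_TOKEN": "placeholder-not-a-real-token"},
    "warehouse-writer": {"WAREHOUSE_PASSWORD": "placeholder-not-a-real-password"},
}


def scoped_env(profile_name: str) -> dict[str, str]:
    """Build a minimal environment: PATH plus only the named profile's variables."""
    return {"PATH": os.environ.get("PATH", ""), **PROFILES[profile_name]}


def worker_sees(profile_name: str, variable: str) -> bool:
    """Start a child process with the scoped env and ask whether a variable is set.

    Only a True/False answer crosses back to the caller, never the value itself.
    """
    probe = "import os, sys; print(sys.argv[1] in os.environ)"
    result = subprocess.run(
        [sys.executable, "-c", probe, variable],
        env=scoped_env(profile_name),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

The caller passes a profile name and gets back a boolean; the variables exist only in the child's environment, which mirrors the "inside the worker only" rule.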

Bad design: the agent becomes the integration runtime. Secrets, rows, logs, and retries collapse into one prompt loop.

Bounded design: the agent chooses experiments. Systems execute them. Each boundary has one owner, one input shape, and one evidence shape.

Red flags: the model context becomes the data plane.

The agent sees too much, owns too much, and cannot prove what happened. Debugging depends on an unstructured transcript.

Failure mode: secret leakage, copied rows, non-repeatable retries

Fix: put data movement back into a worker with scoped credentials and captured artifacts
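One part of that fix, emitting redacted config as evidence instead of raw config, can be sketched in Python. The list of sensitive key markers and the `[REDACTED]` placeholder are assumptions for this sketch; a real worker would use whatever redaction rules its connectors define.

```python
# Substrings that mark a config key as sensitive (illustrative, not exhaustive).
SENSITIVE_MARKERS = ("password", "secret", "token", "api_key", "private_key")


def redact(value, key: str = ""):
    """Return a copy of a nested config with sensitive values replaced.

    A value is replaced when the key it sits under contains a sensitive marker.
    The input is not modified.
    """
    if any(marker in key.lower() for marker in SENSITIVE_MARKERS):
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact(v, str(k)) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item, key) for item in value]
    return value
```

Running redaction inside the worker, before artifacts leave it, means the agent reads a config shape it can reason about without ever holding the credential.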

Runnable shape: three contracts.

A useful agent output is not prose. It is a spec contract, an execution handoff, and an evidence contract.

ai-agent-dispatch.sh

# Goal: sync CRM accounts into the warehouse safely.

crabbox pool ensure example-org/data-movement/main/provider/linux/etl \
  --min-ready 3 \
  --create -- \
  --cache-volume airbyte-etl

mkdir -p .crabbox/generated
cat > .crabbox/generated/accounts-sync.json ...
... --json
crabbox artifacts download --out evidence/

Job spec anatomy

Refs: source + target

Profile: scoped env

Proof: artifacts + tests

Spec Contract (plan)

Agent writes: refs, profile name, allowlists, validation, retry, artifact globs.

Agent never writes: raw secrets or copied rows.
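A spec contract like this can be checked before anything runs. The sketch below, in Python, treats the six items the agent writes as required keys and rejects keys that look like secrets or row data. The exact key names and the forbidden-word list are assumptions; the source does not publish the real schema.

```python
# Keys the agent is expected to write (names assumed from the list above).
REQUIRED_KEYS = ("refs", "profile", "allowlists", "validation", "retry", "artifacts")

# Key fragments that suggest the spec is carrying secrets or data.
FORBIDDEN_FRAGMENTS = ("secret", "password", "token", "rows")


def validate_spec(spec: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the spec passes."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in spec:
            problems.append(f"missing field: {key}")
    if "profile" in spec and not isinstance(spec["profile"], str):
        problems.append("profile must be a profile name, not an object")
    for key in spec:
        if any(fragment in key.lower() for fragment in FORBIDDEN_FRAGMENTS):
            problems.append(f"forbidden field: {key}")
    return problems
```

A runner could refuse to lease a worker whenever this list is non-empty, so a spec that embeds credentials never reaches execution.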

Execution Handoff lease

Crabbox...
