Secure AI Sandboxes on Kubernetes

By Mitos team · July 4, 2026 · 18 min read

Your agent just ran pip install on code it wrote for itself. Nobody read that code, and it has to execute somewhere: on a real machine, with real isolation. And agents brought a requirement that batch computing never had: they spawn subagents mid-task and merge the results back, so the machine itself has to fork where the work forks.

We build Mitos on Kubernetes, deliberately. We have spent years operating Kubernetes, and it is the de facto standard for running workloads at scale in the cloud: the scheduling, quotas, and network policy a sandbox fleet needs already exist there, hardened by a decade of production. Our bet is simple: scaling AI means scaling sandboxes, and sandboxes should run on infrastructure that already scales instead of a parallel stack you have to learn to trust. The catch is that Kubernetes’s unit of work, the pod, was designed for code you trust.

So this is a working answer to secure AI sandboxes on Kubernetes: what a pod actually gives you, which runtime holds against model-written code, what a fork of a running machine costs, and how we made microVMs behave like ordinary pods. By the end you should be able to pick a runtime for your own threat model and fan one warm machine out into thirty.

Model-written code is untrusted code

An LLM cannot tell your instructions from an attacker’s. Simon Willison’s lethal trifecta names the failure mode: private data, untrusted content, and a way to talk out. Combine the three and an injected instruction exfiltrates whatever the agent can read. Give the agent a shell and the injection becomes code execution.
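The trifecta is easy to turn into a pre-flight check. What follows is a minimal sketch, not anything Mitos ships: `AgentCapabilities` and its three field names are hypothetical, chosen to mirror the three conditions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what one agent session is allowed to do."""
    reads_private_data: bool          # secrets, customer data, source code
    ingests_untrusted_content: bool   # web pages, issues, emails, tool output
    can_communicate_externally: bool  # outbound network, email, webhooks


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three conditions hold at the same time."""
    return (
        caps.reads_private_data
        and caps.ingests_untrusted_content
        and caps.can_communicate_externally
    )


if __name__ == "__main__":
    risky = AgentCapabilities(True, True, True)
    offline = AgentCapabilities(True, True, False)
    print("risky:", has_lethal_trifecta(risky))
    print("offline:", has_lethal_trifecta(offline))
```

Removing any one leg breaks the combination, which is why cutting off egress is such a common default.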

You do not have to take this on faith. Trail of Bits showed in October 2025 that command allowlisting without a sandbox fails: attackers smuggle flags into pre-approved commands like go test -exec or fd -x and get execution from one poisoned prompt. In July 2025 a Replit agent deleted a production database during a code freeze. A month later the Nx supply-chain attack (s1ngularity) flipped the direction: malicious npm packages invoked whatever AI CLIs were installed and used them to hunt credentials. GitGuardian counted 2,349 secrets from 1,079 repositories.
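To see why the allowlist fails, here is a toy version of the check. This is a sketch under assumptions: the `ALLOWED` set and the `EXEC_FLAGS` table are illustrative and deliberately incomplete, and neither is taken from the Trail of Bits write-up.

```python
import shlex

# Hypothetical allowlist: commands an agent may run without asking.
ALLOWED = {"go", "fd", "ls", "cat"}

# Flags that make an otherwise benign tool launch another program.
# Illustrative only; a real tool has more of these than any table lists.
EXEC_FLAGS = {
    "go": {"-exec", "--exec"},
    "fd": {"-x", "--exec", "-X", "--exec-batch"},
}


def naive_allow(command: str) -> bool:
    """Approve a command by looking only at the program name."""
    argv = shlex.split(command)
    return bool(argv) and argv[0] in ALLOWED


def smuggles_exec(command: str) -> bool:
    """Detect a known exec-style flag hidden behind an approved program."""
    argv = shlex.split(command)
    if not argv:
        return False
    flags = EXEC_FLAGS.get(argv[0], set())
    # Handle both "-exec prog" and "-exec=prog" spellings.
    return any(arg.split("=", 1)[0] in flags for arg in argv[1:])


if __name__ == "__main__":
    cmd = "go test -exec 'sh -c id' ./..."
    print("allowlist approves:", naive_allow(cmd))
    print("runs another program:", smuggles_exec(cmd))
```

The second function is a denylist, and a denylist only covers the flags someone remembered to write down. A sandbox does not depend on that list being complete.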

The tool vendors already conceded the point. Anthropic ships OS-level sandboxing for Claude Code and reports it cut permission prompts by 84 percent. OpenAI’s Codex runs with network off by default.

Those sandboxes protect one laptop. Host agents for other people, or fan one agent out into fifty, and you are running a hostile multi-tenant service. The question stops being whether to sandbox. It becomes: which boundary are you willing to rent out?

A pod is a view of your kernel

The default answer on Kubernetes is a pod per agent, so start there and look at what you actually get.

Kubernetes does not run containers. The kubelet calls a runtime over the Container Runtime Interface, usually containerd. containerd starts a shim per pod. The shim invokes runc, and runc builds the thing we call a container: namespaces for the restricted view, cgroups for limits, seccomp to trim syscalls, then execve into your process.

Notice what never happened in that chain. Nothing put a wall between your process and the kernel. The process sees less, but it still talks to the same kernel as every other pod on the node, through an interface of several hundred syscalls. A namespace is a view, not a wall.
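You can check the "view, not wall" claim from inside any Linux process. The sketch below is Linux-only. It reads the namespace links under `/proc/<pid>/ns` and prints the kernel release, and the helper names `read_namespaces` and `shared_namespaces` are ours. Run it on the node and inside a pod on that node: the namespace identifiers should differ, while the kernel release should be the same.

```python
import os
from pathlib import Path


def read_namespaces(pid="self", proc_root="/proc"):
    """Return {namespace_name: identifier} for one process.

    Each entry under /proc/<pid>/ns is a symlink whose target looks like
    'pid:[4026531836]'; two processes share a namespace when the targets match.
    """
    ns_dir = Path(proc_root) / str(pid) / "ns"
    return {entry.name: os.readlink(entry) for entry in sorted(ns_dir.iterdir())}


def shared_namespaces(a, b):
    """Names of the namespaces two processes have in common."""
    return sorted(name for name in a.keys() & b.keys() if a[name] == b[name])


if __name__ == "__main__":
    # The namespaces are the restricted view this process was given.
    for name, ident in read_namespaces().items():
        print(f"{name:>18}  {ident}")
    # The kernel is not part of that view: it is the host's kernel.
    print("kernel release:", os.uname().release)
```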

Figure: The same CRI chain, two boundaries: runc hands the agent a view of the host kernel; Firecracker hands it a kernel of its own.

That interface has a record. CVE-2019-5736 let a container overwrite the host runc binary through /proc/self/exe. CVE-2024-21626, the Leaky Vessels file-descriptor leak, scored 8.6 on CVSS and allowed escape to the host filesystem. In November 2025, three more runc escapes landed on the same day, all breaking out through procfs writes.

None of this makes containers bad engineering. For code you wrote and reviewed, a hardened container is a reasonable boundary, and most of the internet runs on one. The Kubernetes multi-tenancy docs are honest here too: isolation is a spectrum, and once tenants stop trusting each other, the docs point you at sandboxed runtimes.

Agent code fails the trust test by construction. You did not write it. You did not review it. And an attacker may have steered the model that did.

gVisor, Kata, Firecracker: three ways to raise the wall

Kubernetes has a pluggable seam for exactly this problem. A RuntimeClass maps spec.runtimeClassName to a different shim, so the same pod spec can land on a different isolation mechanism. Three runtimes matter.
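Before looking at each runtime, here is what that seam looks like as objects. This is a sketch that builds the two manifests as plain dictionaries and prints them as JSON, which `kubectl apply -f -` accepts. The handler name `runsc`, the class name `gvisor`, the pod name, and the image are assumptions for illustration. The handler must match whatever name the node's CRI configuration registers for that runtime.

```python
import json


def runtime_class(name: str, handler: str) -> dict:
    """A RuntimeClass maps a cluster-level name to a node-level CRI handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,  # assumption: must match the node's CRI config
    }


def sandbox_pod(name: str, image: str, runtime_class_name: str) -> dict:
    """An ordinary pod spec; only runtimeClassName selects the isolation."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": "agent", "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names; substitute the ones your cluster uses.
    manifests = [
        runtime_class("gvisor", "runsc"),
        sandbox_pod("agent-1", "example.invalid/agent:latest", "gvisor"),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

Swapping `"gvisor"` for another class name is the only change the pod spec needs, which is what makes the runtime a deployment decision rather than a rewrite.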

gVisor puts a user-space kernel between your workload and the real one. Syscalls get intercepted and re-implemented in Go; the design goal, in gVisor’s own words, is that no syscall passes through directly. It starts in tens of milliseconds, and it backs GKE Sandbox....
