PySpell — sandboxed Rust/Python expressions, live on ESP32
Try it live — the model runs in your browser
No install, no device, no cloud. The same ~0.5 MB model + sandbox that runs on the ESP32, loaded here as WebAssembly. Type plain English; it writes Python and runs it.
Runs 100% in your browser. Want it reachable from your phone? Use Connect Tailscale below to turn this tab into a private node on your own tailnet. How it works →
🔗 Connect Tailscale — make this tab a private node on your own tailnet (reachable from your phone)
Opt-in — nothing connects until you click. Our Rust Tailscale client (~321 kB WASM) joins your tailnet over WebSocket; node keys stay in this browser. Then open http:/// from any device on your tailnet to get this same demo, over the tunnel.
Pointers worth following:
• Repo: https://github.com/punnerud/pyspell
• Deep-dive: https://github.com/punnerud/pyspell/blob/main/tech.md
• 512 kB memory: https://github.com/punnerud/pyspell/blob/main/docs/memory-512kb.md
Capabilities: no_std+alloc PySpell evaluator (Rust & Python subsets, deny-by-default sandbox), a ~0.45 M-param on-device model (English→Python), a full Tailscale node (control + DERP) on a 512 kB ESP32-S3, a native MCP server, a REST /run API, and the same Tailscale client compiled to WASM so a browser tab can join the user's own tailnet.
Documentation, examples & how it works
language reference · the offline AI agent · the 512 kB tricks · sandbox & limits (everything, tucked away to keep the demo clean)
The chip serves a web agent IDE, a native MCP server, and a REST API, runs the code in a sandbox (with live web-request / fetch support), and drives its own screen + LED, with up to 8 parallel PySpell processes on that same half-megabyte of RAM. Like MicroPython, but with two syntaxes; the parser never ships to the device, and English is the third syntax.
no_std + alloc core
Rust & Python front-ends
~62 kB on ESP32
deny-by-default sandbox
live over Tailscale
512 kB SRAM · no PSRAM
0.45 M-param model, in-browser
offline AI agent
What it is
A PySpell program is a single expression (Python) or some let bindings followed by a trailing expression (Rust). It evaluates to a value: a number, a boolean, a string, or a list. Free identifiers are resolved at evaluation time against a host-supplied environment: CLI variables on a laptop, or live device readings on a microcontroller. The only I/O is a host-granted, allowlisted fetch_json; there are no loops, functions, or imports, and that is the point: small, fast, and safe to accept from elsewhere.
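As a rough illustration of resolving free identifiers against a host-supplied environment, here is a minimal sketch built on Python's standard `ast` module. It is not PySpell's evaluator: the operator subset, the `evaluate` helper, and the `readings` dictionary are all assumptions made for this example.

```python
import ast
import operator

# Operators this sketch understands (an illustrative subset, not PySpell's grammar).
BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
CMP_OPS = {
    ast.Lt: operator.lt,
    ast.Gt: operator.gt,
    ast.Eq: operator.eq,
}


def evaluate(source, env):
    """Evaluate one expression; free identifiers come from the host-supplied env."""
    tree = ast.parse(source, mode="eval")  # source -> AST
    return _eval(tree.body, env)           # AST -> value


def _eval(node, env):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float, bool, str)):
        return node.value
    if isinstance(node, ast.Name):
        # Identifiers are looked up at evaluation time, never bound in the source.
        if node.id not in env:
            raise NameError(f"unknown identifier: {node.id}")
        return env[node.id]
    if isinstance(node, ast.List):
        return [_eval(item, env) for item in node.elts]
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        return BIN_OPS[type(node.op)](_eval(node.left, env), _eval(node.right, env))
    if isinstance(node, ast.Compare) and len(node.ops) == 1 and type(node.ops[0]) in CMP_OPS:
        left = _eval(node.left, env)
        right = _eval(node.comparators[0], env)
        return CMP_OPS[type(node.ops[0])](left, right)
    # Anything not matched above (calls, attributes, lambdas, ...) is refused.
    raise ValueError(f"construct not allowed: {type(node).__name__}")


# Hypothetical device readings supplied by the host.
readings = {"temp_c": 21.5, "threshold": 20}
print(evaluate("temp_c > threshold", readings))           # True
print(evaluate("[temp_c * 2, threshold + 1]", readings))  # [43.0, 21]
```

The same expression gives a different answer on a different host simply because `env` differs; the source text never changes.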
"Micro-containers": the direction, honestly stated. The aim is lightweight, pushable units of code on tiny devices. Today it's a sandboxed evaluator, not OS containers: the sandbox is at the language level (deny-by-default grammar + an instruction budget), jobs share one device, and it runs a safe Python/Rust subset, not full Python. Truly parallel, isolated containers need more RAM than the ESP32-S3 has (no PSRAM). So: a small, safe evaluator as the first step toward the micro-container vision.
Two ways to compile. On the host, full-fidelity front-ends use syn (Rust) and rustpython-parser (Python). For "type code in a browser and run it on the chip", a tiny hand-written parser (a few kB, no_std) builds the same AST on the device. Either way: source → AST → evaluate.
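The language-level sandbox mentioned above (a deny-by-default grammar plus an instruction budget) fits naturally into the source → AST → evaluate pipeline. The sketch below is an assumption-laden illustration in Python, not the real Rust evaluator: the `Evaluator` class, the `BudgetExceeded` exception, and the choice to charge one unit per AST node visited are all hypothetical.

```python
import ast
import operator

# Deny-by-default: only these operators are accepted (illustrative subset).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class BudgetExceeded(Exception):
    """Raised when evaluation needs more steps than the host granted."""


class Evaluator:
    def __init__(self, budget):
        self.remaining = budget  # steps the host is willing to pay for

    def run(self, source):
        return self.eval(ast.parse(source, mode="eval").body)

    def eval(self, node):
        # Charge one unit for every node visited during evaluation.
        self.remaining -= 1
        if self.remaining < 0:
            raise BudgetExceeded("instruction budget exhausted")
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](self.eval(node.left), self.eval(node.right))
        # Everything not explicitly allowed above is refused.
        raise ValueError(f"denied: {type(node).__name__}")


print(Evaluator(budget=10).run("1 + 2 * 3"))  # 7
```

Running the same expression with `budget=3` raises `BudgetExceeded`, and any construct outside the allowlist (a call, an attribute access) raises `ValueError` before it can do anything.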
An offline AI coding agent, served off the chip
Open http:/// over the tunnel and you get a Cursor-like agent. Type "flash the light", "show the text "hello"", "what is 7 plus 5", or "reverse the word robot", and a ~0.45 M-parameter language model […]
A model that small is only useful because of a chain of tricks; the full write-up is in tech.md. The headlines:
The model points, the browser copies
A 0.45 M model can't reliably copy arbitrary tokens (numbers, strings, lists), so it isn't asked to. It emits tiny semantic directives; the browser copies the literal content verbatim. For example, "calculate 3 + 2" → print(3 + 2), and "change add to subtract" → @@ + ==> -. Quoted text is literal content: copied byte-for-byte, excluded from vocab checks.
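To make the split between "pointing" and "copying" concrete, here is a small sketch of what the browser side could look like. Only the `@@ OLD ==> NEW` edit form comes from the text above; the `<lit>` slot marker and both helper functions are hypothetical, and the real directive vocabulary is described in tech.md.

```python
import re

SLOT = "<lit>"  # hypothetical slot marker, invented for this example


def fill_slots(template, literals):
    """Copy each literal verbatim into the next slot of the model's template."""
    parts = template.split(SLOT)
    if len(parts) - 1 != len(literals):
        raise ValueError("slot count does not match literal count")
    out = parts[0]
    for literal, tail in zip(literals, parts[1:]):
        out += literal + tail
    return out


def apply_edit(code, directive):
    """Apply an edit directive of the form '@@ OLD ==> NEW' to existing code."""
    match = re.fullmatch(r"@@\s*(.+?)\s*==>\s*(.+?)\s*", directive)
    if match is None:
        raise ValueError(f"not an edit directive: {directive!r}")
    old, new = match.groups()
    if old not in code:
        raise ValueError(f"{old!r} not found in code")
    return code.replace(old, new, 1)  # only the first occurrence is rewritten


# The model only chooses the shape; the literal "3 + 2" is copied from the request.
code = fill_slots("print(<lit>)", ["3 + 2"])
print(code)                            # print(3 + 2)
print(apply_edit(code, "@@ + ==> -"))  # print(3 - 2)
```

The model never has to reproduce `3 + 2` token by token, which is the part a model this small would get wrong.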
The device serves; the browser computes
Inference runs in WebAssembly, client-side. The 0.5 MB model image streams off flash a TCP segment at a time (HTTP Range) and is never resident in the chip's ~60 kB heap. Inverted edge inference: the constrained device serves and grades, the browser runs the model.
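The streaming idea can be sketched from the client's point of view: request the image in small byte ranges and reassemble it, so the server never has to hold more than one chunk. This is an illustration only; the 1460-byte chunk size (a common TCP payload size on Ethernet), the `fake_fetch` stand-in for the device, and the helper names are assumptions.

```python
def range_headers(total_size, chunk_size):
    """Yield one Range header value per chunk, covering bytes 0..total_size-1."""
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # Range ends are inclusive
        yield f"bytes={start}-{end}"


def download(fetch, total_size, chunk_size):
    """Reassemble a file from ranged requests; fetch(header) returns that slice."""
    return b"".join(fetch(header) for header in range_headers(total_size, chunk_size))


# Stand-in for the device: serves slices of an in-memory image.
image = bytes(range(256)) * 8  # 2048 bytes of fake model data


def fake_fetch(header):
    start, end = map(int, header[len("bytes="):].split("-"))
    return image[start:end + 1]


print(list(range_headers(len(image), 1460)))  # ['bytes=0-1459', 'bytes=1460-2047']
print(download(fake_fetch, len(image), 1460) == image)  # True
```

In a real client, `fetch` would issue an HTTP GET with the `Range` header set to each value and return the body of the 206 response.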
Frozen embeddings, distilled
The 512-token vocab is embedded with all-MiniLM (22 M params), PCA'd to 128 dims, folded with a part-of-speech vector, and frozen; the tiny model starts...