Show HN: Kintsugi – a local-first safety net for AI agents and humans

AI coding agents now run real shell commands on your machine: rm -rf, git push --force, DROP TABLE, dd, writes straight to disk. Almost always that's fine. The one time it isn't (a hallucinated path, a prompt-injected instruction, a confident wrong guess) there's no undo and you find out after.

Kintsugi sits between the agent and your system. It catches the dangerous command before it runs, explains it in one plain sentence, makes destructive actions reversible with a snapshot, and writes every command every agent ran to an append-only, hash-chained log you own. Local-first: no cloud, no account, nothing leaves the machine.

It's not only for AI. A passive bash/zsh recorder (no agent involved) puts every command a person runs on the same tamper-evident log and snapshots the destructive ones just-in-time, so `kintsugi undo` rolls back a DBA's fat-fingered rm -rf or clobbering overwrite the same way it rolls back an agent's. On a managed host you can seal the settings behind an admin password, enforced daemon-side with brute-force lockout, so an agent or a normal user can't quietly turn it off.

The design rule I cared most about: the decision to block a catastrophic command is made by deterministic rules a human wrote, never an LLM. A local model can only explain a command and add caution to the ambiguous middle; it can never unlock or downgrade a rule-based block. So the block is predictable and can't be talked out of by a clever prompt.

A few things that turned out to matter:

- It parses real shell structure, not text. It runs two passes (a fast tokenizer and a true bash AST parser: brush-parser, pure Rust) and takes the more cautious verdict. That catches commands hidden inside $(...), here-docs, subshells, and if/for/while blocks, which substring scanners wave through. echo $(rm -rf /) is caught.

- It fails toward caution. A line the parser can't fully understand is held, never assumed safe. The hard invariant, enforced by a golden corpus, is zero catastrophic-classified-as-safe.

- It works at the process/PATH layer, not as a per-tool plugin. Native pre-tool hooks for Claude Code, Cursor, Codex, Qwen, Gemini, Copilot, OpenCode, plus a $PATH shim and an MCP server for everything else, including a raw bash script or a Makefile. `kintsugi init` wires them in one command.

I want to be honest about the guarantee, because a lot of tools in this space oversell. Kintsugi is a seatbelt, not a kernel firewall. Hooks are an interception layer: an agent in a yolo/auto-approve mode, or a process that calls a binary by absolute path, can bypass them. That's exactly why there's a filesystem-watcher backstop: the promise is "nothing is unrecoverable", NOT "nothing runs un-warned". And the admin lock defeats an agent or a non-root user; it does not stop root. It guards against mistakes, not a malicious same-user process with root.

I ran an adversarial assessment against it: 0/176 dangerous commands leaked to safe across a MITRE ATT&CK + GTFOBins corpus, 1.4M fuzz inputs (which surfaced one real heap-DoS, now fixed, and no crashes since), and zero unsafe blocks. Every figure is reproduced by a committed test.

It's Rust, MIT, cross-platform (macOS/Linux/Windows). Install is one line and it works immediately with no model; an optional local GGUF just sharpens the plain-English summaries:

  curl -fsSL https://github.com/arrowassassin/kintsugi/releases/latest/download/install.sh | sh
  # or, from source:
  cargo install kintsugi

This is an early release and I'd genuinely like to be told where the model is wrong: both false alarms on safe commands and, more importantly, any catastrophic command that slips through. Happy to answer anything about the rule engine, the AST approach, or the threat model.
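
For readers who want the "more cautious verdict wins" and "fails toward caution" rules in concrete form, here is a minimal Rust sketch. It is an illustration only, not Kintsugi's actual code: the `Verdict` enum and its three levels, the `PassResult` alias, and the `cautious` and `combine` functions are all hypothetical names made up for this example. It assumes each pass either returns a verdict or reports that it could not understand the line.

```rust
/// Hypothetical verdict levels, declared from least to most cautious.
/// Deriving `Ord` orders variants by declaration, so `Block > Hold > Safe`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Verdict {
    Safe,
    Hold,
    Block,
}

/// Assumption: a pass either classifies the line or fails to parse it.
type PassResult = Result<Verdict, String>;

/// Fail toward caution: a line a pass could not understand is held,
/// never assumed safe.
fn cautious(result: PassResult) -> Verdict {
    result.unwrap_or(Verdict::Hold)
}

/// Combine the two passes by taking the more cautious verdict.
fn combine(fast_pass: PassResult, ast_pass: PassResult) -> Verdict {
    cautious(fast_pass).max(cautious(ast_pass))
}

fn main() {
    // The tokenizer sees a harmless `echo`; the AST pass finds `rm -rf /`
    // inside a command substitution. The stricter verdict wins.
    let hidden = combine(Ok(Verdict::Safe), Ok(Verdict::Block));
    println!("{:?}", hidden);

    // One pass fails to parse the line, so the result is at least Hold.
    let unparsed = combine(Ok(Verdict::Safe), Err("unterminated here-doc".to_string()));
    println!("{:?}", unparsed);
}
```

The point of ordering the enum is that combining verdicts can only ever move toward caution: no input to `combine` can lower a `Block` to something weaker.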
