How We Reined In AI Agents With pre-commit · Merrilin.ai Blog
A lot of teams now seem to rely on AI code review bots to review code written by AI. That does work,<br>to a point, but there is something slightly absurd about paying for a second model to inspect the<br>output of the first when older tools like pre-commit<br>already did a good job of catching many classes of mistakes early, locally, and without expensive<br>licenses. We do use CodeRabbit on our pull requests, but we also wanted to catch code smells much<br>earlier: in pre-commit hooks on developer machines, and again in GitHub pipelines before bad<br>patterns have a chance to settle in.<br>So we added a guard in Merrilin to rein in exactly that behavior. It uses tree-sitter to reject<br>fragile error-handling patterns before agent-written code gets committed. It does not just search<br>for HTTPException or catch. It parses Python, TypeScript, and TSX into syntax trees and asks<br>much more interesting questions:<br>Did someone raise a raw HTTPException instead of a typed API error?<br>Did they catch a database exception and then keep going without a rollback or savepoint?<br>Did they write reader progress in a critical path without a visible recovery boundary?<br>Did frontend code read error.response.data.detail directly instead of using the shared<br>normalizer?<br>Did someone add a .catch() that only logs to console and silently swallows the failure?<br>There is a certain wonder in watching a machine produce so much working code so quickly. There is<br>also a certain exhaustion in watching it rediscover the exact same bad ideas at scale. This guard<br>has been one of the most effective ways we have found to keep AI-assisted development inside the<br>boundaries of our system design.<br>Why we built it<br>Merrilin is an AI reading companion. That means we have one non-negotiable invariant:<br>do not break reading ever.
Optional systems can fail: AI can fail, telemetry can fail, sync can fail, analytics can fail. But<br>opening a book, turning a page, saving progress locally, and resuming offline still need to work.<br>The problem is that AI agents are fantastic at reproducing tiny error-handling shortcuts that look<br>harmless in isolation:<br>raise HTTPException(...)<br>except Exception: logger.exception(...)<br>except IntegrityError: pass<br>catch (error) { console.error(error) }<br>throw new Error("something went wrong")<br>Individually, these are easy to rationalize. Collectively, they create outages, inconsistent client<br>behavior, poisoned transactions, and impossible-to-centralize error semantics, and they are exactly<br>the kind of thing an AI agent will keep doing unless you give it a hard boundary. We wanted<br>something stricter than prompt instructions and more precise than regexes, so we put the rules in<br>code.<br>Where it lives<br>The error-handling guard matters most here, but it sits inside a broader pre-commit setup that gives<br>AI-generated diffs fewer places to hide. Our .pre-commit-config.yaml also enforces:<br>protected branch safety with no-commit-to-branch<br>basic hygiene checks like JSON/YAML/TOML validation, merge-conflict detection, symlink checks, AST<br>validation, private-key detection, whitespace cleanup, and line-ending normalization<br>Conventional Commit messages at commit-msg time<br>Alembic migration naming and single-head checks<br>backend Ruff linting and formatting<br>Prettier for JS, TS, and Markdown<br>ESLint for the web and admin apps<br>That matters because AI agents rarely fail in only one dimension. 
The same model that invents a<br>broad except Exception will also happily leave formatting drift, hand-write invalid Alembic<br>migrations, or produce inconsistent commit messages unless your repo pushes back.<br>The error-handling guard itself is wired into .pre-commit-config.yaml:<br>- id: error-handling-patterns<br>  name: Guard centralized error handling patterns<br>  entry:<br>    uv run --project backend --extra dev python scripts/check_error_handling_patterns.py<br>    --changed-lines<br>  language: system<br>  files: ^(backend/app/.*\.py|apps/(web|mobile|admin)/src/.*\.(ts|tsx))$<br>  stages: [pre-commit]
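To make that hook a little less abstract, here is a minimal sketch of the kind of syntax-tree check such a script could run. It is not our actual guard: the real one uses tree-sitter and also covers TypeScript and TSX, while this sketch swaps in Python's standard-library ast module so it runs with no dependencies. The function name find_violations and the rule labels are hypothetical, and the sketch ignores the --changed-lines filtering entirely.

```python
import ast

# Hypothetical rule labels, invented for this sketch.
RAW_HTTP_EXCEPTION = "raw-http-exception"
SWALLOWED_EXCEPTION = "swallowed-exception"

SAMPLE = '''
def save(session, row):
    try:
        session.add(row)
    except IntegrityError:
        pass
    raise HTTPException(status_code=500)
'''


def find_violations(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, rule) pairs for two fragile patterns in Python source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Pattern 1: `raise HTTPException(...)` instead of a typed API error.
        # Handles both `HTTPException(...)` and `fastapi.HTTPException(...)`.
        if isinstance(node, ast.Raise) and isinstance(node.exc, ast.Call):
            func = node.exc.func
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name == "HTTPException":
                violations.append((node.lineno, RAW_HTTP_EXCEPTION))
        # Pattern 2: an except block whose entire body is `pass`.
        if isinstance(node, ast.ExceptHandler):
            if all(isinstance(stmt, ast.Pass) for stmt in node.body):
                violations.append((node.lineno, SWALLOWED_EXCEPTION))
    return sorted(violations)


if __name__ == "__main__":
    for line, rule in find_violations(SAMPLE):
        print(f"line {line}: {rule}")
```

The point of working on a syntax tree rather than on text is that the check asks about structure (what is raised, what an except block actually does), so a comment or string that merely mentions HTTPException does not trigger it.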
Taken together, these hooks keep the codebase clean, the schema history sane, and the commit history<br>legible, while the tree-sitter guard keeps the agent from reintroducing reliability bugs you already<br>learned not to ship.<br>The Alembic hooks deserve special mention. Since Feb 7, we have landed roughly 331 PRs in this repo.<br>In that same stretch, we have touched backend/alembic/versions in 93 commits, and the folder<br>currently contains 81 migration files. There is something exhilarating about shipping that fast.<br>There is also the quiet fatigue of realizing your migration history has become a place where humans<br>and agents can both make a mess with full confidence.<br>In that kind of environment, AI agents start getting overconfident. They see a pile of migrations,<br>infer a human numbering convention that was never really meant to be one, and begin writing new<br>files by hand with integer prefixes like...