Show HN: Mirrors – test AI agent changes by replaying real production traces

aisinghal2 pts0 comments

Mirrors - Test AI Agents Against a Mirror of Production

Test agents against a real mirror of production.
Mirrors turns your production traces into an isolated copy of your agent's environment. Run a change against it and see what breaks before your users do.

Connect Mirrors to your agent. No API keys, just sign in via browser.
Clients listed: Claude Code, Cursor, VS Code, ChatGPT, Codex, Claude / Desktop

Run in your terminal:
claude mcp add --transport http mirrors https://api.runmirrors.com/mcp
Then run /mcp → mirrors → Authenticate via browser.

100% reproducible prod calls to a live mirror

airline-support mirror
Cancel reservation HQ8ML2 and refund to th…
What’s the baggage allowance on my SFO → E…
Rebook me onto the next morning departure.

get_reservation_details · provided · 38ms · 100%
get_user_details · mirror db · 22ms · 98%
cancel_reservation · provided · 41ms · 100%
issue_refund · llm-sim · 67ms · 91%

AGENT: Done. Reservation HQ8ML2 (SFO → EWR) is cancelled and a $420.00 refund is on its way to your card ending 4242. As a Gold member there’s no cancellation fee. Anything else?

get_reservation_details · provided · 35ms · 100%
get_user_details · mirror db · 24ms · 98%
get_baggage_policy · llm-sim · 58ms · 90%

AGENT: On your SFO → EWR flight (economy), your Gold membership includes 2 checked bags up to 50 lb each, plus a carry-on and a personal item. Extra checked bags are $40 each. Want me to add one?

get_reservation_details · provided · 37ms · 100%
search_flights · mirror db · 44ms · 97%
rebook_reservation · provided · 52ms · 100%

AGENT: Rebooked HQ8ML2 onto UA478, departing SFO at 8:05 AM tomorrow (was today’s 6:40 PM). No fare difference on your Gold fare. Your new confirmation is UA478. Anything else?

mirror world · seed 730412 · #a3f1c9e2

reservation  flight     status
HQ8ML2       SFO → EWR  confirmed
RT41KP       JFK → LAX  confirmed
ZX9QW0       ORD → SEA  cancelled

The same seed and prompt produce a byte-identical world. Every tool call replays against this deterministic mirror, not production.
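To make the determinism claim concrete, here is a minimal sketch of seeded world generation. It is not how Mirrors builds a world; `build_world` and `fingerprint` are hypothetical names, and the toy reservation schema is an assumption loosely modelled on the table above. The point is only that deriving all randomness from the seed and prompt, then serializing canonically, yields identical bytes on every run.

```python
import hashlib
import json
import random


def build_world(seed: int, prompt: str, rows: int = 3) -> bytes:
    """Generate a toy reservation table deterministically from (seed, prompt)."""
    # Mix seed and prompt into one RNG seed, so changing either changes the world.
    digest = hashlib.sha256(f"{seed}:{prompt}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))

    airports = ["SFO", "EWR", "JFK", "LAX", "ORD", "SEA"]
    alphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ0123456789"
    world = []
    for _ in range(rows):
        origin, dest = rng.sample(airports, 2)
        world.append({
            "reservation": "".join(rng.choice(alphabet) for _ in range(6)),
            "flight": f"{origin} -> {dest}",
            "status": rng.choice(["confirmed", "cancelled"]),
        })
    # Canonical JSON (sorted keys, fixed separators) keeps the byte output stable.
    return json.dumps(world, sort_keys=True, separators=(",", ":")).encode("utf-8")


def fingerprint(world: bytes) -> str:
    """Short content hash, similar in spirit to the '#a3f1c9e2' tag (assumed)."""
    return hashlib.sha256(world).hexdigest()[:8]


if __name__ == "__main__":
    prompt = "Rebook me onto the next morning departure."
    first = build_world(730412, prompt)
    second = build_world(730412, prompt)
    print(first == second, fingerprint(first))
```

Comparing fingerprints rather than full payloads is a cheap way to check that two runs really saw the same world.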

How it works
Traces in, a runnable mirror out

01 Ingest traces
Drop in production traces from your ADK or observability platform. Mirrors finds the entities, rebuilds the schema, and discovers every tool.

02 Build the mirror
You get an isolated, runnable copy of your prod environment: a seeded database plus bound tools, each scored for how closely it matches the real traces.

03 Run and evaluate
Replay agents against the same world every time. Measure accuracy, catch regressions, and ship with confidence. Production is never touched.
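The first two steps can be sketched in a few lines, under heavy assumptions. The trace format, `discover_tools`, `score_tool`, and the exact-match scoring rule below are all hypothetical illustrations, not Mirrors' actual schema or fidelity metric, and the trace data (including the AB12CD record) is made up for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# Toy traces in a made-up format: one dict per recorded tool call.
TRACES: List[Dict[str, Any]] = [
    {"tool": "get_reservation_details", "args": {"reservation_id": "HQ8ML2"},
     "result": {"flight": "SFO -> EWR", "status": "confirmed"}},
    {"tool": "get_reservation_details", "args": {"reservation_id": "ZX9QW0"},
     "result": {"flight": "ORD -> SEA", "status": "cancelled"}},
    {"tool": "get_reservation_details", "args": {"reservation_id": "RT41KP"},
     "result": {"flight": "JFK -> LAX", "status": "confirmed"}},
    {"tool": "get_reservation_details", "args": {"reservation_id": "AB12CD"},
     "result": {"flight": "BOS -> MIA", "status": "confirmed"}},
    {"tool": "search_flights", "args": {"origin": "SFO", "destination": "EWR"},
     "result": ["UA478"]},
]


def discover_tools(traces: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each tool name to the sorted argument names seen in the traces."""
    seen = defaultdict(set)
    for call in traces:
        seen[call["tool"]].update(call["args"].keys())
    return {name: sorted(params) for name, params in seen.items()}


def score_tool(name: str, impl: Callable[..., Any],
               traces: List[Dict[str, Any]]) -> float:
    """Fraction of recorded calls whose result the mirror tool reproduces exactly."""
    calls = [c for c in traces if c["tool"] == name]
    if not calls:
        return 0.0
    hits = sum(1 for c in calls if impl(**c["args"]) == c["result"])
    return hits / len(calls)


# Mirror-side stand-in backed by a small table; AB12CD is deliberately missing.
MIRROR_DB = {
    "HQ8ML2": {"flight": "SFO -> EWR", "status": "confirmed"},
    "ZX9QW0": {"flight": "ORD -> SEA", "status": "cancelled"},
    "RT41KP": {"flight": "JFK -> LAX", "status": "confirmed"},
}


def get_reservation_details(reservation_id: str) -> Any:
    return MIRROR_DB.get(reservation_id)


if __name__ == "__main__":
    print(discover_tools(TRACES))
    print(score_tool("get_reservation_details", get_reservation_details, TRACES))
```

Here the stand-in reproduces three of the four recorded lookups, so it scores 0.75. A real scorer would likely need fuzzier comparison than strict equality, especially for simulated tools.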

Why Mirrors
What a mirror unlocks
Catch what would have broken in prod, and ship the change knowing it works.

Reproduce any bug on demand
The same seed and instructions give a byte-identical world, so the failure that paged you shows up every time.

Test the risky flows safely
Run refunds, deletes, and sends against the mirror. Your live systems never see them.

Catch regressions before they ship
Pin golden cases to recorded worlds and grade every build pass or fail.

Know if a change is better
Coverage and accuracy are scored per tool, so you ship on numbers instead of a hunch.

Sandboxes on demand
Each run gets its own mirror with on-demand launch, scale to zero, and metering by the minute.

Drive it from your own code
A versioned /v1 API and workspace keys let you run mirrors from your own apps.
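The "pin golden cases and grade every build" idea above could look roughly like the sketch below. Everything here is an assumption for illustration: `GoldenCase`, `grade`, and the two stub agents are hypothetical, the cancel prompt is invented, and comparing tool-call sequences is just one possible pass criterion. It does not use the /v1 API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class GoldenCase:
    name: str
    seed: int                        # pins the case to one recorded world
    prompt: str
    expected_tools: Tuple[str, ...]  # tool-call sequence the agent should produce


Agent = Callable[[int, str], Sequence[str]]

GOLDEN: List[GoldenCase] = [
    GoldenCase("rebook", 730412, "Rebook me onto the next morning departure.",
               ("get_reservation_details", "search_flights", "rebook_reservation")),
    GoldenCase("cancel", 730412, "Cancel reservation HQ8ML2.",  # illustrative prompt
               ("get_reservation_details", "get_user_details",
                "cancel_reservation", "issue_refund")),
]


def grade(agent: Agent, cases: Sequence[GoldenCase]) -> Dict[str, bool]:
    """Pass a case only if the agent's tool calls match the golden sequence."""
    return {c.name: tuple(agent(c.seed, c.prompt)) == c.expected_tools
            for c in cases}


def baseline_agent(seed: int, prompt: str) -> List[str]:
    """Stub agent that returns canned tool sequences instead of calling a model."""
    if prompt.startswith("Rebook"):
        return ["get_reservation_details", "search_flights", "rebook_reservation"]
    return ["get_reservation_details", "get_user_details",
            "cancel_reservation", "issue_refund"]


def regressed_agent(seed: int, prompt: str) -> List[str]:
    """Simulated regression: the new build skips the user lookup."""
    return [t for t in baseline_agent(seed, prompt) if t != "get_user_details"]


if __name__ == "__main__":
    print(grade(baseline_agent, GOLDEN))
    print(grade(regressed_agent, GOLDEN))
```

Wiring `grade` into CI and failing the build on any `False` gives the pass or fail gate described above. In this toy run only the cancel case catches the regression, because the rebook flow never called the dropped tool.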

Pricing
Start free. Scale when you're ready.
Build mirrors free, with deterministic seeding and the in-app playground. When your team needs unlimited sandboxes, the API, and SSO, we'll tailor a Custom plan.

Free
$0/mo
60 sandbox min / mo
✓ Build unlimited mirrors
✓ Deterministic seeding
✓ In-app playground
✓ Community support

Custom (for teams)
Let's talk
Built around your team
✓ Everything in Free
✓ Unlimited on-demand sandboxes
✓ Public /v1 API + keys, SSO
✓ Eval suites + fidelity reports
✓ Priority support & onboarding

Ship agent changes without the guesswork.
Build a mirror from your traces in minutes.
