Cursor auto-review vs. YOLO – picking the middle safety tier

leianixcheese

Jun 23, 2026

Agent sessions that touch builds, tests, and MCP can stack dozens of approval prompts. When every shell invocation requires a click, the practical choice narrows: babysit the run, or limit agents to single-file edits.

Flip to Run Everything — what Cursor used to call YOLO mode — and the prompts disappear. So does any pre-execution review. Vendor docs and public incident write-ups describe the downside: credential exfiltration, destructive filesystem operations, and unintended pushes to production remotes. Those outcomes are documented failure modes, not hypotheticals invented for effect.

Cursor and Anthropic (Claude Code) now treat the old binary as insufficient. Cursor 3.6 shipped Auto-review on May 29, 2026 (changelog). Anthropic shipped Claude Code auto mode on March 24, 2026 — a permissions mode “where Claude makes permission decisions on your behalf, with safeguards monitoring actions before they run” (Auto mode for Claude Code). Later Claude Code v2.1.178+ releases added subagent-specific classifier checkpoints (spawn-time, per-action, and return review); the top-level mode selector is documented separately in permission modes.

Scope note: Behavior below follows vendor documentation as of June 2026. Settings paths, tier availability, and classifier outcomes can change between releases — verify against current docs before adopting a default on production-adjacent repos. For subagent-heavy Claude Code workflows, see when to let Claude write the harness — harness trust is adjacent but not the focus here.

The sane default is not “ask always” or “ask never.” It is a middle tier configured once and revisited when the repo or threat model changes.

Why the binary failed

Approval fatigue is the obvious failure mode. Long agent runs need dozens of tool calls — reads, builds, test reruns, MCP lookups. If every shell invocation stops for a click, teams either babysit the session or abandon agents for anything beyond a one-file edit.

YOLO regret is the other side. Run Everything in Cursor passes every tool call through with no classifier and no sandbox in the loop. Claude Code’s bypassPermissions mode is the same shape: everything runs, including destructive ops, unless explicit deny rules are wired. That profile fits a disposable container. It is a poor default on a laptop with SSH keys, cloud credentials, and a main branch that deploys.

What practitioners actually wanted: longer uninterrupted runs with something between them and curl | bash. Not a security guarantee — both vendors are explicit that classifiers are probabilistic — but a filter that catches obvious bad calls and sandboxed execution for the rest.

Cursor Auto-review: allowlist → sandbox → classifier

Auto-review is the default Run Mode for new Cursor users as of 3.6. Existing users enable it under Settings → Cursor Settings → Agents → Run Mode (labeled Approvals & Execution in the 3.6 changelog).

It applies to Shell, MCP, and Fetch tool calls. Every call walks three checks in order:

1. Allowlist. Commands on the terminal allowlist or MCP tools on the MCP allowlist run immediately: no prompt, no sandbox.

2. Sandbox. If the call can run inside Cursor’s sandbox (macOS, Linux, or Windows via WSL2), it runs there with restricted filesystem and network access. Network defaults to a curated domain list unless overridden via sandbox.json.

3. Classifier. Everything else goes to an LLM subagent. It sees the current request plus any autoRun instructions from permissions.json. It returns allow or block. On block, Cursor may try a different approach or surface a normal approval prompt.

Cursor documents the classifier as non-deterministic and not a security boundary. It can allow what a human would block and block what was safe. Treat Auto-review as convenience, not compliance.
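The ordering of the three checks can be sketched in a few lines of Python. This is a toy model of the decision order described above, not Cursor's implementation. The names `ToolCall`, `Outcome`, `review`, `can_sandbox`, and `classifier_allows` are hypothetical, the allowlist match is assumed to be an exact string comparison, and the classifier is replaced by a deterministic stub even though the real one is an LLM and is not deterministic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    ALLOWLISTED = "runs immediately, no prompt, no sandbox"
    SANDBOXED = "runs inside the sandbox"
    CLASSIFIER_ALLOWED = "classifier returned allow"
    CLASSIFIER_BLOCKED = "classifier returned block"


@dataclass(frozen=True)
class ToolCall:
    kind: str    # "shell", "mcp", or "fetch"
    target: str  # command line, MCP tool name, or URL


def review(
    call: ToolCall,
    allowlist: set[str],
    can_sandbox: Callable[[ToolCall], bool],
    classifier_allows: Callable[[ToolCall], bool],
) -> Outcome:
    """Walk the three checks in order and stop at the first one that applies."""
    # 1. Allowlist (assumption: exact string match on the call target).
    if call.target in allowlist:
        return Outcome.ALLOWLISTED
    # 2. Sandbox: anything that can run sandboxed does, without a prompt.
    if can_sandbox(call):
        return Outcome.SANDBOXED
    # 3. Classifier: everything else gets an allow/block verdict.
    if classifier_allows(call):
        return Outcome.CLASSIFIER_ALLOWED
    return Outcome.CLASSIFIER_BLOCKED


# Hypothetical stand-ins for the real allowlist, sandbox check, and classifier.
ALLOWLIST = {"npm test"}


def can_sandbox(call: ToolCall) -> bool:
    return call.kind == "shell"


def classifier_allows(call: ToolCall) -> bool:
    return "prod" not in call.target


calls = [
    ToolCall("shell", "npm test"),
    ToolCall("shell", "make build"),
    ToolCall("mcp", "deploy_prod"),
    ToolCall("fetch", "https://example.com/docs"),
]
for call in calls:
    outcome = review(call, ALLOWLIST, can_sandbox, classifier_allows)
    print(f"{call.kind:5} {call.target:28} -> {outcome.name}")
```

With these stubs, `npm test` is allowlisted, `make build` lands in the sandbox, `deploy_prod` is blocked by the classifier, and the fetch is allowed by it. The sketch stops at the block verdict; per the description above, Cursor may then try a different approach or fall back to a normal approval prompt.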

Configuring the middle tier

Three surfaces matter:

| Surface | What it controls |
| --- | --- |
| Run Mode (Settings UI) | Auto-review vs Allowlist vs Allowlist (with Sandbox) vs Run Everything |
| permissions.json (~/.cursor/ and .cursor/ in the repo) | Terminal/MCP allowlists; autoRun.allow_instructions / block_instructions for the classifier |
| Protection toggles (Settings UI) | File-deletion, dotfile, external-file, and browser protections, independent of Run Mode |

The autoRun block is the interesting part. Natural-language sentences steer the classifier: steer, not enforce. Example from Cursor’s docs: block instructions like “Especially for delete operations, I like for the classifier to reject so I can have a chance to review.”

Per-user and per-repo permissions.json files concatenate, so teams can commit repo-specific guardrails without touching global config.
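A rough illustration of what that concatenation means for the classifier instructions, in Python. The key names `autoRun`, `allow_instructions`, and `block_instructions` come from the table above, but the exact nesting, the list-of-strings shape, the user-before-repo ordering, and the helper `merge_auto_run` are assumptions made for the sketch. Only the delete-operations sentence is quoted from Cursor's docs; the other instructions are invented examples. Check the current permissions.json reference before copying any of this into a real config.

```python
import json
from typing import Any

# Instruction lists named in the article; the nesting under "autoRun" is assumed.
LIST_KEYS = ("allow_instructions", "block_instructions")


def merge_auto_run(user_cfg: dict[str, Any], repo_cfg: dict[str, Any]) -> dict[str, list[str]]:
    """Concatenate autoRun instruction lists: user-level entries, then repo-level."""
    merged: dict[str, list[str]] = {}
    for key in LIST_KEYS:
        user_items = user_cfg.get("autoRun", {}).get(key, [])
        repo_items = repo_cfg.get("autoRun", {}).get(key, [])
        # Plain concatenation: nothing is overridden or de-duplicated.
        merged[key] = [*user_items, *repo_items]
    return merged


# Stand-in for ~/.cursor/permissions.json (assumed shape).
user_cfg = json.loads("""
{
  "autoRun": {
    "block_instructions": [
      "Especially for delete operations, I like for the classifier to reject so I can have a chance to review."
    ]
  }
}
""")

# Stand-in for .cursor/permissions.json in the repo (assumed shape, invented sentences).
repo_cfg = json.loads("""
{
  "autoRun": {
    "allow_instructions": ["Running the unit test suite is fine."],
    "block_instructions": ["Never push to the production remote."]
  }
}
""")

merged = merge_auto_run(user_cfg, repo_cfg)
print(json.dumps(merged, indent=2))
```

Under these assumptions the repo file adds one allow instruction and one block instruction on top of the personal default, and neither file has to know about the other.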

Run Everything (formerly YOLO) skips all three checks. Cursor’s docs say to pick it when zero prompting is desired and nothing gets screened first.

How Cursor Run Mode names changed (pre-3.6)

Before Auto-review, Cursor...

