Same Opus 4.7, 33% fewer tokens: where coding agent cost comes from

Opus 4.7 for 33% less: How Auggie beats Claude Code on cost and quality | Augment Code Skip to content

ProductAgentCosmosCode ReviewIntentCLISlackContext Engine MCPChangelog

Context EnginePricingDocsBlogResourcesCustomersCareersContextWikiStatus PageTrust CenterSecurity

svg]:px-3 flex-1" href="/contact">Talk to Salessvg]:px-3 flex-1" href="https://app.augmentcode.com/">Sign in

svg]:px-2.5 group rounded-full px-4" href="/install">Install

Back to Blog TL;DR: We benchmarked Auggie vs Claude Code on Opus 4.7. Auggie takes a modest lead in quality (67.4% vs 66.3% pass rate) while costing ~33% less, thanks to sharper retrieval that results in token efficiency. Augment’s Context Engine was built to deliver high-quality results on large, complex codebases. As frontier models have improved, engineering leaders’ questions have shifted from “can it do this?” to “what does it cost at our scale?”. Usage is exploding, and token spend is now a board-level line item. Because OpenAI and Anthropic dominate the frontier-model market, neither is motivated to make coding agents cheaper to run. For Augment, token efficiency is a key differentiator and point of pride. Below we show a head-to-head comparison between Augment’s agent Auggie and Claude Code on Opus 4.7. The headline: matched quality, at 33% less cost. Combined with optimal model routing with Prism, Augment customers can expect to save up to 50% on state of the art models to get the same quality output. Same model, 33% discount: Terminal Bench 2.0 on Opus 4.7 We ran Terminal Bench 2.0 with Auggie CLI and Claude Code head to head using Opus 4.7 and default settings, on a GCP n4-highcpu-16 VM (16 vCPU, 32 GB RAM). The benchmark was run via the Harbor framework with five attempts per task and four tasks executing in parallel.

Same model, 32% fewer tokens, 33% lower spend.

The pass-rate gap (1.1%) sits inside the variance we see across runs of any single benchmark, but the cost gap doesn't. And in the table below you can see where the savings come from: reduced tokens. Cache reads, the volume of historical context replayed each turn, drop by 32%. Output tokens by 37%. That's the Context Engine and our harness doing what it was built to do: less wasted exploration, fewer expensive turns. [role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Token category (Opus 4.7)[role=checkbox]]:translate-y-[2px]">Auggie CLI[role=checkbox]]:translate-y-[2px]">Claude Code[role=checkbox]]:translate-y-[2px]">Delta[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Total tokens[role=checkbox]]:translate-y-[2px]">367,587,892[role=checkbox]]:translate-y-[2px]">543,090,485[role=checkbox]]:translate-y-[2px]">−32%[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Output tokens[role=checkbox]]:translate-y-[2px]">7,217,279[role=checkbox]]:translate-y-[2px]">11,381,425[role=checkbox]]:translate-y-[2px]">−37%[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Cache read tokens[role=checkbox]]:translate-y-[2px]">341,980,440[role=checkbox]]:translate-y-[2px]">506,455,124[role=checkbox]]:translate-y-[2px]">−32%[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Cache write tokens[role=checkbox]]:translate-y-[2px]">17,960,193[role=checkbox]]:translate-y-[2px]">25,219,909[role=checkbox]]:translate-y-[2px]">−29%[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Total cost (USD)[role=checkbox]]:translate-y-[2px]">$463.04[role=checkbox]]:translate-y-[2px]">$694.50[role=checkbox]]:translate-y-[2px]">−33% Auggie on SWE-Bench Pro: Higher Quality, 23% Lower Cost The same pattern holds on SWE-Bench Pro, a widely recognized benchmark for coding tasks. We ran it on the same head-to-head setup, three attempts per task, eight batches in parallel.

Harder benchmark, same shape: ahead on quality, 23% cheaper per task.

Auggie edges ahead on quality and is still 23% cheaper per run. [role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Token category (Opus 4.7)[role=checkbox]]:translate-y-[2px]">Auggie CLI[role=checkbox]]:translate-y-[2px]">Claude Code[role=checkbox]]:translate-y-[2px]">Delta[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Total tokens[role=checkbox]]:translate-y-[2px]">1,651,716,301[role=checkbox]]:translate-y-[2px]">2,349,143,356[role=checkbox]]:translate-y-[2px]">−30%[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Cache read tokens[role=checkbox]]:translate-y-[2px]">1,582,841,271[role=checkbox]]:translate-y-[2px]">2,269,905,161[role=checkbox]]:translate-y-[2px]">−30%[role=checkbox]]:translate-y-[2px] font-bold" style="border-right:1px solid var(--neutral-gray)">Cache write...

Same Opus 4.7, 33% fewer tokens: where coding agent cost comes from

Related Articles

Elevated error rates on requests to multiple models

Donald Trump and sons to be 'forever' exempt from tax audits

PopuLoRA: Co-Evolving LLM Populations for Reasoning Self- Play

Old Reddit Is Down

The ultimate female fantasy – A feminist critique of Beauty and the Beast