OrcaRouter — Frontier AI quality at open-source prices
Quality-graded routing — not a blind cheap-model swap.<br>Frontier quality. Open-source prices.<br>OrcaRouter grades every prompt and routes it — hard reasoning to frontier models, routine work to open-source — so you keep frontier answer quality while cutting spend ~40%. Quality-checked, never a blind downgrade. Zero token markup. One line to switch.<br>Get your API key →Browse models<br>PythonTypeScriptcURL<br>- client = OpenAI(api_key="sk-...")<br>+ client = OpenAI(<br>+ base_url="https://api.orcarouter.ai/v1",<br>+ api_key="sk-orca-..."<br>+ )
# Everything else stays the same.<br>response = client.chat.completions.create(<br>model="orcarouter/auto", # router picks the best model per request<br>messages=[{"role": "user", "content": "..."}]<br># → orcarouter/auto grades the prompt → frontier or open-source, zero token markup ✓
claude-opus-4-7$5.00 in·$25.00 outAnthropic Direct<br>claude-sonnet-4-6$3.00 in·$15.00 outAnthropic Direct<br>gpt-5.5$5.00 in·$30.00 outOpenAI Direct<br>gemini-3.1-pro-preview$4.00 in·$18.00 outGoogle Direct<br>deepseek-v4-pro$0.560 in·$1.12 outDeepSeek<br>qwen3.6-plus$0.500 in·$3.00 outAlibaba Cloud<br>kimi-k2.6$0.900 in·$3.75 outMoonshot<br>seedance-2.0from $0.07 /sec·—ByteDance<br>claude-opus-4-7$5.00 in·$25.00 outAnthropic Direct<br>claude-sonnet-4-6$3.00 in·$15.00 outAnthropic Direct<br>gpt-5.5$5.00 in·$30.00 outOpenAI Direct<br>gemini-3.1-pro-preview$4.00 in·$18.00 outGoogle Direct<br>deepseek-v4-pro$0.560 in·$1.12 outDeepSeek<br>qwen3.6-plus$0.500 in·$3.00 outAlibaba Cloud<br>kimi-k2.6$0.900 in·$3.75 outMoonshot<br>seedance-2.0from $0.07 /sec·—ByteDance
0%<br>routing markup. ever.
~40%<br>less spend — same answer quality
200+<br>models, one endpoint
routing overhead added
Building on this? Talk to us.<br>Feedback shapes the next ship.
XDiscordGitHub★
How routing works<br>Routing you can see and control.
Graded, then direct<br>Every prompt is graded, then sent straight to the chosen model's first-party provider — no reseller in the middle. The grade, model, and provider are shown on each request, so a route is never a black box.
Provider terms apply end-to-end<br>Each upstream provider's data and usage terms apply directly to your traffic. Pin routing to the models and providers that match your policy.
🧾<br>Per-request auditability<br>Every call records the difficulty grade, the model chosen, the provider, and the published price. Reproduce any routing decision later from the dashboard.
Setup<br>Live in 60 seconds.<br>One URL change. Your existing SDK, model names, and streaming all work exactly as before.
Step 1<br>🔗<br>Point your SDK at us<br>Set base_url to api.orcarouter.ai/v1 and swap your API key. No other code changes needed.
Step 2<br>We grade & route<br>Each prompt is graded for difficulty in under 1ms, then sent to the model that holds answer quality at the lowest cost — frontier for hard reasoning, open-source for routine work.
Step 3<br>You pay provider cost<br>Whichever model is chosen, traffic goes direct to its first-party provider at their published rate. We add exactly $0 per token — our fee is on the plan, not your usage.
Live pricing<br>Every model.<br>Best available rate.<br>Quality-graded routing to the right model — frontier or open-source. Live prices refresh every 60s.
View all 200+ models →
ModelRouted toInput /MOutput /MContextQualityclaude-opus-4-7Anthropic Direct$5.00$25.001M10.0claude-sonnet-4-6Anthropic Direct$3.00$15.001M7.0gpt-5.5OpenAI Direct$5.00$30.001M10.0gemini-3.1-pro-previewGoogle Direct$4.00$18.001M10.0deepseek-v4-proDeepSeek$0.560$1.121M9.0qwen3.6-plusAlibaba Cloud$0.500$3.001M8.0kimi-k2.6Moonshot$0.900$3.75256K9.0seedance-2.0ByteDancefrom $0.07 /sec——10.0+ 194 more models · Prices update every 60 seconds
Platform<br>Production-grade from day one.<br>Everything you need to run AI in production without managing multiple provider integrations.
Quality-graded routing<br>Every prompt is graded and sent to the model that holds quality at the lowest cost — frontier when it's hard, open-source when it's routine. Automatic.
Automatic failover<br>Provider goes down mid-stream? We switch transparently. Your app sees zero errors.
🔑<br>API key management<br>Issue keys per team or service with spend caps, model allowlists, and rate limits built in.
Per-request cost tracking<br>See what every request cost, the model and provider that handled it, and how much grading saved you.
OpenAI-compatible<br>Change one line. Same SDK, same model names, same streaming format. Zero migration effort.
🛡<br>Budget enforcement<br>Hard and soft limits per key, team, or org. Auto-resets monthly. Slack + webhook alerts.
Differentiator<br>Glass-box routing.<br>Every request shows the difficulty grade, the model chosen, the provider that served it, and the published price. Verifiable per call, reproducible later.
🔍<br>Per-request route attribution<br>Each completion is tagged with the difficulty grade, the model picked, and its first-party provider — Anthropic Direct, OpenAI Direct, Bedrock, Vertex — surfaced in your dashboard and headers.
📒<br>Published-rate...