DeepSeek V4, R1, Qwen 3.6, Kimi K2.6 API · 20% cheaper · QuickSilver Pro
Launch bonusWe match 100% of your first credit purchase — up to $50 freeOpen-source inference,20% below the rest.<br>The 7 most popular open-source models — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.6 & 3.5, Kimi K2.6 — through an OpenAI-compatible API. Cheaper than every other reseller. Change one line of code.<br>Get API KeyView Pricing<br>or try the models live on HuggingFace — no signup required.<br>No subscription<br>OpenAI compatible<br>Pay as you go<br>Drop-in withOpenAI SDK·<br>Aider·<br>Cursor·<br>Cline·<br>Continue.dev·<br>LangChain·<br>Vercel AI SDK
pythonCopy<br>1# One line change. That's it.<br>2from openai import OpenAI<br>4client = OpenAI(<br>5 base_url="https://api.quicksilverpro.io/v1",<br>6 api_key="your-api-key",<br>7)
Model<br>Context<br>Input<br>Output<br>Savings
DeepSeekDeepSeek V4 FlashNew<br>deepseek-v4-flashfast chat & coding, 1M context, thinking on by default
1M<br>$0.11$0.14<br>$0.22$0.28<br>−21%
DeepSeekDeepSeek V4 ProNew<br>deepseek-v4-propremium reasoning, 1M context
1M<br>$0.35$0.435<br>$0.70$0.87<br>−20%
DeepSeekDeepSeek V3<br>deepseek-v3chat, coding, structured output
128K<br>$0.24$0.30<br>$0.70$0.88<br>−20%
DeepSeekDeepSeek R1Reasoning<br>deepseek-r1math, multi-step reasoning, logic
128K<br>$0.40$0.50<br>$1.70$2.15<br>−20%
QwenQwen3.6-35B-A3BNew<br>qwen3.6-35blong-context RAG, drop-in 3.5 upgrade
262K<br>$0.13$0.16<br>$0.78$0.97<br>−19%
QwenQwen3.5-35B-A3B<br>qwen3.5-35blong-context RAG, summarization
262K<br>$0.13$0.16<br>$1.00$1.25<br>−20%
KimiKimi K2.6<br>kimi-k2.6Opus-class agentic / planning
256K<br>$0.60$0.74<br>$3.73$4.66<br>−20%
GeminiGemini 2.5 FlashNew<br>gemini-2.5-flashmultimodal chat, 1M context
1M<br>$0.255$0.30<br>$2.125$2.50<br>−15%
GeminiGemini 2.5 Flash ImageNew<br>gemini-2.5-flash-imageimage generation
1M<br>$0.255$0.30<br>$25.50$30.00<br>−15%
GeminiGemini 2.5 Flash LiteNew<br>gemini-2.5-flash-litehigh-volume cheap tasks
1M<br>$0.085$0.10<br>$0.34$0.40<br>−15%
GeminiGemini 3 Flash PreviewNew<br>gemini-3-flash-previewnext-gen flash reasoning
1M<br>$0.425$0.50<br>$2.55$3.00<br>−15%
GeminiGemini 3 Pro Image PreviewNew<br>gemini-3-pro-image-previewpro-grade image generation
1M<br>$1.70$2.00<br>$102.00$120.00<br>−15%
GeminiGemini 3.1 Pro PreviewNew<br>gemini-3.1-pro-previewflagship reasoning, 1M context
1M<br>$1.70$2.00<br>$10.20$12.00<br>−15%
GeminiGemini 3.5 FlashNew<br>gemini-3.5-flashnext-gen Flash GA, 1M context
1M<br>$1.275$1.50<br>$7.65$9.00<br>−15%
Compared against OpenRouter, Together AI, and Fireworks AI. Prices as of April 2026.<br>Side-by-side pricing vs every competitor<br>vs OpenRouter<br>20% cheaper<br>vs Together AI<br>76% on R1<br>vs Fireworks<br>79% on R1<br>vs DeepInfra<br>Lower list<br>vs OpenAI<br>Up to 35x
Coding<br>DeepSeek V3 for tool-calling agents →<br>Reasoning<br>DeepSeek R1 for math & algorithms →<br>Long context<br>Qwen3.5-35B-A3B for 262K RAG →
See all comparisons →
DeepSeekDeepSeek V4 Flash<br>1M ctx, thinks by default, ~50% cheaper than V3
DeepSeekDeepSeek V4 Pro<br>premium reasoning, 1M context
DeepSeekDeepSeek V3<br>general chat, coding, tool calling
DeepSeekDeepSeek R1<br>reasoning, math, o1-equivalent
QwenQwen3.6-35B-A3B<br>262K long-context, MoE upgrade
QwenQwen3.5-35B-A3B<br>262K long-context, RAG
KimiKimi K2.6<br>Opus-class reasoning, 256K
GeminiGemini 2.5 Flash<br>1M context, multimodal, thinking
GeminiGemini 2.5 Flash Image<br>1M context, image generation
GeminiGemini 2.5 Flash Lite<br>cheapest Gemini, 1M context
GeminiGemini 3 Flash Preview<br>next-gen flash, 1M context
GeminiGemini 3 Pro Image Preview<br>pro image generation
GeminiGemini 3.1 Pro Preview<br>flagship reasoning, 1M context
GeminiGemini 3.5 Flash<br>next-gen Flash GA, 1M context
Common totals (10:1 input/output):1M10M100M<br>Thinking model — output token counts include the reasoning trace, which is typically 3-10× the visible reply.<br>Input tokens / month1M
Output tokens / month300K
QuickSilver Pro<br>$0.18cheapest
OpenRouter<br>$0.22+27%
OpenAIclosed model analog<br>$0.33+87%
QSP saves 5¢/month vs OpenRouter (21% cheaper).
CLIqsp<br>Built for terminals and AI agents. --json output with stable exit codes — Claude Code, Cursor, Aider can call it without parsing HTML.<br>PyPIGitHubQuickstart →
What is QuickSilver Pro?An OpenAI-compatible HTTP API for 7 top open-source LLMs — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.6 & 3.5-35B-A3B, and Kimi K2.6. Point the official OpenAI SDK at our base URL and get the same chat-completions interface, 20% below competing resellers.
What's the difference between V3 and V4 Flash?V4 Flash is DeepSeek's newest model (released April 2026): ~50% cheaper output than V3, 1M context vs 128K, and thinks by default (chain-of-thought reasoning) — so a one-token "Hi" can return ~175 reasoning tokens. For V3-style cheap chat without the thinking overhead, pass `reasoning: { enabled: false }` in the request body. Existing V3 keeps working unchanged.
How much cheaper than OpenRouter / OpenAI?20% below the public per-token rates at OpenRouter, Together AI, Fireworks AI, and DeepInfra on the same open-source models. V4 Flash: $0.11 / $0.22. V4 Pro: $0.35 / $0.70. V3: $0.24 / $0.70. R1: $0.40 / $1.70. Qwen 3.6: $0.13 / $0.78. Qwen 3.5: $0.13 / $1.00....