QuickSilver Pro – OpenAI-Compatible Platform for DeepSeek V4 and Qwen

charlei1 pts1 comments

DeepSeek V4, R1, Qwen 3.6, Kimi K2.6 API · 20% cheaper · QuickSilver Pro

Launch bonusWe match 100% of your first credit purchase — up to $50 freeOpen-source inference,20% below the rest.<br>The 7 most popular open-source models — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.6 & 3.5, Kimi K2.6 — through an OpenAI-compatible API. Cheaper than every other reseller. Change one line of code.<br>Get API KeyView Pricing<br>or try the models live on HuggingFace — no signup required.<br>No subscription<br>OpenAI compatible<br>Pay as you go<br>Drop-in withOpenAI SDK·<br>Aider·<br>Cursor·<br>Cline·<br>Continue.dev·<br>LangChain·<br>Vercel AI SDK

pythonCopy<br>1# One line change. That's it.<br>2from openai import OpenAI<br>4client = OpenAI(<br>5 base_url="https://api.quicksilverpro.io/v1",<br>6 api_key="your-api-key",<br>7)

Model<br>Context<br>Input<br>Output<br>Savings

DeepSeekDeepSeek V4 FlashNew<br>deepseek-v4-flashfast chat & coding, 1M context, thinking on by default

1M<br>$0.11$0.14<br>$0.22$0.28<br>−21%

DeepSeekDeepSeek V4 ProNew<br>deepseek-v4-propremium reasoning, 1M context

1M<br>$0.35$0.435<br>$0.70$0.87<br>−20%

DeepSeekDeepSeek V3<br>deepseek-v3chat, coding, structured output

128K<br>$0.24$0.30<br>$0.70$0.88<br>−20%

DeepSeekDeepSeek R1Reasoning<br>deepseek-r1math, multi-step reasoning, logic

128K<br>$0.40$0.50<br>$1.70$2.15<br>−20%

QwenQwen3.6-35B-A3BNew<br>qwen3.6-35blong-context RAG, drop-in 3.5 upgrade

262K<br>$0.13$0.16<br>$0.78$0.97<br>−19%

QwenQwen3.5-35B-A3B<br>qwen3.5-35blong-context RAG, summarization

262K<br>$0.13$0.16<br>$1.00$1.25<br>−20%

KimiKimi K2.6<br>kimi-k2.6Opus-class agentic / planning

256K<br>$0.60$0.74<br>$3.73$4.66<br>−20%

GeminiGemini 2.5 FlashNew<br>gemini-2.5-flashmultimodal chat, 1M context

1M<br>$0.255$0.30<br>$2.125$2.50<br>−15%

GeminiGemini 2.5 Flash ImageNew<br>gemini-2.5-flash-imageimage generation

1M<br>$0.255$0.30<br>$25.50$30.00<br>−15%

GeminiGemini 2.5 Flash LiteNew<br>gemini-2.5-flash-litehigh-volume cheap tasks

1M<br>$0.085$0.10<br>$0.34$0.40<br>−15%

GeminiGemini 3 Flash PreviewNew<br>gemini-3-flash-previewnext-gen flash reasoning

1M<br>$0.425$0.50<br>$2.55$3.00<br>−15%

GeminiGemini 3 Pro Image PreviewNew<br>gemini-3-pro-image-previewpro-grade image generation

1M<br>$1.70$2.00<br>$102.00$120.00<br>−15%

GeminiGemini 3.1 Pro PreviewNew<br>gemini-3.1-pro-previewflagship reasoning, 1M context

1M<br>$1.70$2.00<br>$10.20$12.00<br>−15%

GeminiGemini 3.5 FlashNew<br>gemini-3.5-flashnext-gen Flash GA, 1M context

1M<br>$1.275$1.50<br>$7.65$9.00<br>−15%

Compared against OpenRouter, Together AI, and Fireworks AI. Prices as of April 2026.<br>Side-by-side pricing vs every competitor<br>vs OpenRouter<br>20% cheaper<br>vs Together AI<br>76% on R1<br>vs Fireworks<br>79% on R1<br>vs DeepInfra<br>Lower list<br>vs OpenAI<br>Up to 35x

Coding<br>DeepSeek V3 for tool-calling agents →<br>Reasoning<br>DeepSeek R1 for math & algorithms →<br>Long context<br>Qwen3.5-35B-A3B for 262K RAG →

See all comparisons →

DeepSeekDeepSeek V4 Flash<br>1M ctx, thinks by default, ~50% cheaper than V3

DeepSeekDeepSeek V4 Pro<br>premium reasoning, 1M context

DeepSeekDeepSeek V3<br>general chat, coding, tool calling

DeepSeekDeepSeek R1<br>reasoning, math, o1-equivalent

QwenQwen3.6-35B-A3B<br>262K long-context, MoE upgrade

QwenQwen3.5-35B-A3B<br>262K long-context, RAG

KimiKimi K2.6<br>Opus-class reasoning, 256K

GeminiGemini 2.5 Flash<br>1M context, multimodal, thinking

GeminiGemini 2.5 Flash Image<br>1M context, image generation

GeminiGemini 2.5 Flash Lite<br>cheapest Gemini, 1M context

GeminiGemini 3 Flash Preview<br>next-gen flash, 1M context

GeminiGemini 3 Pro Image Preview<br>pro image generation

GeminiGemini 3.1 Pro Preview<br>flagship reasoning, 1M context

GeminiGemini 3.5 Flash<br>next-gen Flash GA, 1M context

Common totals (10:1 input/output):1M10M100M<br>Thinking model — output token counts include the reasoning trace, which is typically 3-10× the visible reply.<br>Input tokens / month1M

Output tokens / month300K

QuickSilver Pro<br>$0.18cheapest

OpenRouter<br>$0.22+27%

OpenAIclosed model analog<br>$0.33+87%

QSP saves 5¢/month vs OpenRouter (21% cheaper).

CLIqsp<br>Built for terminals and AI agents. --json output with stable exit codes — Claude Code, Cursor, Aider can call it without parsing HTML.<br>PyPIGitHubQuickstart →

What is QuickSilver Pro?An OpenAI-compatible HTTP API for 7 top open-source LLMs — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.6 & 3.5-35B-A3B, and Kimi K2.6. Point the official OpenAI SDK at our base URL and get the same chat-completions interface, 20% below competing resellers.

What's the difference between V3 and V4 Flash?V4 Flash is DeepSeek's newest model (released April 2026): ~50% cheaper output than V3, 1M context vs 128K, and thinks by default (chain-of-thought reasoning) — so a one-token "Hi" can return ~175 reasoning tokens. For V3-style cheap chat without the thinking overhead, pass `reasoning: { enabled: false }` in the request body. Existing V3 keeps working unchanged.

How much cheaper than OpenRouter / OpenAI?20% below the public per-token rates at OpenRouter, Together AI, Fireworks AI, and DeepInfra on the same open-source models. V4 Flash: $0.11 / $0.22. V4 Pro: $0.35 / $0.70. V3: $0.24 / $0.70. R1: $0.40 / $1.70. Qwen 3.6: $0.13 / $0.78. Qwen 3.5: $0.13 / $1.00....

flash context geminigemini reasoning deepseek openai

Related Articles