Subquadratic — Efficiency is Intelligence<br>Contact salesRequest early access →
The first modelbuilt for long‑context tasks<br>SubQ is a sub-quadratic LLM built for multi-million token reasoning, allowing agents to work across full repositories, long histories, and persistent state without quality loss.<br>Request early access →
Use Cases<br>All your context. Always available.<br>Reason across millions of tokens in one prompt: entire repos, whole artifacts, and long-running agent state, with room to spare at a fraction of the cost.<br>Tokens012M<br>Python source code<br>The entire 3.13 standard library
~5.1M
Six months of React PRs<br>~1,050 pull requests against the React codebase
~7.5M
~ Approximate token counts.
Architecture<br>Not just another model.An architectural breakthrough.<br>SubQ is the first model built on a fully sub-quadratic sparse-attention architecture. LLMs today waste compute by processing every possible relationship between words, but only a small fraction of these relationships matter.<br>SubQ finds and focuses only on those, ensuring compute is used where it matters most. At 12M tokens, this reduces attention compute almost 1,000×, changing the way LLMs scale.<br>Technical report →
TransformerO(n²)<br>SubQO(n)
Benchmarks<br>A leader in long-context retrieval and reasoning tasks<br>Long context retrieval<br>SubQ has near-perfect performance on single-fact retrieval and multi-task retrieval, both at scale.<br>Multi-task retrievalRULER (128K)<br>99.12%
128K
Single-fact retrievalNeedle-in-a-haystack (1M–12M)<br>100%
1M
100%
2M
98%
6M
98%
12M
Reasoning & knowledge<br>SubQ balances long-context retrieval without compromising on reasoning and knowledge.<br>BenchmarkSubQ 1.1 SmallGPT-5.5Opus 4.8Sonnet 4.6GPT-5.4-miniGPT-5.4-nanoHaiku 4.5Graduate-level science<br>GPQA Diamond · pass@1<br>85.493.29287.587.581.767.2Agentic finance<br>AutomationBench<br>13%18%16%8%0%n/r3%Competitive programming<br>LiveCodeBench v6 · pass@4<br>89.79292.288.978.678.269.7<br>n/r = result not reported by the model provider
Unrivaled efficiency<br>SubQ uses 64.5x less compute than dense attention, and is 56× faster than FlashAttention-2 at 1M-token context.
Third-party validated results →Technical report →
Products<br>Two ways to use SubQ.<br>The full-context API for developers and enterprise teams. Process full repositories and pipeline states in a single API call at linear cost.<br>→ 12M token context window<br>→ Streaming + tool use<br>→ OpenAI-compatible endpoints<br>Request API access →
The long-context layer for coding agents. Plug into Claude Code, Codex, and Cursor to map codebases, gather context, and answer token-heavy questions faster.<br>→ Auto-redirects expensive model turns<br>→ One-line install<br>Request SubQ Code access →
Research<br>From the lab.
ResearchJune 16, 2026<br>Introducing SubQ 1.1 Small<br>Read more →<br>PartnershipsMay 14, 2026<br>We're Partnering with LayerLens to Evaluate SubQ<br>Read more →<br>ProductMay 5, 2026<br>Introducing SubQ: The First Fully Subquadratic LLM<br>Read more →<br>ResearchUpdated May 15, 2026<br>How SSA Makes Long Context Practical<br>Read more →
About<br>We built the architecture the industry said wasn't possible.<br>Subquadratic is a frontier AI research and infrastructure company building a new class of LLMs. While other major labs focus on incremental improvements to Transformer models, we're pushing foundational change at the model architecture level — enabling large-context, multi-modal inference that scales efficiently where transformers can't.<br>Built by researchers from<br>Meta<br>Google<br>Oxford<br>Cambridge<br>BYU
Early Access<br>Is your business ready?<br>Build with us.<br>Join the private preview.
Send MessageBY SUBMITTING THIS FORM, YOU AGREE TO OUR PRIVACY POLICY AND CONSENT TO RECEIVE MARKETING COMMUNICATIONS FROM SUBQUADRATIC. YOU CAN UNSUBSCRIBE AT ANY TIME.