Open Source LLM Statistics and Trends (2026)

30+ Open Source LLM Statistics & Trends (2026) - OpenLLMStack

Open source LLMs went from a research curiosity to the backbone of real production systems in under three years. They now power coding assistants, agents, and enterprise pipelines, and in some fields they’ve overtaken the proprietary models that defined the early days of the boom.

So how big is the open source LLM ecosystem today? And how fast is it actually growing?

I pulled together the most useful open source LLM statistics I could find, all from primary sources like the Stanford AI Index, Hugging Face, Meta, OpenRouter, and the Stack Overflow Developer Survey. Each section links to the original sources so you can cite them directly.

Top open source LLM statistics

This article prioritizes first-party reports, research papers, official documentation, and original datasets.

| Statistic | Value | Scope and date | Source |
| --- | --- | --- | --- |
| Public models hosted on Hugging Face | More than 2 million | All public model repositories, not only LLMs; 2025 | Hugging Face |
| Downloads captured by the top 200 models | 49.6% | Hugging Face downloads; 2025 | Hugging Face |
| Direct derivative models in the Qwen family | More than 113,000 | Hugging Face repositories; March 2026 | Hugging Face |
| Llama downloads | More than 1 billion | Cumulative downloads reported in March 2025 | Meta |
| Open source share of token usage | Roughly one-third | OpenRouter, late 2025 | OpenRouter |
| Tokens processed by DeepSeek models | 14.37 trillion | OpenRouter, November 2024-November 2025 | OpenRouter |
| Open-vs.-closed performance gap | 3.3% | Top Arena models, March 2026 | Stanford AI Index 2026 |
| OpenRouter annualized token run rate | About 1.5 quadrillion | All models on OpenRouter, May 2026 | Menlo Ventures |
| Historical inference cost decline | About 10x per year | Equivalent MMLU performance in a 2024 analysis | Andreessen Horowitz |

What is an open source LLM?

An open source LLM generally refers to a language model that people can download, run, and modify. Compared with proprietary models that are only accessible through an API, open source LLMs give developers much greater control over deployment, customization, and infrastructure.

The term open source is often used loosely. Many models described as open source are actually released as open-weight models. Their weights are publicly available but the license may include restrictions that differ from a traditional open source software license. Because the industry commonly uses “open source LLM” to refer to both categories, this article follows that convention for simplicity.

How many open source LLMs are there?

There is no authoritative global count of open source LLMs. Hugging Face hosted more than 2 million public model repositories in 2025, but that total includes models for text, image, audio, robotics, and other tasks, as well as fine-tunes, adapters, quantizations, and derivatives.

The broader Hugging Face ecosystem is still useful for measuring the scale and direction of open model development:

| Hugging Face statistic | Value | What it measures |
| --- | --- | --- |
| Registered users | 13 million | Platform community size |
| Public model repositories | More than 2 million | All model types and derivatives |
| Public datasets | More than 500,000 | All dataset categories |
| Models with fewer than 200 downloads | About 50% | The long tail of model repositories |
| Share of downloads captured by the top 200 models | 49.6% | Concentration among the most-used repositories |
| Fortune 500 companies with verified accounts | More than 30% | Organizational presence, not confirmed production adoption |
| Industry share of model development | 37% | Down from roughly 70% before 2022 |
| Downloads attributed to unaffiliated developers | 39% | Up from 17% before 2022 |
| Mean size of a downloaded model | 20.8B parameters | Up from 827M in 2023 |
| Median size of a downloaded model | 406M parameters | Up from 326M in 2023 |
| Mean engagement period after release | About 6 weeks | How long models typically sustain attention |

Source: Hugging Face

The mean downloaded model size grew by about 25x between 2023 and 2025, while the median grew by only about 25%. That difference suggests a relatively small number of large models are pulling up the average while small models remain common.
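The two growth figures follow directly from the parameter counts in the Hugging Face table. Here is a minimal Python sketch of that arithmetic, using only the four values reported there (the variable names are illustrative, not from the source):

```python
# Parameter counts taken from the Hugging Face table (mean and median
# size of a downloaded model, 2023 versus the latest reported value).
mean_2023, mean_2025 = 827e6, 20.8e9
median_2023, median_2025 = 326e6, 406e6

# Multiplicative growth of the mean (a "times larger" figure).
mean_growth = mean_2025 / mean_2023

# Fractional growth of the median (a percentage increase).
median_growth = median_2025 / median_2023 - 1

print(f"Mean grew about {mean_growth:.1f}x")
print(f"Median grew about {median_growth:.0%}")
```

A mean that grows roughly a hundred times faster than the median is the usual signature of a skewed distribution, which is consistent with the long-tail reading above.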

Open source LLM adoption and usage

Open models have moved well past experimentation. By late 2025, they reached roughly one-third of all token volume on OpenRouter, a unified API platform that gives developers access to hundreds of AI models. The underlying study analyzed more than 100 trillion tokens over a year, covering November 2024 through November 2025.

| Open model developer | Tokens processed on OpenRouter |
| --- | --- |
| DeepSeek | 14.37 trillion |
| Qwen | 5.59 trillion |
| Meta Llama | 3.96 trillion |
| Mistral AI | 2.92 trillion |
| OpenAI | 1.65 trillion |
| MiniMax | 1.26 trillion |
| Z.ai | 1.18 trillion |
| TNGTech | 1.13 trillion |
| Moonshot AI | 0.92 trillion |
| Google | 0.82 trillion |

Source: OpenRouter
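To compare the developers in the table, it can help to compute ratios and relative shares. The Python sketch below does that with the table's figures. The shares are relative to the ten listed developers only, which is an assumption made for illustration: developers outside the table are not counted, so these are not shares of all open model traffic on OpenRouter.

```python
# Tokens processed on OpenRouter, in trillions, copied from the table.
tokens_trillions = {
    "DeepSeek": 14.37,
    "Qwen": 5.59,
    "Meta Llama": 3.96,
    "Mistral AI": 2.92,
    "OpenAI": 1.65,
    "MiniMax": 1.26,
    "Z.ai": 1.18,
    "TNGTech": 1.13,
    "Moonshot AI": 0.92,
    "Google": 0.82,
}

# Total across the ten listed developers only (not the whole platform).
listed_total = sum(tokens_trillions.values())

# Ratio between the two largest families in the table.
ratio = tokens_trillions["DeepSeek"] / tokens_trillions["Qwen"]
print(f"DeepSeek / Qwen: {ratio:.2f}x")

# Each developer's share of the listed total, largest first.
ranked = sorted(tokens_trillions.items(), key=lambda kv: kv[1], reverse=True)
for developer, tokens in ranked:
    share = tokens / listed_total
    print(f"{developer:<12} {tokens:>6.2f}T  {share:6.1%}")
```

These are cumulative totals over the study period, so they say nothing about how the shares shifted from month to month.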

DeepSeek processed about 2.6 times as many tokens as Qwen, the second-largest open source family in the study. By late 2025, however, no individual model consistently accounted for more than 20%-25% of open source model tokens. Usage had spread across five to seven...
