AI Now Writes as Many Online Articles as Humans



Key Takeaways<br>The number of primarily AI-generated articles published on the internet (50%) now equals the number written by humans (50%).<br>ChatGPT launched in November 2022. Within 12 months, the share of primarily AI-generated articles jumped to 36%; by 24 months it had reached 48%.<br>However, since Q1 2025 the share of primarily AI-generated articles has plateaued at roughly 50%. We previously published this finding with data up to May 2025, and new data confirms the trend.<br>We build on our prior research by using three different AI detectors (Pangram, GPTZero, Copyleaks). We independently evaluate each detector and find that its false positive rate and average false negative rate are consistently below 2%. Each AI detector shows a similar trend.<br>While the trend is the same, our previous study estimated the proportion of primarily AI-generated articles to be 3.3 percentage points higher. This relatively small difference is the result of averaging three AI detectors rather than relying on the accuracy of a single detector.<br>Despite the prevalence of AI-generated articles on the web, we show in a separate study that these articles largely do not appear in Google and ChatGPT. We do not evaluate whether AI-generated articles get as much traffic as human-written articles, but we suspect that they do not.<br>Motivation<br>Since ChatGPT launched in November 2022, many companies have explored publishing content generated by LLMs such as ChatGPT, Claude, and Gemini to grow their traffic across channels such as Google Search, social, and advertising. This is a cost-effective alternative to paying humans hundreds of dollars to write content.<br>The quality of AI content is rapidly improving. In many cases, AI-generated content is as good as or better than content written by humans (MIT Study). It is often hard for people to tell whether content was created by AI (Originality.ai Study).<br>We seek to evaluate the prevalence of AI-generated articles.<br>Results<br>We observe significant growth in primarily AI-generated articles, coinciding with the launch of ChatGPT in November 2022. After only 12 months, primarily AI-generated articles accounted for 35.9% of articles published.<br>In Q1 2025, the quantity of primarily AI-generated articles being published on the web nearly equaled the quantity of human-written articles: 49.6% vs. 50.4%. In Q4 2025, primarily AI-generated articles surpassed human-written articles, at 50.9%, before returning to 49.9% in Q1 2026.
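The headline percentages are described as an average across three AI detectors. The exact averaging procedure is not spelled out, so the following is only a minimal sketch that assumes an unweighted mean of each detector's per-quarter share of articles labeled primarily AI. The function names and the toy labels are hypothetical and are not study data.

```python
from statistics import mean


def ai_share(labels: list[bool]) -> float:
    """Fraction of articles a single detector labeled as primarily AI."""
    if not labels:
        raise ValueError("need at least one label")
    return sum(labels) / len(labels)


def averaged_ai_share(labels_by_detector: dict[str, list[bool]]) -> float:
    """Unweighted mean of the per-detector shares (an assumed procedure)."""
    return mean(ai_share(labels) for labels in labels_by_detector.values())


# Toy labels for four articles in one quarter (illustrative only).
toy_quarter = {
    "pangram": [True, True, False, False],
    "copyleaks": [True, False, False, False],
    "gptzero": [True, True, True, False],
}
quarter_share = averaged_ai_share(toy_quarter)
```

Averaging in this way means that a single detector's over- or under-estimate shifts the combined figure by only a third of its own error.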

Source Data<br>Primarily AI-Generated Article Growth Has Plateaued<br>While primarily AI-generated articles grew dramatically after ChatGPT launched, we do not see that trend continuing. Instead, the proportion of primarily AI-generated articles has remained relatively stable, near 50%, over the last five quarters. We hypothesize that this is because practitioners found that primarily AI-generated articles do not perform well in search, as shown in a separate study.<br>Methodology<br>Common Crawl<br>Common Crawl maintains one of the largest publicly available web archives. It contains billions of pages and is used by researchers and developers. It is a key data source for training large language models.<br>Selection of Articles<br>We need a representative sample of English-language articles on the web. While Common Crawl does not crawl every page, its archive is the best free and publicly available proxy for the web. We want to measure the proportion of all articles being published that are primarily AI-generated, so we do not filter by traffic or use a curated subset. We randomly select 55.4k URLs from Common Crawl and confirm that each page is in English, has Article schema markup, is at least 100 words long, has a publish date between January 2020 and March 2026, and is an article or listicle as classified by the Graphite page type classifier.<br>AI Detection<br>We classify each article using three AI detectors: Pangram, Copyleaks, and GPTZero. The AI detectors produce different outputs. Below, we describe the output of each detector and how we transform it into a binary classification (primarily AI or primarily human).<br>Pangram and Copyleaks provide the proportion of the article’s content that is AI-generated.<br>Pangram<br>Output: Proportion of the article that is Human, AI-assisted, AI<br>Classify as primarily AI if: proportion AI + proportion AI-assisted > proportion Human<br>Copyleaks<br>Output: Proportion of the article that is Human, AI<br>Classify as primarily AI if: proportion AI > proportion Human<br>In contrast, GPTZero provides an article-level prediction. (Its Advanced Sentence Scanning output includes sentences that most impact the classification, but it does not directly provide the proportion of AI-generated content. We prefer to use its article-level output rather than devising our own method for computing the proportions.)<br>GPTZero<br>Output: Prediction (Human, Mixed, AI) and confidence score<br>Classify as primarily AI if: prediction is AI or...
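The Pangram and Copyleaks rules stated above can be sketched as two small predicates. This is a minimal illustration, not either vendor's API: the container classes and field names are hypothetical, and the proportions are assumed to be fractions between 0 and 1. The GPTZero rule is cut off in the text above, so it is deliberately left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PangramResult:
    """Hypothetical container for Pangram's three proportions."""
    human: float
    ai_assisted: float
    ai: float


@dataclass(frozen=True)
class CopyleaksResult:
    """Hypothetical container for Copyleaks' two proportions."""
    human: float
    ai: float


def pangram_is_primarily_ai(result: PangramResult) -> bool:
    # Stated rule: proportion AI + proportion AI-assisted > proportion Human.
    return result.ai + result.ai_assisted > result.human


def copyleaks_is_primarily_ai(result: CopyleaksResult) -> bool:
    # Stated rule: proportion AI > proportion Human.
    return result.ai > result.human
```

Both rules use a strict inequality, so an exact tie between the AI and Human proportions is classified as primarily human.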

