Why Reddit Blocked Unauthenticated JSON in 2026 (and How to Still Get Reddit Data) | by TonyWang | Jun, 2026 | Medium
Why Reddit Blocked Unauthenticated JSON in 2026 (and How to Still Get Reddit Data)
TonyWang
6 min read · 9 hours ago
Key takeaways
On May 28, 2026, Reddit announced it is deprecating unauthenticated .json endpoints. Within days, appending .json to a URL started returning 403, silently breaking most open-source Reddit scrapers.
The real driver is AI and money: Reddit’s two decades of human conversation became a licensed AI-training asset (~$130M in 2024 from deals with Google and OpenAI), and free scraping undercut it, so Reddit is gating the data and suing those who take it without paying.
Reddit’s stated reason is scraping “without accountability,” bot and agentic abuse, and a clarified Rule 8; it is steering developers to authenticated access and Devvit, and has flagged RSS as the next surface to close.
You can still get public Reddit data compliantly through the official (paid) API, authenticated access, or a managed API that keeps the access path working and returns normalized JSON, but the free append-.json era is over.
For years, the simplest way to get structured data out of Reddit was a trick everyone knew: append .json to any Reddit URL and get clean JSON back, with no API key, no OAuth, and no account. It quietly powered most open-source Reddit scrapers, research scripts, bots, and data pipelines.
That door is now closed. On May 28, 2026, Reddit posted “Protecting communities from scrapers and platform abuse” to r/modnews, announcing it would shut down unauthenticated .json access. Within days, requests started coming back 403 Forbidden, with no deprecation window. If your scraper “still runs” but returns nothing, this is why.
This post explains why Reddit did it (the answer is mostly AI and money) and the compliant ways to still get Reddit data in 2026.
What actually broke
In Reddit’s own words: “Deprecating unauthenticated JSON access: We’ll also be shutting down unauthenticated .json endpoints. These endpoints can be used to scrape Reddit without accountability.
Logged-in and authenticated access won’t be impacted.”
So:
Anonymous .json requests now return 403. https://www.reddit.com/r//top.json and friends no longer return data without authentication.
It fails silently in a lot of tools. Many scrapers get a 403 (or an empty/redirect response) but appear to “succeed,” so pipelines quietly go dark instead of erroring loudly.
Authenticated access still works. Logged-in sessions and the official OAuth API are unaffected; that is the entire point of the change.
RSS is next. In the same post Reddit called RSS “another common surface for scraping,” so feed-based access is on notice too.
Why Reddit did it
The technical change is small. The motivation behind it is the bigger story, and yes, it is largely about AI chatbots and bot traffic.
Reddit’s data became an AI goldmine and a product
Reddit is two decades of real human questions, answers, and opinions: exactly the text that makes large language models useful, and one of the most-cited sources in AI answers. Once that became obvious, Reddit turned its archive into a licensed product:
A ~$60M/year licensing deal with Google (February 2024) to train Gemini on Reddit data.
A licensing deal with OpenAI (May 2024) for ChatGPT.
~$130M in data-licensing revenue in 2024, roughly 10% of Reddit’s total revenue.
When the data is the product, the free append-.json endpoint is a leak: it let anyone, especially AI companies, take the same data for nothing, undercutting the paid deals.
AI bots were taking it for free, “without accountability”
This is the part most people’s instinct gets right. The explosion of AI training crawlers and live “grounding” agents (assistants that fetch Reddit threads at answer time) created enormous automated traffic against the exact endpoints that required no identity.
Reddit’s framing names it directly: “large-scale scraping, spam networks, agentic account creation, and automated abuse.” The unauthenticated .json route was the anonymous front door for all of it: data taken with no key to rate-limit, bill, or ban.
So Reddit started enforcing, in court
Killing .json is the technical half of a broader campaign:
Reddit sued Anthropic (June 2025), alleging Anthropic’s bots crawled Reddit 100,000+ times and bypassed robots.txt after declining to license.
Reddit then sued Perplexity and three scraping firms: SerpApi, Oxylabs, and AWM Proxy (October 2025).
Reddit blocked the Internet Archive’s Wayback Machine (August 2025) over AI-scraping concerns.
Cutting off anonymous .json is how you enforce “license it or don’t take it” at the protocol level.
It’s part of the bigger “closing web”
Reddit is the highest-profile example of a wider shift: as AI made web data commercially valuable,...