What GPTBot sees before your React app hydrates

What GPTBot sees before your React app hydrates | BotScore

Add extension

Navigate

Add extension

Extension

Install BotScore

Pick your browser. Audits run locally | free to start, no account, no tracking.

Blog

What GPTBot sees before your React app hydrates

17 June 2026

Lighthouse scores the hydrated DOM. Many AI crawlers stop at the first HTML response.

You ship a React app. The page loads, the headings show up, Lighthouse gives you a green SEO score. Looks fine.

Now fetch the same URL the way a lot of bots do — no JavaScript, just the first HTTP response. Often you get a shell: , a couple of script tags, maybe a title. The copy your users actually read never appeared in that response.

That split — raw HTML vs Live DOM — is what we call DOM drift. Your app can look perfect in Chrome and still ship an empty document to anything that stops at step one.

Same URL, two documents: hydrated Live DOM vs raw initial HTML.

The empty #root problem

Most SPAs follow the same sequence: minimal HTML, download JS, mount into #root, user sees content. Steps three and four happen after the wire transfer. GPTBot, ClaudeBot, and plenty of other agents fetch the URL, respect robots.txt, and parse what came back — which is often much closer to that first HTML than to the tree you inspect in DevTools.

Crawler behavior is not uniform. Googlebot renders JavaScript for indexing (with limits and delays). Many AI fetchers do not run your bundle at all, or run far less of it. The practical takeaway for a dev shipping a CSR app: if your only exists after hydration, do not assume every agent saw it.

Typical CSR response:

Acme — AI-native analytics

After hydration:

AI-native analytics for product teams Track agent-readability alongside human UX…

Users never notice. Lighthouse might not either — it scores the page once your framework has run. An audit that compares both surfaces will.

You see this on Vite + React SPAs, Vue CLI apps, client-only Next routes. SSR helps, but a client-only wrapper on one route brings the gap back. Analytics look fine, the SEO category passes, and the marketing site still curls to an empty shell.

When to care: pricing tables, docs headings, anything you'd be embarrassed to paste from curl into a ticket. If the proposition lives in JS, it probably isn't in the initial HTML.

Wrong tool for the job

When “SEO” comes up, teams reach for tools they already have. Each one is useful. None of them diff raw HTML against the hydrated DOM.

Tool What it actually tells you

Lighthouse Performance, human a11y, basic meta — on the hydrated page

Cloud GEO dashboards Whether ChatGPT cited your brand this week

Site crawlers Rankings, links, campaigns across the domain

Lighthouse is great at what it does. Ahrefs and Semrush are great at theirs. BotScore checks something else: llms.txt, AI bot robots rules, landmarks, JSON-LD, and whether the HTML that shipped over the wire matches what Chrome shows after your bundle runs.

If you want to know whether ChatGPT mentioned you yesterday, get a monitoring tool. If you want to fix what GPTBot can read on this tab before you merge, you need a local audit that mirrors crawler constraints.

FAQ — How is BotScore different from Lighthouse?

Side-by-side: raw HTML vs Live DOM

BotScore runs in the browser on the active tab. For DOM drift it fetches initial HTML for the same URL and diffs visible text and landmarks against the Live DOM after hydration.

Try it:

Open a CSR-heavy page in Chrome or Firefox.

Open the side panel.

Toggle Raw HTML and Live DOM .

Look for a missing , an empty , body copy that only shows up hydrated.

Check Machine Readability in the breakdown — DOM drift failures land there.

In the split view, ask four questions:

Is there an in raw HTML, or only after JS?

Does exist in the first response, or is everything inside #root with no semantic shell?

Pick a paragraph users can read — same text in the raw panel?

Primary nav links — real in HTML, or injected client-side?

A page can pass three Lighthouse SEO checks and fail all four. That's the point: not another score, a literal diff between what shipped and what Chrome shows.

Toggle Raw HTML and Live DOM in the side panel.

SEM-DOM-DELTA

We encode this as rule SEM-DOM-DELTA:

Fetch initial HTML for the active URL.

Compare visible text and landmarks (h1, main, key regions) to the hydrated DOM.

Fail at ~40%+ client-only text, or when structural landmarks are missing from initial HTML.

Warn at ~15%+.

Fixes depend on your stack: SSR or SSG for headings and body copy, pre-render critical routes at build time, or move / into the shell even if styling still hydrates client-side. SPAs are fine for humans. Agent-readable HTML is a separate deliverable from “works in Chrome after JS runs.”

Discoverability checks that pair with DOM drift

DOM drift is the loud failure on CSR sites. Agent-readability is wider — BotScore runs 27 rules across five categories. A few that show up...

What GPTBot sees before your React app hydrates

Related Articles

(no title)

Is AI ruining our skills? Early results are in – and they're not good

The Anatomy of an AI-Native Org

ZCode – Harness for GLM-5.2

Apertus – Open Foundation Model for Sovereign AI