Semantic Manifest – An open specification for AI crawler ingestion

GitHub repository: CKL75/semantic-manifest-specification

The Semantic Manifest Specification

An open data standard and streamable content graph specification (NDJSON) designed to optimize website discovery, content relations, and text ingestion for AI crawlers, LLMs, and RAG engines.

The Problem

Traditional web standards break down when AI search engines have to ingest large sites.

Sitemaps pass raw URLs without structural context.

JSON-LD is scoped to a single page, so it cannot describe relationships across a site.

llms.txt files are flat text and consume a large share of the context window once a site scales to thousands of pages.

The Solution

The Semantic Manifest bridges this gap. It uses a streamable NDJSON format (one JSON object per line), so AI crawlers can parse an entire site's content types, relational entities, and explicitly designated "markdown twins" efficiently, line by line.
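As a rough illustration of what line-by-line parsing means for a consumer, here is a minimal Python sketch. The field names (`url`, `type`, `markdown_twin`, `related`) and the sample records are hypothetical placeholders chosen for this example; they are not taken from the specification, which should be consulted for the actual schema.

```python
import io
import json
from typing import Iterable, Iterator

# Hypothetical sample records. The field names ("url", "type",
# "markdown_twin", "related") are illustrative assumptions, not the
# schema defined by the specification.
SAMPLE_MANIFEST = """\
{"url": "/schools/a", "type": "profile", "markdown_twin": "/schools/a.md", "related": ["/districts/1"]}
{"url": "/districts/1", "type": "listing", "markdown_twin": "/districts/1.md", "related": []}

{"url": "/about", "type": "page"}
"""


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-blank line (NDJSON)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def summarize(lines: Iterable[str]) -> dict:
    """Count records per content type and collect markdown twins,
    holding only one record in memory at a time."""
    counts: dict = {}
    twins: list = []
    for record in iter_records(lines):
        kind = record.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
        if "markdown_twin" in record:
            twins.append(record["markdown_twin"])
    return {"counts": counts, "twins": twins}


summary = summarize(io.StringIO(SAMPLE_MANIFEST))
print(summary)
```

Because `summarize` accepts any iterable of lines, the same function could in principle be fed an open file or a streamed HTTP response body instead of the in-memory sample, so a large manifest never has to be loaded whole.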

System Performance Demo

Within hours of launching a ~58,000-page site with a Semantic Manifest linked in the <head>, ClaudeBot ingested the entire site at ~7 URLs per second.
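To put those two figures in proportion, a quick back-of-the-envelope calculation follows. Both inputs are the approximate values quoted above, and the sketch assumes a constant crawl rate, which a real crawl is unlikely to sustain exactly.

```python
# Approximate figures quoted in the README; a constant rate is an assumption.
pages = 58_000
urls_per_second = 7

seconds = pages / urls_per_second
hours = seconds / 3600

print(f"{seconds:.0f} s ≈ {hours:.1f} h")  # prints "8286 s ≈ 2.3 h"
```

At that rate a full pass over the site takes a little over two hours.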

📺 Watch the System Performance & Crawl Log Demo Video Here

Core Documentation

📖 Read the Full Specification (v0.1)

📄 View the Reference JSONL

Reference Implementations

This standard is built into the High-Velocity Content Engine (HVCE).

Live production manifests can be viewed at:

EduStats (~58,000 pages): edustats.app/semantic-manifest.jsonl

Hypersonic SEO Framework: hypersonicseo.com/semantic-manifest.jsonl

License & Authorship

Semantic Manifest Specification (v0.1) © 2026 by Chris Limner / Hypersonic SEO.
This work is marked with CC0 1.0 Universal.
