AIM-Nelson/cross-model-context-inheritance: Public disclosure of a universal jailbreak in Anthropic's Claude language models. Technical paper, responsible disclosure record, and evidence.
Folders and files
evidence
.gitignore
LICENSE
README.md
disclosure_report.md
disclosure_report.pdf
paper_style.css
paper_v4.0.md
paper_v4.0.pdf
Cross-Model Context Inheritance — Public Disclosure
This repository contains the public disclosure of a vulnerability in Anthropic's Claude language models that permits the unsolicited generation of prohibited content, including child sexual abuse material (CSAM) and chemical, biological, radiological, and nuclear (CBRN) content.
The vulnerability was reported to Anthropic on February 17, 2026. Across the ninety-four days that followed, fourteen distinct communication channels were used: six on the Anthropic side and eight U.S. regulatory and oversight bodies. The only human-authored communications received from Anthropic were two templated messages from a single Anthropic Safeguards handler, sent sixteen minutes apart on February 27, 2026, approximately four business days beyond the company's own published commitment of three business days. The first classified the vulnerability report as "feedback and thoughtful suggestions." The second forwarded the matter to an unnamed security team. No substantive follow-up was received thereafter.
Of the four Anthropic channels used in the first week, one produced no response of any kind, automated or human, across the entire disclosure period: modelbugbounty@anthropic.com, the channel to which Anthropic's own automated routing directs model safety issues. No patch was deployed. The reporting account was never restricted, despite repeated documented generation of content that Anthropic's published policies prohibit. Anthropic released Claude Opus 4.7 on April 16, 2026, fifty-eight days into the disclosure window, with the vulnerability intact; the vulnerability was re-reported three days after release and received no response.
This repository releases the complete documentary record.
Documents
paper_v4.0.md — Technical paper documenting the vulnerability: architecture, methodology, findings across all tested models (Claude 4.5 Haiku, Opus 4.6, Sonnet 4.6, Opus 4.7) and configurations, root cause analysis, and proposed mitigations.
disclosure_report.md — Communication record: complete timeline of the responsible disclosure, full text of the two responses received from Anthropic across the ninety-four-day period, cross-reference with Anthropic's published policies (Responsible Disclosure Policy, Coordinated Vulnerability Disclosure Policy, Child Safety Commitments including signatory status with Thorn's Safety by Design for Generative AI principles), and open questions left to the reader.
Evidence
Structural evidence of the documented communication is available in the evidence/ folder. Distribution of evidence in this repository is limited to documentation of the communication record itself: emails sent to Anthropic, platform responses received, regulatory submission confirmations, and the HackerOne dismissal of both reports.
Verification of the technical findings documented in the paper is reserved for authorized institutional reviewers (regulatory bodies and the vendor) under separate protocol. No model-generated prohibited content is transmitted, distributed, or stored beyond the local testing environment, and live demonstrations of the vulnerability are not conducted for media or general inquiry. The technical claims in the paper are sufficient for evaluation by independent security...