Agent-memory systems admit poisoned facts – a reproducible benchmark

CLAI Benchmarks

Memory that rejects what it shouldn't learn, and derives what was never stored.

Most agent memory just stores and retrieves. CLAI vets what enters (governance) and composes relationships that exist in no single document (derivation): two things retrieval-first memory can't do at the write path.

The gap, in one line: a store-everything memory admits 7 / 7 poisoned facts; CLAI admits 0 / 7. On the hard multi-hop questions, an exact-match knowledge graph dead-ends (0 / 3), while CLAI derives the answer (3 / 3). Both results are reproducible below, and the baseline side runs live.

This repository holds two reproducible head-to-head benchmarks. The baseline side of each runs locally with no dependencies, so you can see the gap yourself. The CLAI engine is proprietary; these demos call it as a black box, and the recorded CLAI results (JSON + tables) are committed so the comparison is complete even without engine access.

Engine / early access — waitlist: https://clai-three.vercel.app

Result 1 — Governance: retrieval ≠ governance

Feed the same 26-fact knowledge base (7 poisoned), in the same order, to a generic store-everything memory and to CLAI's governed admission.

The store-everything memory admits 7 / 7 poisoned facts and keeps no audit trail.

CLAI keeps 0 / 7 poison in memory, with 0 / 7 clean-fact over-rejection and a per-fact reason.

Downstream: CLAI returns the verified value on 5 / 5 probes — the poison was never stored.

→ Full methodology + honest notes
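To make the contrast concrete, here is a minimal sketch of a store-everything write path next to a governed one. It is not CLAI's admission logic (the engine is proprietary); the `VERIFIED` table, the class names, and the four-item stream are all hypothetical, and the only rule shown is "reject a write that contradicts a trusted value and log why".

```python
from dataclasses import dataclass, field

# Hypothetical trusted reference values (an assumption for this sketch);
# a real system needs its own source of truth.
VERIFIED = {"capital_of_france": "Paris", "boiling_point_water_c": "100"}


@dataclass
class StoreEverythingMemory:
    facts: dict = field(default_factory=dict)

    def write(self, key, value):
        # No checks: the last write wins and nothing is logged.
        self.facts[key] = value


@dataclass
class GovernedMemory:
    facts: dict = field(default_factory=dict)
    audit: list = field(default_factory=list)

    def write(self, key, value):
        expected = VERIFIED.get(key)
        if expected is not None and value != expected:
            # Rejected writes never reach self.facts, but the reason is kept.
            self.audit.append((key, value, "rejected: conflicts with verified value"))
            return False
        self.facts[key] = value
        self.audit.append((key, value, "admitted"))
        return True


# Hypothetical input stream: two clean facts, each followed by a poisoned overwrite.
stream = [
    ("capital_of_france", "Paris"),
    ("capital_of_france", "Lyon"),       # poisoned
    ("boiling_point_water_c", "100"),
    ("boiling_point_water_c", "50"),     # poisoned
]

naive, governed = StoreEverythingMemory(), GovernedMemory()
for key, value in stream:
    naive.write(key, value)
    governed.write(key, value)

print("store-everything:", naive.facts)
print("governed:        ", governed.facts)
for entry in governed.audit:
    print("audit:", entry)
```

The store-everything memory ends up holding both poisoned values because each overwrite silently replaces the clean one, while the governed memory keeps the clean values and records one audit entry per write.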

Result 2 — Derivation: derived, not extracted

Six 2-hop chains (person → company → city). On the hard chains the linking entity is mentioned two ways — e.g. "Orion Biotech" in one sentence, "Orion Biotechnology Incorporated" in the next.

An exact-match knowledge graph keys those as two nodes, so the path splits and the multi-hop query dead-ends (hard 0 / 3). It still multi-hops fine on the controls (3 / 3).

CLAI resolves the variants to one entity and composes the answer across the gap (hard 3 / 3, controls 3 / 3). Every answer is a real 2-hop derivation: the direct person → city edge is never stored.

→ Full methodology + honest notes
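The dead-end is easy to reproduce in a few lines. The sketch below builds a graph from two triples in which the company appears under both surface forms quoted above, then runs the same 2-hop query with and without a name normalizer. The person, the city, the relation names, and the `normalize` rule are hypothetical; the normalizer is a deliberately naive stand-in and is not how CLAI resolves entities.

```python
import re

# Hypothetical triples; only the two company spellings come from the text above.
triples = [
    ("Ada Reyes", "works_at", "Orion Biotech"),
    ("Orion Biotechnology Incorporated", "based_in", "Lisbon"),
]

# Naive, hand-written rules (assumptions for this sketch only).
SUFFIXES = {"incorporated", "inc", "corp", "corporation", "ltd", "llc"}
ABBREVIATIONS = {"biotechnology": "biotech"}


def identity(name):
    return name


def normalize(name):
    # Lowercase, drop legal suffixes, and collapse known long forms.
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens if t not in SUFFIXES)


def build_graph(triples, canon=identity):
    # Nodes are keyed by canon(name); with identity this is exact-match keying.
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(canon(subj), {})[rel] = canon(obj)
    return graph


def two_hop(graph, person, canon=identity):
    # person -> company -> city; returns None if either hop is missing.
    company = graph.get(canon(person), {}).get("works_at")
    return graph.get(company, {}).get("based_in")


exact = build_graph(triples)
resolved = build_graph(triples, canon=normalize)

print("exact-match:", two_hop(exact, "Ada Reyes"))
print("normalized: ", two_hop(resolved, "Ada Reyes", canon=normalize))
```

With exact-match keys the first hop lands on "Orion Biotech", which has no outgoing edge, so the query returns `None`. With normalized keys both spellings collapse to one node and the query reaches the city (returned in its normalized, lowercase form). No person → city edge is stored in either graph.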

Run it yourself

The baselines are pure Python (3.9+ standard library), no install needed:

git clone https://github.com/arsenis-cmd/clai-benchmarks && cd clai-benchmarks

python3 governance/run_governance.py   # store-everything admits 7/7 poison, live
python3 derivation/run_derivation.py   # exact-match graph dead-ends on hard chains, live

Each script runs the baseline live and prints the recorded CLAI column next to it. The CLAI side is a black-box call into clai_engine; since the engine isn't in this public repo, it prints a clear "request access" message and points at the committed results in each results/ folder.
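The "engine if available, recorded results otherwise" behaviour can be pictured with a small optional-import pattern. This is a sketch of the general idea only: the helper names, the message text, and the shape of the `recorded` dictionary are assumptions, and the actual scripts may detect the missing engine differently.

```python
import importlib


def load_engine(module_name="clai_engine"):
    # Return the imported module, or None when it is not installed.
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def clai_column(recorded, module_name="clai_engine"):
    # `recorded` stands in for results loaded from a committed results/ file
    # (hypothetical shape). Without the engine, fall back to those numbers.
    engine = load_engine(module_name)
    if engine is None:
        print("CLAI engine not available: request access; showing recorded results.")
        return {"source": "recorded", **recorded}
    # A live call into the engine would go here; its API is not public,
    # so this sketch does not guess at it.
    return {"source": "live-engine-present", **recorded}


# Hypothetical recorded row, using the poison count reported in this README.
row = clai_column({"poison_admitted": 0, "poison_total": 7},
                  module_name="module_that_does_not_exist_xyz")
print(row)
```

Keeping the fallback in one helper means the baseline column is always computed live, while the CLAI column degrades to the committed numbers instead of failing.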

Regenerate the result images (optional — they're already committed):

pip install -r requirements.txt        # matplotlib, only for rendering
python3 governance/make_artifacts.py
python3 derivation/make_artifacts.py

Honest scope (the things a sharp reader will ask)

n is small and illustrative — governance n = 26, derivation n = 6 chains. These are clean demonstrations of a mechanism, not leaderboard benchmarks. Larger automated benchmarks are the obvious follow-up.

Governance is architectural / LLM-independent — the gap is about having a governed write path at all, not about which model sits behind it.

The derivation gap is specifically the...
