ContextWall – Context firewall for AI agents and RAG pipelines


ContextWall: Context Firewall for AI Agents

Free early access · Apache 2.0 open source<br>Your AI agent reads untrusted content.<br>Every web result, document, and API response your agent retrieves goes straight into the model's context window - unscreened. ContextWall intercepts it first, blocks prompt injection and credential leaks, and enforces your security policy before the LLM ever sees it.<br>✕ Prompt injection · ✕ Poisoned RAG · ✕ Credential leaks · ✕ PII exfiltration

No code changes to your agents · Runs in your infrastructure · No LLM in the screening path

Real production incidents, not theoretical threats<br>Your agent trusts everything it reads<br>LLMs have no built-in concept of source trust. Content retrieved from a web search and content from your system prompt look identical once they are both inside the context window. Attackers exploit this directly.

CVE-2025-32711<br>EchoLeak<br>Microsoft 365 Copilot

9.3 Critical<br>An attacker sends a crafted email. Copilot reads it, interprets embedded instructions as commands, silently accesses internal SharePoint files, and sends them to the attacker. The user never clicks anything.<br>WHY IT WORKED<br>Copilot had no way to distinguish a trusted system instruction from untrusted email content. Both looked the same inside the context window.

USENIX Security 2025<br>PoisonedRAG<br>RAG pipelines

90%+ manipulation rate<br>Researchers planted five adversarial documents into a knowledge base of millions. When users asked questions, the model retrieved and repeated the false content as confident fact, with no jailbreak, no system prompt change, and no model access needed.<br>WHY IT WORKED<br>The RAG pipeline retrieved documents by relevance score and passed them straight to the model. There was no check on where the document came from or whether it should be trusted.

Both attacks exploited the same gap: no trust boundary at the context layer. ContextWall fixes this by tagging every context source with a trust tier and applying your policy rules before content reaches the model.
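To make the idea of a trust boundary concrete, here is a minimal Python sketch of tagging each context source with a trust tier and applying a policy before admission. It is an illustration only, not ContextWall's actual API: the tier names, the `SOURCE_TIERS` prefixes, `SCREENED_TIERS`, and the `tag` and `enforce` helpers are all assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable


class TrustTier(Enum):
    """Hypothetical trust tiers, from most to least trusted."""
    SYSTEM = "system"
    INTERNAL = "internal"
    UNTRUSTED = "untrusted"


# Hypothetical mapping from source prefix to trust tier.
SOURCE_TIERS = {
    "system:": TrustTier.SYSTEM,
    "sharepoint:": TrustTier.INTERNAL,
    "web:": TrustTier.UNTRUSTED,
    "email:": TrustTier.UNTRUSTED,
}

# Hypothetical policy: tiers whose content must pass screening.
SCREENED_TIERS = {TrustTier.INTERNAL, TrustTier.UNTRUSTED}


@dataclass(frozen=True)
class ContextItem:
    source: str
    text: str
    tier: TrustTier


def tag(source: str, text: str) -> ContextItem:
    """Attach a tier to a piece of context; unknown sources are untrusted."""
    tier = next(
        (t for prefix, t in SOURCE_TIERS.items() if source.startswith(prefix)),
        TrustTier.UNTRUSTED,
    )
    return ContextItem(source, text, tier)


def enforce(
    items: Iterable[ContextItem], is_clean: Callable[[str], bool]
) -> list[ContextItem]:
    """Return only the items the policy admits into the prompt."""
    admitted = []
    for item in items:
        if item.tier in SCREENED_TIERS and not is_clean(item.text):
            continue  # blocked before it can reach the model
        admitted.append(item)
    return admitted
```

The key design choice in this sketch is the default: a source that matches no known prefix falls into the untrusted tier, so new content sources are screened until someone explicitly trusts them.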

Who it's for<br>ContextWall is built for teams shipping AI into production who need security guarantees - not just guidelines.

AI & Agent Engineers

You're shipping RAG pipelines and agentic systems that pull from the web, internal docs, and third-party APIs. Every retrieved document is a potential attack vector - and your agent has no way to tell a legitimate source from a poisoned one.

How ContextWall helps<br>One pip install or Docker image - no changes to your agent code<br>Screens every document before it enters the prompt<br>Blocks injections and credential leaks before the LLM sees them

Security Teams

AI systems bypass your existing perimeter controls. Agents make outbound calls, ingest untrusted content, and operate with broad permissions - all outside your traditional detection stack.

How ContextWall helps<br>Enforceable policy rules per source, team, and repo<br>Real-time enforcement feed and tamper-evident audit log<br>Fleet-wide visibility across all deployed agents

Compliance & Legal

HIPAA, SOC 2, and FedRAMP auditors are asking how you ensure PHI cannot leak through an AI agent's context window. You need evidence - not assurances.

How ContextWall helps<br>Every enforcement decision mapped to a compliance control ID<br>Cryptographically signed audit exports on demand<br>Documented data residency: context never leaves your infrastructure

What ContextWall stops - and what it doesn't<br>Detection at the context layer. No LLM in the screening path. We're honest about the scope.

Detected & blocked

Direct instruction override (L1 + L2)<br>"IGNORE ALL PREVIOUS INSTRUCTIONS…"

Bidi & zero-width obfuscation (L1)<br>RTL override chars hidden in retrieved text

Spaced-letter injection (L1)<br>"i g n o r e p r e v i o u s"

Semantic paraphrase injection (L3)<br>"Your assignment has been superseded…"

Credential leakage (L2)<br>AWS keys, GitHub PATs, bearer tokens

PII exfiltration via context (L2)<br>Emails, SSNs in untrusted-tier documents

L1 = Structural scan · L2 = Normalized regex · L3 = Heuristic scoring
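As a rough illustration of what the first two layers could look like, the Python sketch below pairs a structural scan (L1) with a regex pass over normalized text (L2). This is a guess at the general technique, not ContextWall's implementation: the function names, the specific patterns, and the finding labels are all assumptions, and the credential patterns cover only common AWS access key ID and classic GitHub PAT shapes.

```python
import re
import unicodedata

# L1: bidi controls (U+202A-202E, U+2066-2069) and zero-width characters.
INVISIBLE = re.compile("[\u202a-\u202e\u2066-\u2069\u200b-\u200f\u2060\ufeff]")
# L1: five or more single letters separated by single spaces ("i g n o r e").
SPACED = re.compile(r"\b(?:[A-Za-z] ){4,}[A-Za-z]\b")

# L2: patterns applied after normalization (illustrative, not exhaustive).
OVERRIDE = re.compile(
    r"ignore\s*(?:all\s*)?previous\s*instructions", re.IGNORECASE
)
CREDENTIALS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{20,}=*"),
}


def structural_scan(text: str) -> list[str]:
    """L1: flag obfuscation in the raw text, before any normalization."""
    findings = []
    if INVISIBLE.search(text):
        findings.append("invisible_or_bidi_chars")
    if SPACED.search(text):
        findings.append("spaced_letters")
    return findings


def normalize(text: str) -> str:
    """Undo common obfuscation so the regex layer sees plain text."""
    text = unicodedata.normalize("NFKC", text)
    text = INVISIBLE.sub("", text)
    # Collapse spaced-letter runs: "i g n o r e" becomes "ignore".
    return SPACED.sub(lambda m: m.group(0).replace(" ", ""), text)


def regex_scan(text: str) -> list[str]:
    """L2: match credential and override patterns on normalized text."""
    clean = normalize(text)
    findings = [name for name, rx in CREDENTIALS.items() if rx.search(clean)]
    if OVERRIDE.search(clean):
        findings.append("instruction_override")
    return findings
```

Running the regex layer on normalized text is what lets a plain pattern catch an override phrase that was hidden behind zero-width characters or letter spacing. A naive spaced-letter rule like this one will also flag harmless text such as spelled-out acronyms, so a real deployment would need tuning.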

Out of scope

Model hallucinations<br>ContextWall filters what enters the context window - it cannot control what the model generates from clean inputs.

System prompt mistakes<br>If your system prompt grants excessive permissions, ContextWall cannot override that design decision.

Training-time poisoning<br>Attacks on model weights or fine-tuning data happen before inference. ContextWall operates at inference time only.

Novel zero-day patterns<br>L3 heuristics catch known semantic paraphrases. A sufficiently novel attack may score below the block threshold - you set that threshold.

Authorized access you've allowed<br>If your policy permits a source and the model uses that data, ContextWall enforces your policy - not a stricter one.

Honest scope beats false assurances. Defense in depth means ContextWall works alongside your model provider's safety filters, not instead of them.
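The configurable block threshold mentioned under novel zero-day patterns can be pictured with a small Python sketch of heuristic scoring. Everything here is hypothetical: the cue patterns, their weights, and the `score` and `decide` helpers are invented for illustration and say nothing about how ContextWall's L3 layer actually scores content.

```python
import re

# Hypothetical cue patterns with weights; a real rule set would be far broader.
CUES = [
    (
        re.compile(
            r"\b(?:assignment|task|instructions?|role)\b.*"
            r"\b(?:superseded|replaced|changed|updated)\b",
            re.IGNORECASE,
        ),
        0.6,
    ),
    (
        re.compile(
            r"\b(?:from now on|instead|new (?:task|goal|objective))\b",
            re.IGNORECASE,
        ),
        0.3,
    ),
    (
        re.compile(r"\b(?:do not|don't|never) (?:tell|mention|reveal)\b", re.IGNORECASE),
        0.3,
    ),
]


def score(text: str) -> float:
    """Sum the weights of all matching cues, capped at 1.0."""
    return min(1.0, sum(weight for rx, weight in CUES if rx.search(text)))


def decide(text: str, block_threshold: float = 0.5) -> str:
    """Block when the heuristic score reaches the operator-set threshold."""
    return "block" if score(text) >= block_threshold else "allow"
```

With these made-up weights, "Your assignment has been superseded." is blocked at the default threshold, while "From now on, reply in French." is allowed at 0.5 and blocked only if the threshold is lowered to 0.25. That trade-off is why a novel phrasing that matches few cues can slip under the threshold.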

How it works<br>ContextWall intercepts every document...
