Creating a Full PII Framework for Agents

PII Firewall - Privacy-first LLM applications


Open Source · Apache 2.0

Privacy firewall for LLM apps

Intercept and anonymize PII before it reaches OpenAI, Anthropic, or any LLM, then rehydrate it in the response. Domain-aware, 55+ languages, 3 lines of code.


pii_demo.py

from privacy_firewall import create_firewall

# Domain-aware: keeps diagnoses, strips PII
firewall = create_firewall("healthcare")

result = firewall.process(
    text="Patient John Doe, SSN 123-45-6789, diagnosed with hypertension.",
    context={...},
)

# -> "Patient [PERSON_001], [REDACTED], diagnosed with hypertension."
# Medical terms preserved. PII stripped. Clinical context preserved.

How it works

Detect · Anonymize · Rehydrate

A transparent privacy layer between your app and any LLM. Zero changes to your existing prompt logic.

01 Input
"Patient Ana Garcia, DNI 12345678A, diagnosed with hypertension."
Raw text containing PII arrives from the user or an upstream service.

02 Detect
PERSON -- Ana Garcia
NATIONAL_ID -- 12345678A
DIAGNOSIS -- hypertension (keep)
One or more backends (regex, Presidio, GLiNER, Transformers) detect entities. Domain rules decide what to keep.

03 Anonymize
"Patient [PERSON_001], [REDACTED], diagnosed with hypertension."
Entities are replaced per their disposition: keep, pseudonymize, redact, generalize, mask, or hash. Profile rules decide which action applies per entity type.

04 -> LLM
LLM processes sanitized prompt. Real PII never transmitted.
The sanitized prompt is forwarded to any provider: OpenAI, Anthropic, Mistral, local models. Zero changes to prompt logic.

05 Rehydrate
"Patient Ana Garcia, DNI 12345678A, diagnosed with hypertension."
The vault restores original values in the model's response. End-users see real names; the LLM never did.
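The five steps can be sketched in plain Python to make the data flow concrete. This is a minimal illustration, not the library's implementation: the `Vault` class, the `sanitize` and `guarded_call` helpers, and both regex patterns are hypothetical names invented for this example, and the detection is deliberately naive.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical, deliberately narrow patterns for this example only.
NAME_RE = re.compile(r"(?<=Patient )[A-Z][a-z]+ [A-Z][a-z]+")
DNI_RE = re.compile(r"\bDNI \d{8}[A-Z]\b")


class Vault:
    """Holds the placeholder <-> original mapping for one request."""

    def __init__(self) -> None:
        self._token_for: Dict[Tuple[str, str], str] = {}
        self._value_for: Dict[str, str] = {}
        self._counts: Dict[str, int] = {}

    def pseudonymize(self, kind: str, value: str) -> str:
        # Reuse the same placeholder when the same value appears again.
        key = (kind, value)
        if key not in self._token_for:
            self._counts[kind] = self._counts.get(kind, 0) + 1
            token = f"[{kind}_{self._counts[kind]:03d}]"
            self._token_for[key] = token
            self._value_for[token] = value
        return self._token_for[key]

    def rehydrate(self, text: str) -> str:
        # Swap every known placeholder back to its original value.
        for token, value in self._value_for.items():
            text = text.replace(token, value)
        return text


def sanitize(text: str, vault: Vault) -> str:
    # Redact: the value is dropped and not stored anywhere.
    text = DNI_RE.sub("[REDACTED]", text)
    # Pseudonymize: the value is stored so it can be restored later.
    return NAME_RE.sub(lambda m: vault.pseudonymize("PERSON", m.group()), text)


def guarded_call(text: str, llm: Callable[[str], str]) -> str:
    """Sanitize the prompt, call the model, rehydrate the response."""
    vault = Vault()
    response = llm(sanitize(text, vault))
    return vault.rehydrate(response)
```

Here `llm` is any callable that takes a prompt and returns text, so the wrapper stays independent of the provider. In this sketch only pseudonymized values can be restored, because a bare `[REDACTED]` marker carries no identity to map back from.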


Domain Profiles

Built-in presets for your industry

Each domain profile decides what's sensitive and what the LLM must see to do its job. Fully customizable.

Healthcare · Finance · Legal · Generic · Custom

Healthcare Profile

Keep clinical context. Anonymize patient identifiers and account data.

✓ Keeps (pass-through)

• Diagnoses (hipertensión, diabetes)
• Medications (enalapril, lisinopril)
• Procedures & observations

Transforms

Action        Entity       Example
PSEUDONYMIZE  PERSON       Ana García → [PERSON_001]
REDACT        NATIONAL ID  12345678A → [REDACTED]
GENERALIZE    AGE          43 años → 40-49
GENERALIZE    DATE         15/03/2024 → 2024
REDACT        EMAIL        ana@clinic.es → [REDACTED]
REDACT        IBAN         ES12345678 → [REDACTED]

Live example

Input
"Paciente Ana García, DNI 12345678A, 43 años, hipertensión. Consulta: 15/03/2024. Email: ana@clinic.es. Prescripción: enalapril 10mg."

↓ PII Firewall

Output (sanitized)
"Paciente [PERSON_001], [REDACTED], 40-49, hipertensión. Consulta: 2024. Email: [REDACTED]. Prescripción: enalapril 10mg."

firewall = create_firewall("healthcare")
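The generalize, mask, and hash dispositions are simple value transforms, and a short sketch shows what each one might do. The function names below are hypothetical and not part of the library. The age and date outputs mirror the examples above, while the mask and hash output formats are assumptions, since the page does not show them.

```python
import hashlib


def generalize_age(age: int) -> str:
    """Bucket an exact age into its decade, so 43 becomes "40-49"."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def generalize_date(date_str: str) -> str:
    """Keep only the year of a DD/MM/YYYY date."""
    _day, _month, year = date_str.split("/")
    return year


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters (assumed format)."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def hash_value(value: str, salt: str) -> str:
    """Replace a value with a truncated salted SHA-256 digest (assumed format)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]
```

Generalizing keeps the information a model usually needs (an age band, a year) while discarding the precision that identifies a person. Hashing with a per-deployment salt gives a stable token for joins without exposing the raw value.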

Detection Backends

Mix and match detection engines

Start with a preset, then swap in the engine that fits your data. Each card shows the exact install and firewall call.
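When more than one engine runs over the same text, their detections can overlap, so something has to choose a winner. The sketch below shows one plausible policy (highest score first, then longest span). The `Span` type and `merge_detections` function are hypothetical, and the page does not say how the library actually resolves overlaps.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Span:
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    label: str
    score: float
    source: str   # which backend produced the detection


def merge_detections(spans: Iterable[Span]) -> List[Span]:
    """Keep a non-overlapping subset, preferring higher score, then longer span."""
    ranked = sorted(spans, key=lambda s: (-s.score, -(s.end - s.start)))
    kept: List[Span] = []
    for cand in ranked:
        # Accept the candidate only if it touches no span already kept.
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda s: s.start)
```

A greedy pass like this is easy to reason about and cheap to run. It can miss the globally best combination in contrived cases, which rarely matters for PII spans.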

Regex (base)

• Structured IDs
• Emails & phones
• Credit cards
• Zero ML deps

Best for: Zero-dependency environments or fast structured-data pipelines.

Create firewall

pip install "pii-firewall"
firewall = create_firewall("healthcare", detector_backend="regex")
Customize: add_custom_regex(...)
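To illustrate what pattern-based detection involves, here is a standalone sketch using only the standard library. It does not use the library's `add_custom_regex(...)`, whose signature is not shown on this page. The patterns, the `detect` function, and the entity labels are assumptions for illustration. The Luhn checksum is a standard way to discard digit runs that merely look like card numbers.

```python
import re
from typing import List, Tuple

# Hypothetical patterns; production detectors need broader coverage.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect(text: str) -> List[Tuple[str, str]]:
    """Return (label, matched_text) pairs for emails and valid card numbers."""
    found = [("EMAIL", m.group()) for m in EMAIL_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        if luhn_valid(m.group()):
            found.append(("CREDIT_CARD", m.group()))
    return found
```

Pairing a loose pattern with a checksum keeps the regex readable and cuts false positives on order numbers and similar digit strings.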

Presidio (recommended), 50–200 ms

• Named entities (persons, orgs)
• Multi-language NER
• Best speed/accuracy balance
• Extensible

Best for: General-purpose production workloads with NER requirements.

Create firewall

pip install "pii-firewall[presidio,langdetect]"
firewall = create_firewall("healthcare", detector_backend="presidio")
Customize: custom_recognizers=[...]

GLiNER (zero-shot), 100–400 ms

• Zero-shot NER
• No fine-tuning needed
• Custom entity types on the fly

Best for: Custom entity types without labeled training data.

Create firewall

pip install "pii-firewall[gliner]"
firewall = create_firewall("healthcare", detector_backend="gliner")
Customize: define your own entity labels

sector-specific Transformers 100–500...
