A zero-copy C/Python DFA engine that scrubs logs at 571 MB/s

LuciferLK1 pts0 comments

GitHub - thedevilhimselfcodes/guardog: A zero-copy C/Python DFA engine that scrubs logs at 571 MB/s · GitHub

/" data-turbo-transient="true" />

Skip to content

Search or jump to...

Search code, repositories, users, issues, pull requests...

-->

Search

Clear

Search syntax tips

Provide feedback

--><br>We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Cancel

Submit feedback

Saved searches

Use saved searches to filter your results more quickly

-->

Name

Query

To see all available qualifiers, see our documentation.

Cancel

Create saved search

Sign in

/;ref_cta:Sign up;ref_loc:header logged out"}"<br>Sign up

Appearance settings

Resetting focus

You signed in with another tab or window. Reload to refresh your session.<br>You signed out in another tab or window. Reload to refresh your session.<br>You switched accounts on another tab or window. Reload to refresh your session.

Dismiss alert

{{ message }}

thedevilhimselfcodes

guardog

Public

Notifications<br>You must be signed in to change notification settings

Fork

Star

main

BranchesTags

Go to file

CodeOpen more actions menu

Folders and files<br>NameNameLast commit message<br>Last commit date<br>Latest commit

History<br>5 Commits<br>5 Commits

README.md

README.md

View all files

Repository files navigation

CyBurn Digital: Guardog Enterprise Firewall

Technical Evaluation & Asset Manifest

NOTICE OF ASSET SALE > The intellectual property, complete source code, and exclusive commercial rights to the Guardog Enterprise Engine are currently available for outright acquisition. This document serves as the technical evaluation manifest for prospective buyers. See Part 5: Acquisition & IP Transfer for terms.

Executive Summary

Guardog is a hyperscale, multi-threaded Deterministic Finite Automaton (DFA) text sanitization engine. Built as a native C-extension for Python, it is designed to intercept, scan, and scrub high-volume payloads (server logs, LLM prompt streams, API ingestion) for sensitive data in real-time.

Clocking at 571+ MB/s , Guardog operates at 14.4x the speed of standard sequential Python re implementations by bypassing the Global Interpreter Lock (GIL) and utilizing a Zero-Copy memory architecture. It is designed to immediately reduce CPU compute overhead and API latency in high-throughput environments.

Part 1: Architecture & Superiority (The Benchmarks)

Standard regex pipelines fail at hyperscale due to three bottlenecks: CPU thread locking, memory duplication, and backtracking latency. Guardog eliminates all three.

1. Zero-Copy Memory Mutation

Standard Python string manipulation forces the server to create complete copies of the payload in RAM. If a server ingests a 50MB log, standard engines spike memory usage by an additional 50MB to process it.<br>The Guardog Advantage: Guardog utilizes Python's Read-Write Buffer Protocol (w*). It passes a direct pointer to the bytearray into the C-engine. The C-engine reads and mutates the data in-place. RAM overhead during scanning is strictly 0 bytes , preventing Out-Of-Memory (OOM) crashes in containerized environments.

2. Lock-Free Map-Reduce (GIL Bypass)

Python’s GIL prevents true multi-core processing.<br>The Guardog Advantage: Guardog drops into raw C, releases the GIL, and deploys OpenMP multi-threading. It slices the payload across available CPU cores. C-threads map secrets into private memory structs (Map phase) without ever halting to acquire a lock. The GIL is only reacquired once at the very end to dump the structs into a Python list (Reduce phase).

3. O(1) Linear Time Execution

Standard regex uses Non-Deterministic Finite Automata (NFA), which suffers from "catastrophic backtracking."<br>The Guardog Advantage: Guardog pre-compiles all rules into an Aho-Corasick-style DFA matrix. The engine processes exactly one byte per clock cycle, regardless of how many rules are loaded. Whether scanning for 3 secrets or 3,000 secrets, the execution latency remains completely flat.

Part 2: Engineering Capabilities & Limitations

To maintain O(1) latency and Zero-Copy memory efficiency, specific computer science trade-offs were made. Engineering teams must evaluate the following structural profile:

What It CAN Do:

Cross-Boundary Secret Detection: OpenMP thread chunks feature a MAX_LOOKAHEAD overlap window. If a 20-character API key is perfectly sliced in half by two different CPU cores, the engine still successfully detects and redact it.

Overlapping Secret Resolution (Two-Pass Mutation): Guardog maps all coordinates in Pass 1, and mutates in Pass 2, guaranteeing 100% detection of adjacent or overlapping secrets (prevents the "Amnesia Overwrite" bug).

Non-Destructive Pipeline Integrity: Guardog operates on the raw byte layer. It does not force destructive Unicode normalization. Legitimate JSON structures, foreign languages, and mathematical symbols are passed through flawlessly.

Cryptographic Tamper Resistance: The compiled .matrix file is...

guardog python engine memory zero copy

Related Articles