Agent Braille – 8-bit state encoding for LLM agents, ~92% fewer tokens than JSON

Tetrahedroned/Agent-Braille: Deterministic 8-bit machine-to-machine protocol for AI agent state. ~92% fewer state-tracking tokens on real Claude Code sessions, a proven single-bit-error-safe command code, fully reproducible.

Agent Braille (AB-1)

A measured agent-state protocol. Every claim here was tested, including the ones that failed; the failures and the fixes are in the open.

AB-1 is a deterministic 8-bit semiotic layer over the Unicode Braille Patterns block (U+2800–U+28FF) for machine-to-machine communication of AI agent state. One code point encodes one machine state across eight orthogonal dimensions of agency. This repository is the reference implementation, the benchmark harness, and the honest evidence ledger behind the paper.

The code is the canonical specification. Where the prose and the code disagree, the code wins.
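As a rough illustration of the one-code-point-per-state idea, the sketch below packs eight boolean flags into a single character of the Braille Patterns block and unpacks it again. The dimension names `d0` to `d7` and the bit order are placeholders invented for this example; the real bit table is defined in `ab1/core.py` and is not reproduced here.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block

# Hypothetical dimension names, for illustration only. The actual
# bit-table convention is whatever ab1/core.py defines.
DIMENSIONS = ("d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7")


def encode(flags: dict) -> str:
    """Pack eight boolean flags into one Braille code point."""
    value = 0
    for bit, name in enumerate(DIMENSIONS):
        if flags.get(name, False):
            value |= 1 << bit
    return chr(BRAILLE_BASE + value)


def decode(cell: str) -> dict:
    """Unpack one Braille code point back into eight boolean flags."""
    value = ord(cell) - BRAILLE_BASE
    if not 0 <= value <= 0xFF:
        raise ValueError("not a Braille Patterns code point")
    return {name: bool(value >> bit & 1) for bit, name in enumerate(DIMENSIONS)}


if __name__ == "__main__":
    cell = encode({"d0": True, "d3": True})
    print(cell, hex(ord(cell)), decode(cell))
```

Because the block holds exactly 256 code points, every 8-bit value has a cell and the mapping round-trips by construction; which bit means what is the part only the canonical code can settle.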

What survives measurement (and what didn't)

This project's credibility is the arc, not a list of wins:

Atomicity — claimed, falsified, fixed with receipts. The premise that Braille cells are atomic single tokens is false on every stock production tokenizer (~3 tokens/cell on cl100k/o200k; BERT maps all 256 to [UNK]). AB-1 ships a vocabulary extension that makes every cell exactly one token, 256/256, exact round-trip, measured before/after.

Hardened lexicon — proven. Exhaustive enumeration over all 256 states and all single-bit flips: a single-parity-check command code (128 commands, distance 2) is single-error detecting with zero unvetted→audited promotions; an extended Hamming [8,4,4] code (16 commands, distance 4) is single-error correcting. Tokenizer- and model-independent.

Token efficiency — triangulated on public reproducible data. Against a steelman baseline (delta-encoded JSON with the same emit-on-change discipline), AB-1 carries agent state tracking in ~92% fewer tokens on stock cl100k/o200k, and ~97% fewer with the extension. The figure converges across a synthetic sweep, a private agent ledger, and the public Crucible Claude Code session log (anyone can reproduce the public number).

Separability — a scoped probe, explicitly NOT load-bearing. Characterized on three axes (quant depth F16→Q2_K, size 1B→14B, controlled cross-family at 3B/F16). No stock model robustly encodes AB-1 bit-structure at any size/quant/lineage; degradation appears only at the 2-bit extreme. This motivates the model-independent mechanisms; it is not the spine.

Security syntax-firewall — an explicit hypothesis, not a result.
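The hardened-lexicon claim above is the kind of statement that can be checked by brute force, since there are only 256 states and 8 possible single-bit flips per state. The sketch below does that for two textbook codes with the stated parameters: the even-parity subset of all bytes (128 codewords, distance 2) and an extended Hamming [8,4,4] code built from the standard Hamming(7,4) parity equations plus an overall parity bit. The codeword assignment here is an assumption and may differ from `ab1/lexicon.py`, and the unvetted→audited promotion check is not modelled.

```python
def popcount(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def min_distance(code) -> int:
    """Smallest Hamming distance between any two distinct codewords."""
    return min(popcount(a ^ b) for i, a in enumerate(code) for b in code[i + 1:])


# Single-parity-check code: every 8-bit word with even parity.
SPC = [w for w in range(256) if popcount(w) % 2 == 0]


def spc_undetected_flips() -> int:
    """Count single-bit flips that turn a codeword into another codeword."""
    valid = set(SPC)
    return sum(1 for w in SPC for b in range(8) if (w ^ (1 << b)) in valid)


def hamming84(nibble: int) -> int:
    """Encode 4 data bits with Hamming(7,4) plus an overall parity bit."""
    d1, d2, d3, d4 = ((nibble >> i) & 1 for i in range(4))
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    bits = [p1, p2, d1, p3, d2, d3, d4]
    bits.append(sum(bits) % 2)  # overall parity makes total weight even
    return sum(bit << i for i, bit in enumerate(bits))


HAM = [hamming84(n) for n in range(16)]


def correct(word: int) -> int:
    """Nearest-codeword decoding by exhaustive search over the 16 codewords."""
    return min(HAM, key=lambda c: popcount(c ^ word))


def ham_miscorrections() -> int:
    """Count single-bit flips that decode to the wrong codeword."""
    return sum(1 for c in HAM for b in range(8) if correct(c ^ (1 << b)) != c)


if __name__ == "__main__":
    print(len(SPC), min_distance(SPC), spc_undetected_flips())
    print(len(set(HAM)), min_distance(HAM), ham_miscorrections())
```

Nothing in this check touches a tokenizer or a model, which is why a result of this kind holds independently of both.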

Layout

    ab1/                 reference implementation (zero-dependency core)
      core.py            8-bit encoding, the bit-table convention
      dsp.py             Differential State Protocol (emit-on-change)
      crc.py             CRC-8 checksum cell
      lexicon.py         hardened command codes + Hamming analysis
      tokenizer.py       the vocabulary extension (atomicity by construction)
    bench/               reproducible experiments; bench/results/FINDINGS.md is the full honest ledger
    tests/               spec-anchor + roundtrip tests
    AB-1_CANONICAL.md    the consolidated specification (CC-BY-4.0)
    paper/               the arXiv paper (LaTeX) + bibliography
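The emit-on-change discipline attributed to `dsp.py` above can be pictured as a filter over a stream of states: a cell is written only when the state differs from the last one emitted. The sketch below is a guess at that behaviour for illustration; the function name `emit_on_change` and the choice to represent a state as a plain integer 0 to 255 are assumptions, and the actual Differential State Protocol may frame or delimit its output differently.

```python
from typing import Iterable, Iterator

BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def emit_on_change(states: Iterable[int]) -> Iterator[str]:
    """Yield one Braille cell each time the 8-bit state changes.

    Hypothetical helper: repeated identical states produce no output.
    """
    previous = None
    for state in states:
        if not 0 <= state <= 0xFF:
            raise ValueError(f"state out of 8-bit range: {state}")
        if state != previous:
            yield chr(BRAILLE_BASE + state)
            previous = state


if __name__ == "__main__":
    # Six observed states collapse to three emitted cells.
    trace = [0x01, 0x01, 0x01, 0x03, 0x03, 0x01]
    print("".join(emit_on_change(trace)))
```

The same discipline applied to JSON (emit only the changed fields) is the delta-encoded baseline the token comparison is made against, so both sides skip unchanged states and differ only in the cost of each emission.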

Reproduce

    python -m venv .venv && . .venv/bin/activate
    pip install tiktoken transformers     # tokenizer experiments
    python tests/test_core.py             # 5/5 spec + roundtrip
    python bench/tokenizer_parity.py      # atomicity (negative result)
    python bench/extension_proof.py       # the fix, before/after
    python bench/lexicon_proof.py         # the proof (deterministic)
    python bench/token_reduction.py --trace bench/results/crucible_trace.json

Separability experiments (bench/quant_*.py, bench/cross_family.py) require local GGUF models and a GPU; they self-persist results to bench/results/ and survive a mid-run crash (incremental...
