Scrutari – forensic statistical analyzer for opaque firmware blobs


xvilka/scrutari: Binary blobs introspection tool - Codeberg.org



Binary blobs introspection tool

52 commits, 2 branches, 0 tags, 5.6 MiB

Languages: Rust 93.4%, Python 6.6%


Latest commit: XVilka, 9b604731bc, "List supported --probe choices in CLI help", 2026-06-11 00:03:34 +04:00

| Path | Last commit | Date |
|---|---|---|
| docs | docs: correct Motorola baseband HAB/RSA marker descriptions | 2026-06-01 20:53:08 +04:00 |
| fuzz | fuzz: add a cargo-fuzz harness for the untrusted-input paths | 2026-06-04 08:03:32 +04:00 |
| scrutari | List supported --probe choices in CLI help | 2026-06-11 00:03:34 +04:00 |
| scrutari-py | Fix PyO3 bindings linking on macOS | 2026-06-10 23:44:13 +04:00 |
| scrutari-report | report, main: schema version, findings and feature vectors in JSON | 2026-06-05 07:55:01 +04:00 |
| scrutari-scan | scan: deterministic feature vector and uniform findings view | 2026-06-05 07:55:01 +04:00 |
| tests/verify | sbox_equiv: affine-equivalence matching for 8x8 S-boxes | 2026-06-05 07:15:23 +04:00 |
| .gitignore | Fix PyO3 bindings linking on macOS | 2026-06-10 23:44:13 +04:00 |
| Cargo.toml | scrutari-py: PyO3 bindings for notebooks and pipelines | 2026-06-05 07:55:01 +04:00 |
| COPYING.LESSER | Initial commit | 2026-06-01 05:23:23 +04:00 |
| LICENSE.md | Initial commit | 2026-06-01 05:23:23 +04:00 |
| README.md | docs: correct Motorola baseband HAB/RSA marker descriptions | 2026-06-01 20:53:08 +04:00 |

README.md

scrutari

scrutari (Latin, from scruta, "trash"): to ransack through piles of rubbish in hope of finding hidden value.

A forensic statistical analyzer for opaque firmware blobs. Given a packed or encrypted-looking binary, scrutari figures out whether the high-entropy region is compressed, encoded, encrypted, or otherwise structured, and what kind of coder produced it, without needing the algorithm specification, the original tooling, or a working disassembler.

It does this in four layers:

Byte and bit statistics. Entropy, χ², mutual information, autocorrelation, n-gram diversity, bit-level χ² across 4/8/12-bit widths in both LE and BE byte-order interpretations. The bit-level test alone classifies most payloads into fixed-width packing vs. prefix/Huffman vs. range/arithmetic, and the LE-vs-BE pair surfaces hidden u32-aligned structure on firmware compiled for big-endian hosts.

Encryption-detection probes. Block-cipher fingerprint (uniqueness ratios + per-position χ²), Markov mutual information per N-byte block, period detection, longest-repeat substring (via a true suffix array), crypto-constant catalogue (Blowfish full P-box + S1-S4 + AES T-tables + SHA-512 K + ChaCha20 sigma + Twofish + Camellia + TEA delta + CRC family; LE+BE variants), explicit ECB-mode duplicate-block detector at N=8/16, periodic-XOR-keystream sweep (Kasiski/Friedman: payload XOR payload […]) […] docs/encryption-detection.md.

Brute force. 23 decoder families try every offset / parameter combination over the candidate payload region: DEFLATE inflate, LZ4, LZO1X, aPLib, LZMA1, word-LZSS, empirical Huffman, LZ77+preloaded Huffman, a custom-LZ variant sweep, fixed-width bit-unpacking, MTF-inverse, five Huffman variants (pair-list, RLE-counts, word16, multi-table, FGK-adaptive), and five RLE variants (escape, PackBits, zero-RLE, bit-run, dictionary). Plus structural detectors for Qualcomm q6zip / delta-compress with LE+BE byte-order sweeps and a 51-entry magic catalogue covering HiSilicon hi3520, MediaTek BCR, FastLZ, miniLZO, LZ4-HC, Apple LZFSE / LZHAM, Snappy, Qualcomm SBL, Apple IMG3/IMG4, Samsung SBOOT, TI Davinci AIS, BE filesystem variants (ext2/3/4 BE, cramfs BE, yaffs1, FIT/DTB BE), and the usual squashfs / JFFS2 / UBI / U-Boot / Android-boot / zstd / xz.

Oracle filtering. Each candidate decode is scored by a caller-selected "is this real code?" oracle. The oracle is a trait (see docs/oracle-howto.md); scrutari ships with oracles for every major embedded-firmware ISA, listed below.
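As a rough illustration of the kind of measurements the first two layers rely on, the sketch below computes byte-level Shannon entropy, a χ² statistic against a uniform byte distribution, and an aligned duplicate-block ratio of the sort an ECB detector would look at. It is a minimal standalone Python sketch, not scrutari's implementation; the function names `shannon_entropy`, `chi_square_uniform`, and `duplicate_block_ratio` are hypothetical and chosen only for this example.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def chi_square_uniform(data: bytes) -> float:
    """Pearson chi-square statistic of the byte histogram vs. a uniform distribution."""
    if not data:
        return 0.0
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def duplicate_block_ratio(data: bytes, block: int = 16) -> float:
    """Fraction of aligned `block`-byte blocks that repeat an earlier block.

    A clearly non-zero value on otherwise high-entropy data hints at ECB mode.
    """
    blocks = [data[i:i + block] for i in range(0, len(data) - block + 1, block)]
    if not blocks:
        return 0.0
    return 1.0 - len(set(blocks)) / len(blocks)


if __name__ == "__main__":
    # Illustrative inputs only: random bytes, repetitive text, zero padding.
    samples = {
        "random": os.urandom(1 << 16),
        "text": b"the quick brown fox jumps over the lazy dog " * 1500,
        "zeros": bytes(1 << 16),
    }
    for name, blob in samples.items():
        print(f"{name:8s} H={shannon_entropy(blob):.3f} "
              f"chi2={chi_square_uniform(blob):.1f} "
              f"dup16={duplicate_block_ratio(blob):.3f}")
```

Compressed and encrypted regions both sit near 8 bits per byte, which is why entropy alone does not separate them and the further tests listed above are needed.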

Beyond these single-image probes, scrutari supports building corpora of related firmware versions for cross-version differential analysis; see docs/differential-analysis.md for the methodology.
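The brute-force and oracle-filtering layers described above can be sketched in a few lines for a single decoder family. The example below tries a raw DEFLATE decode at every offset using Python's standard `zlib` module and ranks the survivors with a pluggable scoring function. Everything here is an assumption made for illustration: `Oracle`, `printable_ratio`, `try_raw_deflate`, and `sweep` are hypothetical names, and the printable-ASCII scorer is only a toy stand-in for a real "is this real code?" oracle.

```python
import zlib
from typing import Callable, List, Tuple

# Hypothetical oracle type: maps decoded bytes to a plausibility score in [0, 1].
Oracle = Callable[[bytes], float]


def printable_ratio(decoded: bytes) -> float:
    """Toy oracle: fraction of printable ASCII bytes (not an ISA-aware scorer)."""
    if not decoded:
        return 0.0
    ok = sum(1 for b in decoded if 32 <= b < 127 or b in (9, 10, 13))
    return ok / len(decoded)


def try_raw_deflate(blob: bytes, offset: int, limit: int = 1 << 20) -> bytes:
    """Attempt a raw DEFLATE decode starting at `offset`; return b"" on failure."""
    try:
        decoder = zlib.decompressobj(-15)  # negative wbits: raw stream, no header
        return decoder.decompress(blob[offset:], limit)  # cap the output size
    except zlib.error:
        return b""


def sweep(blob: bytes, oracle: Oracle, min_len: int = 64,
          threshold: float = 0.9) -> List[Tuple[int, int, float]]:
    """Try every offset and return (offset, decoded_length, score), best first."""
    hits = []
    for offset in range(len(blob)):
        decoded = try_raw_deflate(blob, offset)
        if len(decoded) < min_len:
            continue  # too short to be a meaningful decode
        score = oracle(decoded)
        if score >= threshold:
            hits.append((offset, len(decoded), score))
    return sorted(hits, key=lambda h: h[2], reverse=True)
```

The sweep is quadratic in the blob size because each attempt slices the input, which is acceptable for a sketch but not for large images. Separating the decoder from the scorer is what lets the same sweep be reused with a different oracle per target architecture.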

Scope: triage, not extraction

scrutari is a triage and fingerprinting tool. It answers "what is this blob (cipher, compressor, codec, obfuscation) and what kind?" and points you at the next step. It deliberately does not unpack filesystems or carve out the contained files, and it is not a binwalk/unblob replacement:

It does not extract or reconstruct a filesystem (SquashFS, JFFS2, UBIFS, cramfs, …), even when its magic catalogue recognises one.

It does not decrypt: it can tell you a region is AES-ECB, a periodic-XOR keystream, or an ESP32-style AES-XTS blob, and can recover a static XOR keystream, but it will not produce plaintext from a real block/stream cipher without the key.

It does...
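The static XOR keystream case mentioned above is the one situation where recovery is feasible without a key, and a small sketch shows why. The example below estimates the key period by counting coincidences between the data and a shifted copy of itself (equivalent to counting zero bytes in the data XORed with its shifted copy, in the spirit of the Kasiski/Friedman approach), then recovers each key byte by assuming the most common plaintext byte is known. The function names, the 0.9 tolerance, and the default assumption that the dominant plaintext byte is 0x00 (typical of padding) are all hypothetical choices for this illustration, not scrutari's behaviour.

```python
from collections import Counter
from itertools import cycle


def coincidence_rate(data: bytes, shift: int) -> float:
    """Fraction of positions where data[i] == data[i + shift]."""
    n = len(data) - shift
    if shift <= 0 or n <= 0:
        return 0.0
    return sum(a == b for a, b in zip(data, data[shift:])) / n


def guess_period(data: bytes, max_period: int = 64, tolerance: float = 0.9) -> int:
    """Smallest shift whose coincidence rate is within `tolerance` of the best."""
    rates = {k: coincidence_rate(data, k) for k in range(1, max_period + 1)}
    best = max(rates.values())
    return min(k for k, r in rates.items() if r >= tolerance * best)


def recover_key(data: bytes, period: int, assumed_plain: int = 0x00) -> bytes:
    """Recover a repeating XOR key, assuming the most common plaintext byte."""
    key = bytearray()
    for col in range(period):
        column = data[col::period]  # all bytes encrypted with key[col]
        if not column:
            key.append(0)  # not enough data to say anything about this column
            continue
        most_common = Counter(column).most_common(1)[0][0]
        key.append(most_common ^ assumed_plain)
    return bytes(key)


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key` repeated cyclically."""
    return bytes(a ^ b for a, b in zip(data, cycle(key)))
```

This only works because a short repeating key leaks the plaintext's byte distribution column by column; a real stream or block cipher gives no such foothold.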
