How Lume Works: The Retrieval Primitives

DeepBlue Dynamics Signal Log

Lume is a Rust hybrid search engine that Steve Harris and I have been building in the open at github.com/DeepBlueDynamics/lume. It’s a small CLI plus an MCP server, BSD-3 licensed, and built around a stubborn idea: when an agent asks a question, every step from query to evidence should be inspectable.

Lume indexes Markdown, source code, and PDFs (via a small Python extractor) and ranks over them with three independent primitives — field-aware BM25, dense GTR-T5 vectors via Shivvr, and a significance-scored entity graph. The lexical core and the graph run entirely on your machine; only the dense vectors call out, and that endpoint defaults to localhost. There is no opaque “search box that returns a ranking” — every score has a name, a file, and a knob.

This post walks Lume’s retrieval core end to end, with line-level references to the current tree. If you’re building agentic systems and tired of treating retrieval as a magic step, this is for you.

A few principles up front, because they explain the design:

Local-first. Lexical search and the entity graph run entirely on your machine. Dense vectors are fetched from Shivvr through SHIVVR_BASE_URL, which defaults to a local endpoint.

Layered, not monolithic. BM25, semantic, and graph are independent signals with their own scores. The blend is one line; each input is replaceable.

Auditable. The engine prints what it pruned, what it ranked, and why it rejected the rest.

0. The unit of retrieval: a Section

Lume indexes Markdown, cut into sections at # headers (parse_markdown in src/bm25.rs:211). A Section (src/bm25.rs:106) is the atom everything ranks over:

    pub struct Section {
        pub title: String,
        pub body: String,
        pub line_number: usize,
        pub filename: Option,
        pub entities: Vec, // resolved named entities, for the graph
    }

Title and body are separate fields with separate statistics, and that distinction shows up immediately in scoring. The whole index lives in memory as a Bm25Index (src/bm25.rs:147): per-field term-frequency maps, document frequencies, field lengths, roaring-bitmap posting lists, prime/Gödel signature filters, and the entity posting lists that feed the graph.
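To make the sectioning step concrete, here is a minimal sketch of cutting Markdown at # headers. It is not Lume's parse_markdown: the MiniSection type and split_sections function are hypothetical names, and the sketch assumes that any line starting with # opens a new section, that text before the first header is dropped, and that code fences get no special handling.

```rust
/// Hypothetical, simplified stand-in for Lume's `Section`.
#[derive(Debug, PartialEq)]
pub struct MiniSection {
    pub title: String,
    pub body: String,
    /// 1-based line number of the header that opened this section.
    pub line_number: usize,
}

/// Split Markdown text into sections at lines that start with '#'.
/// Assumptions: text before the first header is discarded, and a '#'
/// inside a fenced code block is (wrongly) treated as a header.
pub fn split_sections(text: &str) -> Vec<MiniSection> {
    let mut sections: Vec<MiniSection> = Vec::new();
    for (idx, line) in text.lines().enumerate() {
        if line.starts_with('#') {
            // Strip the leading '#' run and surrounding whitespace for the title.
            let title = line.trim_start_matches('#').trim().to_string();
            sections.push(MiniSection {
                title,
                body: String::new(),
                line_number: idx + 1,
            });
        } else if let Some(current) = sections.last_mut() {
            // Append the line to the body of the most recent section.
            if !current.body.is_empty() {
                current.body.push('\n');
            }
            current.body.push_str(line);
        }
    }
    sections
}
```

Calling split_sections on a file with an "# Intro" header on line 1 and a "## Setup" header on line 4 yields two sections whose line_number fields are 1 and 4, with title and body kept apart so that each can carry its own statistics.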

1. Primitive: field-aware BM25

The lexical core is a field-aware BM25 with three selectable variants. The tuning defaults (Bm25Params in src/bm25.rs:125) are deliberately classic:

Self { k1: 1.2, b: 0.75, delta: 1.0, title_weight: 2.0, body_weight: 1.0 }

k1 controls term-frequency saturation; b controls length normalization. The one opinionated choice is title_weight: 2.0: a title hit contributes twice as much as a body hit before the coordination factor is applied. That is useful, but it can overweight chapter titles when a query token is broad. Treat it as a knob, not a law.

IDF is the standard smoothed form, floored at zero, and each term’s contribution is computed per field then summed with the field weights (calculate_bm25_term_score in src/bm25.rs:728):

    let len_normalization = 1.0 - b + b * (doc_len / avgdl);
    match variant {
        SearchVariant::Classic => idf * (tf * (k1 + 1.0)) / (tf + k1 * len_normalization),
        SearchVariant::Plus => idf * ((tf * (k1 + 1.0)) / (tf + k1 * len_normalization) + params.delta),
        SearchVariant::L => {
            let s = tf / len_normalization;
            idf * (s * (k1 + 1.0)) / (s + k1)
        },
    }
    // total_score += title_weight * title_score + body_weight * body_score; (src/bm25.rs:635)
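The excerpt above takes idf as given. As a rough sketch of what "smoothed, floored at zero" could look like, the function below uses the Robertson/Sparck Jones form ln((N - n + 0.5) / (n + 0.5)) and clamps it at zero. The exact formula Lume uses is not shown in this post, so the function name idf and this particular smoothing are assumptions.

```rust
/// Smoothed inverse document frequency, floored at zero.
/// Assumption: Robertson/Sparck Jones smoothing; Lume's exact form may differ.
/// `n_docs` is the number of indexed sections, `doc_freq` the number that
/// contain the term.
pub fn idf(n_docs: usize, doc_freq: usize) -> f64 {
    let n = n_docs as f64;
    let df = doc_freq as f64;
    let raw = ((n - df + 0.5) / (df + 0.5)).ln();
    // A term present in more than half the sections would go negative here;
    // the floor keeps such terms from subtracting from the score.
    raw.max(0.0)
}
```

With this form, a term that appears in 1 of 10 sections gets a positive weight, while a term that appears in 9 of 10 is clamped to zero and no longer affects the ranking.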

Classic is textbook BM25.

Plus adds a delta floor so a matched term never contributes nothing, countering BM25’s over-penalty of long documents.

L moves length normalization inside the saturation, smoothing very long docs.

Lume runs Classic by default (src/main.rs:1430).
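The three variants and the field blend can be exercised in isolation. The sketch below copies the expressions as printed above into a standalone function. Variant, Params, term_score, and field_score are hypothetical names, not Lume's API, and the early return for tf = 0 is an assumption about how unmatched fields are handled.

```rust
/// Hypothetical mirror of the three variants named in the post.
#[derive(Clone, Copy, Debug)]
pub enum Variant {
    Classic,
    Plus,
    L,
}

/// Hypothetical stand-in for `Bm25Params`, using the defaults quoted above.
pub struct Params {
    pub k1: f64,
    pub b: f64,
    pub delta: f64,
    pub title_weight: f64,
    pub body_weight: f64,
}

impl Default for Params {
    fn default() -> Self {
        Self { k1: 1.2, b: 0.75, delta: 1.0, title_weight: 2.0, body_weight: 1.0 }
    }
}

/// Score one term in one field. Assumes `avgdl > 0`.
pub fn term_score(v: Variant, p: &Params, idf: f64, tf: f64, doc_len: f64, avgdl: f64) -> f64 {
    // Assumption: a field that does not contain the term scores zero in
    // every variant, so Plus only adds delta for matched terms.
    if tf <= 0.0 {
        return 0.0;
    }
    let norm = 1.0 - p.b + p.b * (doc_len / avgdl);
    match v {
        Variant::Classic => idf * (tf * (p.k1 + 1.0)) / (tf + p.k1 * norm),
        Variant::Plus => idf * ((tf * (p.k1 + 1.0)) / (tf + p.k1 * norm) + p.delta),
        Variant::L => {
            let s = tf / norm;
            idf * (s * (p.k1 + 1.0)) / (s + p.k1)
        }
    }
}

/// Combine title and body scores with the field weights.
/// Each tuple is (tf, field_len, avg_field_len).
pub fn field_score(
    v: Variant,
    p: &Params,
    idf: f64,
    title: (f64, f64, f64),
    body: (f64, f64, f64),
) -> f64 {
    p.title_weight * term_score(v, p, idf, title.0, title.1, title.2)
        + p.body_weight * term_score(v, p, idf, body.0, body.1, body.2)
}
```

Two things fall out of running the numbers. Plus always equals Classic plus idf * delta for a matched term, which is the floor described above. And with the expressions exactly as printed, the L arm is algebraically equal to the Classic arm: substitute s = tf / len_normalization and multiply numerator and denominator by len_normalization. The commonly published BM25L formula adds delta to s before saturation, so the printed excerpt may be abbreviated.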

2. Two-stage pruning: roaring union, then Gödel signatures

You don’t want to BM25-score all 1,926 sections of a book for every query. Lume’s search (src/bm25.rs:445) is two-stage.

Stage 1 — candidate gather. Union the roaring-bitmap posting lists of the query terms. This is a handful of bitset ORs and instantly narrows the corpus to sections that contain any query term:

    // src/bm25.rs:460
    let mut candidate_set = MiniRoaring::new();
    let mut first = true;
    for q_tok in &query_tokens {
        if let Some(list) = self.posting_lists.get(&q_tok.bytes) {
            if first {
                candidate_set = list.clone();
                first = false;
            } else {
                candidate_set = candidate_set.union(list);
            }
        }
    }
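The same gather step can be tried without Lume's bitmap type. This sketch swaps in std's BTreeSet<u32> for MiniRoaring, so it shows the union logic only, not the compression or speed of a roaring bitmap. The gather_candidates function and the string-keyed posting map are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Union the posting lists of every query token that exists in the index.
/// Stand-in: `BTreeSet<u32>` replaces the roaring bitmap, and section ids
/// are plain `u32` values.
pub fn gather_candidates(
    posting_lists: &HashMap<String, BTreeSet<u32>>,
    query_tokens: &[&str],
) -> BTreeSet<u32> {
    let mut candidates = BTreeSet::new();
    for tok in query_tokens {
        // Tokens missing from the index contribute nothing.
        if let Some(list) = posting_lists.get(*tok) {
            // Union with an empty set is the identity, so no `first` flag
            // is needed in this simplified version.
            candidates.extend(list.iter().copied());
        }
    }
    candidates
}
```

Because the result is a union, a section qualifies if it contains any query term. Ranking among the survivors is left to BM25.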

Stage 1b — Gödel tag-signature pruning. If the query tagger recognizes entities, each candidate section is verified against a prime-factored signature filter (PrimeFilter::test_tag_prime in src/fast_retrieval.rs:449, evaluated in src/bm25.rs:538). Each known tag output maps to a prime; a section’s tag signature is the product of its tag primes, so inclusion is checked by divisibility. Unknown query tags deliberately receive a dummy prime and fail closed. Candidates that fail are dropped as TagSignatureMismatch...
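The divisibility check is easy to see in miniature. The sketch below is not PrimeFilter::test_tag_prime. TagPrimes and its methods are hypothetical, the prime table is a fixed toy list, and signatures are plain u64 products with overflow reported as None. It only illustrates the idea described above: one prime per known tag, a product per section, and a reserved prime for unknown tags that never divides any signature.

```rust
use std::collections::HashMap;

/// Hypothetical tag-to-prime table for the sketch.
pub struct TagPrimes {
    primes: HashMap<String, u64>,
    /// Prime reserved for unknown tags; never multiplied into a signature.
    dummy: u64,
}

impl TagPrimes {
    /// Assign the first primes to the known tags, in order.
    pub fn new(tags: &[&str]) -> Self {
        const PRIMES: [u64; 8] = [2, 3, 5, 7, 11, 13, 17, 19];
        assert!(tags.len() < PRIMES.len(), "sketch supports at most 7 tags");
        let primes = tags
            .iter()
            .zip(PRIMES.iter())
            .map(|(t, p)| (t.to_string(), *p))
            .collect();
        // The next unused prime becomes the dummy for unknown tags.
        TagPrimes { primes, dummy: PRIMES[tags.len()] }
    }

    /// Prime for a tag; unknown tags map to the dummy prime.
    pub fn prime_for(&self, tag: &str) -> u64 {
        self.primes.get(tag).copied().unwrap_or(self.dummy)
    }

    /// Product of the primes of a section's known tags.
    /// Unknown tags are skipped; returns None on u64 overflow.
    pub fn signature(&self, tags: &[&str]) -> Option<u64> {
        let mut sig: u64 = 1;
        for t in tags {
            if let Some(p) = self.primes.get(*t) {
                sig = sig.checked_mul(*p)?;
            }
        }
        Some(sig)
    }

    /// A section carries a tag iff the tag's prime divides its signature.
    pub fn section_has_tag(&self, signature: u64, query_tag: &str) -> bool {
        signature % self.prime_for(query_tag) == 0
    }
}
```

With known tags person, place, and ship mapped to 2, 3, and 5, a section tagged person and ship has signature 10. A query for person passes (10 is divisible by 2), a query for place fails, and a query for an unknown tag is tested against the dummy prime 7 and fails closed.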
