GitHub - luckyrmp/tinderbox-archive: Personal claude.ai conversation archive: schema, ingest, embeddings, hybrid retrieval, MCP server · GitHub
Tinderbox
A personal claude.ai conversation archive — schema, ingest, embeddings, hybrid retrieval, and an MCP server that lets any Claude session search your own past conversations.
Status: working end-to-end. Used daily by the author. Not packaged for general consumption — see Caveats below.
What it does
You export your conversations from claude.ai, drop the ZIP into a watched directory, and within ~15 minutes your full archive is searchable from any Claude session via two MCP tools:
tinderbox_search(query, limit=10) — hybrid semantic + full-text retrieval over every message and artifact
tinderbox_get_conversation(export_id, max_messages=50) — pull the full thread of any conversation surfaced by a search
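For orientation, here is a rough sketch of what calling these two tools might look like on the wire. It assumes the server follows the standard MCP convention of a JSON-RPC 2.0 `tools/call` request sent as one JSON message per line over stdio. The helper name `build_tool_call`, the query text, and the `export_id` placeholder are illustrative and not taken from this repository.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Search first; the query text here is made up for illustration.
search_request = build_tool_call(
    1, "tinderbox_search", {"query": "hnsw index tuning", "limit": 10}
)

# Then pull the thread for a hit. The export_id value is a placeholder;
# a real one would come from a search result.
thread_request = build_tool_call(
    2,
    "tinderbox_get_conversation",
    {"export_id": "EXPORT_ID_FROM_SEARCH", "max_messages": 50},
)

# Assumed stdio framing: one JSON message per line.
wire_line = json.dumps(search_request) + "\n"
```

In practice a Claude client builds these requests itself; the sketch is mainly useful when poking at the server by hand.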
Everything is local-ish — you bring your own Supabase free-tier project, your own Ollama install for embeddings, and a Mac (this is targeted at Apple Silicon).
The author's current archive: 676 conversations, 10,653 messages, 172 artifacts, 10,731 mxbai-embed-large vectors. Hybrid retrieval hits 68.7% top-1 / 88.7% top-10 on a frozen 150-query QA set that Haiku generated from the corpus itself (the eval re-runs weekly via launchd).
Why it exists
Tinderbox stores statements, not facts. Every retrieval response renders the provenance inline — never "X is true", always "on [date], in [conversation], [participant] said [content]". The corpus answers what was said when, by whom; never what is true.
That's design principle #1 from the schema doc. Memorial archive, not extraction pipeline. Forward-linked when superseded, never backward-edited.
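To make the "statements, not facts" rule concrete, here is a minimal sketch of how a retrieval hit could be rendered with its provenance inline. The `Statement` class, its field names, and the sample values are hypothetical; the actual schema and output format live in the schema doc and may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    """One retrieved message plus the provenance needed to attribute it."""
    said_on: date
    conversation: str
    participant: str
    content: str


def render_with_provenance(s: Statement) -> str:
    # Always "on [date], in [conversation], [participant] said [content]",
    # never a bare claim with the attribution stripped off.
    return (
        f'on {s.said_on.isoformat()}, in "{s.conversation}", '
        f"{s.participant} said: {s.content}"
    )


example = Statement(
    said_on=date(2025, 1, 15),
    conversation="Index tuning notes",
    participant="assistant",
    content="hnsw trades build time for faster queries.",
)
rendered = render_with_provenance(example)
```

Keeping the attribution in the rendered string means a downstream reader cannot easily mistake an old statement for a current fact.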
For context-window reasons it's also genuinely useful: a Claude session can look up its own past reasoning instead of re-deriving it.
Architecture (one paragraph)
Postgres (Supabase) holds 12 tables under a tinderbox schema — schema versioning, ingest runs, conversations, messages, artifacts, attachments, embeddings (vector(1024) + hnsw), enrichment, named_instances, query log, and a frozen QA test set. A Python parser stream-reads claude.ai export ZIPs and upserts everything idempotently. An embed worker batches messages and artifacts through Ollama (mxbai-embed-large, 1024-dim) and writes vectors back. A server-side Postgres function (tinderbox.hybrid_search) ranks results by (1 - cosine_distance) + 0.5 * ts_rank_cd. A small from-scratch JSON-RPC 2.0 MCP server exposes two tools over stdio. Three launchd daemons run the whole thing on a schedule: inbox watcher (15min), QA eval (Sundays 03:00), staleness alerter (daily 09:00 with cooldown + debounce).
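The ranking formula is easier to follow with numbers. The sketch below re-implements `(1 - cosine_distance) + 0.5 * ts_rank_cd` in plain Python purely for illustration; the real computation happens inside the `tinderbox.hybrid_search` Postgres function, and the candidate rows here are invented.

```python
def hybrid_score(cosine_distance, text_rank, text_weight=0.5):
    """Combine semantic similarity and full-text rank into one score.

    cosine_distance: 0.0 means identical direction, larger means less similar.
    text_rank: a full-text relevance score such as Postgres ts_rank_cd.
    """
    return (1.0 - cosine_distance) + text_weight * text_rank


# Invented candidates: (label, cosine_distance, text_rank)
candidates = [
    ("semantic-only match", 0.10, 0.0),
    ("semantic + keyword match", 0.30, 0.6),
    ("weak match", 0.60, 0.1),
]

ranked = sorted(
    candidates,
    key=lambda row: hybrid_score(row[1], row[2]),
    reverse=True,
)
top_label = ranked[0][0]
```

With these numbers the keyword-backed hit scores 0.7 + 0.3 = 1.0 and edges out the closer semantic-only hit at 0.9, which is the behaviour the 0.5 weight is meant to allow.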
Prerequisites
macOS with Apple Silicon, Python 3.14 (or 3.12+ probably — author runs 3.14.3_1)
Supabase free-tier project — $0/month for this scale. Optional $4/mo IPv4 add-on for proper RLS scoping (stage 5b).
Ollama running locally with mxbai-embed-large pulled (ollama pull mxbai-embed-large)
A claude.ai data export ZIP (Settings → Account → Export Data)
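Before running the embed worker, it can help to confirm that the local Ollama install actually returns 1024-dimensional vectors. The following is a small sketch that assumes Ollama's default address (`http://localhost:11434`) and its `/api/embed` endpoint; the function names are hypothetical and this is not the repository's embed worker.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embed"  # assumed default Ollama address
EXPECTED_DIM = 1024  # matches the vector(1024) column described above


def build_embed_request(texts, model="mxbai-embed-large"):
    """Build a POST request asking Ollama to embed a batch of texts."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(payload, expected_dim=EXPECTED_DIM):
    """Extract vectors from the response and check their dimension."""
    vectors = payload["embeddings"]
    for vector in vectors:
        if len(vector) != expected_dim:
            raise ValueError(
                f"expected {expected_dim} dimensions, got {len(vector)}"
            )
    return vectors


def embed(texts):
    """Send the batch to the local Ollama server (requires it to be running)."""
    with urllib.request.urlopen(build_embed_request(texts), timeout=60) as resp:
        return parse_embeddings(json.load(resp))
```

Calling `embed(["hello"])` with Ollama running and the model pulled should return a single 1024-element vector; a dimension mismatch raises early rather than failing later at insert time.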
Quick setup
~/tinderbox && cd ~/tinderbox
# 2. Create your Supabase project, get URL + service-role key
# 3. Render config + plists for your $HOME / $USER
./parser/scripts/setup.sh
# 4. Create your env file (path is configurable via TINDERBOX_ENV_FILE)
cp .env.example ~/.tinderbox.env
# … and fill in SUPABASE_URL, SUPABASE_SERVICE_KEY, etc.
# 5. Apply the migrations to your Supabase project
# (each migration file is plain SQL; run them in order via the Supabase
# SQL editor, or via psql, or via your tool of choice)
ls migrations/
# 6. Pull the embedding model
ollama pull mxbai-embed-large
# 7. Set up the...