coral-ai/claude-code-token-xray at main · Coral-Bricks-AI/coral-ai · GitHub
claude-code-token-xray
Reverse-engineer a month of your own local Claude Code logs (~/.claude/projects/*/*.jsonl) into where the tokens, time, and cost actually go, then run it on yours. Reads only local logs; nothing is sent anywhere.
What it found (one month of my own logs: 181 sessions, 25,564 model calls):
You don't pay to generate, you pay to re-read. ~29M unique tokens → 4.35B billed (~150×), because every turn re-sends the whole ~173K-token context.
The bill is 84% input / 16% output, and re-reading the same context is 64% of it.
The biggest line is the one you never see: hidden reasoning is 84% of output and ~60% of everything re-read.
~$3,371 for the month at Opus 4.7 list rates. Caching already serves 98% of input, and re-reading is still 64% of the bill.
Full write-up (all the tables, the why, the main-thread-vs-subagent split) → coralbricks.ai/blog/claude-code-token-xray
Quickstart
    pip install -r requirements.txt   # just tiktoken
    python3 token_time_breakdown.py
    python3 cost.py
    python3 main_vs_sidecar.py
    python3 reread_breakdown.py

tiktoken is OpenAI's tokenizer, not Claude's, so token proportions are reliable to ~±15%, not Claude-exact. The billed-token counts in cost.py come straight from the API usage blocks and are exact.
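To make "straight from the API usage blocks" concrete, here is a minimal sketch of summing those counters across local logs. It is not the code in cost.py. It assumes each JSONL record may carry a `message.usage` object with the four field names listed in `USAGE_FIELDS`; that layout and the function name `sum_usage` are assumptions for illustration, so check them against your own files. The sketch also does not deduplicate records that repeat the same usage block.

```python
import json
from collections import Counter
from pathlib import Path

# Assumed field names inside each record's message.usage object.
USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def sum_usage(log_dir: Path) -> Counter:
    """Add up the usage counters found in every *.jsonl file under log_dir."""
    totals = Counter()
    for path in sorted(log_dir.rglob("*.jsonl")):
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                try:
                    event = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip partial or corrupt lines
                if not isinstance(event, dict):
                    continue
                message = event.get("message")
                usage = message.get("usage") if isinstance(message, dict) else None
                if not isinstance(usage, dict):
                    continue  # records without a usage block add nothing
                for field in USAGE_FIELDS:
                    value = usage.get(field)
                    if isinstance(value, int):
                        totals[field] += value
    return totals
```

Calling `sum_usage(Path.home() / ".claude" / "projects")` would walk the same directory the scripts read, including nested subagent logs, and return one total per counter.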
What a month cost
From cost.py on my logs, priced at Opus 4.7 list rates:
| Line item | Cost | Share |
| --- | ---: | ---: |
| Input: re-reading context (cache reads) | $2,176 | 64% |
| Input: cache writes | $682 | 20% |
| Input: fresh (uncached) | $2 | 0% |
| Output: reasoning | $429 | 13% |
| Output: tool calls + summaries | $82 | 2% |
| Total | $3,371 | 100% |
Caching is the only thing keeping it sane: without it the same work lists at ~$22,630 (~7×). Your numbers will differ; that's the point. Run it on yours.
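The arithmetic behind a table like this, and behind the no-caching counterfactual, can be sketched as follows. This is an illustration, not the code in cost.py: the `Rates` values are placeholders rather than a quoted price list, the token-category keys are hypothetical names, and cache writes are collapsed into a single rate even though cost.py splits them by TTL.

```python
from dataclasses import dataclass

MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """USD per million tokens. Placeholder values; substitute the real list rates."""
    fresh_input: float = 5.00
    cache_read: float = 0.50
    cache_write: float = 6.25
    output: float = 25.00


def cost_with_cache(tokens: dict, rates: Rates) -> dict:
    """Price each billed category at its own rate and add a total."""
    lines = {
        "cache_read": tokens["cache_read"] * rates.cache_read / MILLION,
        "cache_write": tokens["cache_write"] * rates.cache_write / MILLION,
        "fresh_input": tokens["fresh_input"] * rates.fresh_input / MILLION,
        "output": tokens["output"] * rates.output / MILLION,
    }
    lines["total"] = sum(lines.values())
    return lines


def cost_without_cache(tokens: dict, rates: Rates) -> float:
    """Counterfactual: every input token is billed at the fresh-input rate."""
    all_input = tokens["cache_read"] + tokens["cache_write"] + tokens["fresh_input"]
    return (all_input * rates.fresh_input + tokens["output"] * rates.output) / MILLION


if __name__ == "__main__":
    # Made-up token counts, only to exercise the functions.
    demo = {"cache_read": 900_000, "cache_write": 90_000,
            "fresh_input": 10_000, "output": 50_000}
    print(cost_with_cache(demo, Rates()))
    print(cost_without_cache(demo, Rates()))
```

Because cache reads are priced well below fresh input under these placeholder rates, a workload dominated by cache reads moves sharply upward in the counterfactual, which is the effect the ~7× figure describes.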
Scripts
token_time_breakdown.py: the headline table. Tokens (marked input/output) and wall-clock time per activity (reasoning, running commands, writing tool calls, subagents, summaries, reading/searching, editing) plus the passive-context rows (system prompt + tools, attachments, the typed prompt, injected reminders). One pass, so tokens and time stay consistent. Reasoning isn't stored in plaintext (only an encrypted signature), so it's recovered by subtraction: output − tool_calls − summaries. Time is reconstructed from event timestamps.
cost.py: billed token totals (cache reads / cache writes by TTL / fresh input / output) priced at Opus 4.7 list rates, plus the no-caching counterfactual.
main_vs_sidecar.py: splits the human-driven main thread from spawned subagents (logged under nested */subagents/*.jsonl); reports billed tokens, per-model mix, cache-hit rate, turns per agent (per session for the main thread, per subagent for the sidecar), and cost for each, plus the combined total.
reread_breakdown.py: per-activity cumulative input. Replays each session's context growth to show what each kind of context costs once it's re-read every turn. Reports unique vs re-read tokens per activity (reasoning is the biggest re-read line). The replay is scaled to the measured billed input (exact); the per-activity split is a model.
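Two of the ideas described above, recovering reasoning by subtraction and replaying context growth, can be sketched together. This is a simplified model under stated assumptions, not the logic of reread_breakdown.py: the per-turn keys (`output`, `tool_calls`, `summaries`, `tool_results`) are hypothetical, and everything added during a turn is assumed to stay in context and be re-sent on every later turn of the session.

```python
from collections import Counter


def split_turn(turn: dict) -> dict:
    """Split one turn's new tokens by activity; reasoning is recovered by subtraction."""
    reasoning = max(0, turn["output"] - turn["tool_calls"] - turn["summaries"])
    return {
        "reasoning": reasoning,
        "tool_calls": turn["tool_calls"],
        "summaries": turn["summaries"],
        "tool_results": turn.get("tool_results", 0),
    }


def replay_session(turns: list, billed_input=None):
    """Return (unique, reread) token totals per activity for one session."""
    unique = Counter()
    reread = Counter()
    context = Counter()  # tokens currently in context, by activity
    for turn in turns:
        # Everything already in context is sent again as input for this turn.
        reread.update(context)
        added = split_turn(turn)
        unique.update(added)
        context.update(added)
    reread_total = sum(reread.values())
    if billed_input is not None and reread_total > 0:
        # Scale the modelled re-reads so they sum to the measured billed input.
        scale = billed_input / reread_total
        return dict(unique), {k: v * scale for k, v in reread.items()}
    return dict(unique), dict(reread)


if __name__ == "__main__":
    # Made-up turns, only to show the shape of the result.
    demo = [{"output": 100, "tool_calls": 20, "summaries": 10, "tool_results": 50}] * 3
    print(replay_session(demo))
```

In this model the re-read total grows roughly with the square of the number of turns while unique tokens grow linearly, which is why a long session can bill far more input than it ever produced.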
Caveats
One person's month on one machine: directional, not a benchmark. Claude Code is dynamic, so your split will differ. That's the point: run it on yours.
A generation-time gap also includes the model reading its context...