Show HN: Claude Code's $200 plan is a 17× subsidy on the raw API

Hiteshjain1183 pts0 comments

coral-ai/claude-code-token-xray at main · Coral-Bricks-AI/coral-ai · GitHub

claude-code-token-xray

Break down a month of your own local Claude Code logs (~/.claude/projects/*/*.jsonl) to see where the tokens, time, and cost actually go. The scripts read only local logs; nothing is sent anywhere.

What it found (one month of my own logs — 181 sessions, 25,564 model calls):

You don't pay to generate, you pay to re-read. ~29M unique tokens → 4.35B billed (~150×), because every turn re-sends the whole ~173K-token context.

The bill is 84% input / 16% output, and re-reading the same context is 64% of it.

The biggest line is the one you never see: hidden reasoning is 84% of output and ~60% of everything re-read.

~$3,371 for the month at Opus 4.7 list rates. Caching already serves 98% of input, and re-reading is still 64% of the bill.
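The headline ratios follow directly from the figures quoted above. A minimal sketch of that arithmetic, where the $200 plan price comes from the post title and the variable names are purely illustrative:

```python
# Figures quoted in this README (one month of the author's logs).
unique_tokens = 29e6      # ~29M unique tokens
billed_tokens = 4.35e9    # 4.35B billed tokens
model_calls = 25_564
list_cost_usd = 3_371     # month priced at list rates
plan_price_usd = 200      # plan price from the post title

# How many times each unique token is billed on average.
amplification = billed_tokens / unique_tokens

# Average billed tokens per model call; same order as the ~173K context.
billed_per_call = billed_tokens / model_calls

# List-price cost relative to the flat plan price.
subsidy = list_cost_usd / plan_price_usd

print(f"re-read amplification: ~{amplification:.0f}x")
print(f"billed tokens per call: ~{billed_per_call / 1000:.0f}K")
print(f"list cost vs plan price: ~{subsidy:.0f}x")
```

The per-call average is only a sanity check on the context-size claim; it is not how the scripts measure context.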

Full write-up (all the tables, the why, the main-thread-vs-subagent split): coralbricks.ai/blog/claude-code-token-xray

Quickstart

    pip install -r requirements.txt   # just tiktoken
    python3 token_time_breakdown.py
    python3 cost.py
    python3 main_vs_sidecar.py
    python3 reread_breakdown.py

tiktoken is OpenAI's tokenizer, not Claude's, so token proportions are reliable to about ±15%, not Claude-exact. The billed-token counts in cost.py come straight from the API usage blocks and are exact.
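To illustrate what "straight from the API usage blocks" can look like, here is a rough sketch that sums usage counters from JSONL lines. The four counter names are the usage fields of the Anthropic Messages API; the record layout (a `message.usage` object on each line) and the sample lines are assumptions for illustration, not a description of what cost.py actually does.

```python
import json

# Made-up sample lines; real logs live under ~/.claude/projects/*/*.jsonl.
SAMPLE_LINES = [
    json.dumps({"type": "assistant", "message": {"usage": {
        "input_tokens": 12,
        "cache_creation_input_tokens": 900,
        "cache_read_input_tokens": 150_000,
        "output_tokens": 420,
    }}}),
    json.dumps({"type": "user", "message": {"content": "hi"}}),  # no usage
    "not json at all",                                            # skipped
]

USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)

def sum_usage(lines):
    """Add up usage counters across log lines, skipping lines without them."""
    totals = dict.fromkeys(USAGE_FIELDS, 0)
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue
        for field in USAGE_FIELDS:
            totals[field] += usage.get(field) or 0
    return totals

totals = sum_usage(SAMPLE_LINES)
print(totals)
```

Because these counters are reported by the API rather than re-tokenized locally, they do not carry the tiktoken approximation error.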

What a month cost

From cost.py on my logs, priced at Opus 4.7 list rates:

| Line item | Cost | Share |
|---|---|---|
| Input: re-reading context (cache reads) | $2,176 | 64% |
| Input: cache writes | $682 | 20% |
| Input: fresh (uncached) | $2 | 0% |
| Output: reasoning | $429 | 13% |
| Output: tool calls + summaries | $82 | 2% |
| Total | $3,371 | 100% |

Caching is the only thing keeping it sane: without it the same work lists at ~$22,630 (~7×). Your numbers will differ; that's the point. Run it on yours.
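The total, the caching multiplier, and the reasoning share of output can be re-derived from the dollar figures above. A small sketch using only those figures (the dictionary keys are illustrative labels):

```python
# Dollar figures from the cost table above.
input_usd = {"cache reads": 2176, "cache writes": 682, "fresh input": 2}
output_usd = {"reasoning": 429, "tool calls + summaries": 82}
no_cache_usd = 22_630  # quoted no-caching counterfactual

total_usd = sum(input_usd.values()) + sum(output_usd.values())

# How much more the same work would list at without caching.
caching_multiplier = no_cache_usd / total_usd

# Share of output spend that goes to hidden reasoning.
reasoning_share_of_output = output_usd["reasoning"] / sum(output_usd.values())

print(f"total: ${total_usd:,}")
print(f"no-cache multiplier: ~{caching_multiplier:.1f}x")
print(f"reasoning share of output: ~{reasoning_share_of_output:.0%}")
```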

Scripts

token_time_breakdown.py: the headline table. It reports tokens (marked input/output) and wall-clock time per activity (reasoning, running commands, writing tool calls, subagents, summaries, reading/searching, editing), plus the passive-context rows (system prompt + tools, attachments, the typed prompt, injected reminders). Everything is computed in one pass, so tokens and time stay consistent. Reasoning isn't stored in plaintext (only an encrypted signature), so it's recovered by subtraction: output − tool_calls − summaries. Time is reconstructed from event timestamps.
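The two recovery tricks described here, reasoning by subtraction and time from timestamps, can be sketched as follows. The function names, the clamp at zero, and the ISO-8601 timestamp format with a trailing Z are assumptions for illustration; the script itself may differ.

```python
from datetime import datetime

def reasoning_tokens(output_tokens, tool_call_tokens, summary_tokens):
    """Recover hidden reasoning as: output - tool_calls - summaries.

    The visible parts are counted with an approximate tokenizer, so the
    difference can dip below zero; clamping at 0 is an assumption here.
    """
    return max(0, output_tokens - tool_call_tokens - summary_tokens)

def gaps_seconds(timestamps):
    """Wall-clock gaps between consecutive events, in seconds."""
    parsed = [
        # Replace the trailing Z so older Python versions can parse it.
        datetime.fromisoformat(ts.replace("Z", "+00:00"))
        for ts in timestamps
    ]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]

# Made-up example values.
hidden = reasoning_tokens(output_tokens=1000, tool_call_tokens=120, summary_tokens=40)
gaps = gaps_seconds([
    "2025-01-01T10:00:00.000Z",
    "2025-01-01T10:00:12.500Z",
    "2025-01-01T10:00:15.500Z",
])
print(hidden, gaps)
```

Attributing each gap to an activity (reasoning, running a command, and so on) depends on the event types on either side of it, which this sketch leaves out.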

cost.py: billed token totals (cache reads / cache writes by TTL / fresh input / output) priced at Opus 4.7 list rates, plus the no-caching counterfactual.
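A rough sketch of this kind of pricing follows. The base rates in the example are placeholders, not actual list rates, and the cache multipliers (1.25× for 5-minute writes, 2× for 1-hour writes, 0.1× for reads) follow Anthropic's published prompt-caching pricing as I understand it, so check both against the current price list. Treating the no-caching counterfactual as "all input at the base input rate" is also an assumption about how cost.py defines it.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Rates:
    """USD per million tokens for each billed category."""
    input: float
    cache_write_5m: float
    cache_write_1h: float
    cache_read: float
    output: float

def rates_from_base(base_input, base_output):
    # Multipliers are assumptions; verify against the current price list.
    return Rates(
        input=base_input,
        cache_write_5m=base_input * 1.25,
        cache_write_1h=base_input * 2.0,
        cache_read=base_input * 0.1,
        output=base_output,
    )

def cost_usd(tokens, rates):
    """Price each billed category at its own rate."""
    return sum(
        tokens.get(f.name, 0) * getattr(rates, f.name) for f in fields(Rates)
    ) / 1e6

def no_cache_cost_usd(tokens, rates):
    """Counterfactual: every input token billed at the base input rate."""
    input_keys = ("input", "cache_write_5m", "cache_write_1h", "cache_read")
    all_input = sum(tokens.get(k, 0) for k in input_keys)
    return (all_input * rates.input + tokens.get("output", 0) * rates.output) / 1e6

# Made-up token counts and placeholder rates, for illustration only.
tokens = {"input": 1_000_000, "cache_write_5m": 2_000_000,
          "cache_read": 100_000_000, "output": 500_000}
rates = rates_from_base(base_input=10.0, base_output=50.0)
with_cache = cost_usd(tokens, rates)
without_cache = no_cache_cost_usd(tokens, rates)
print(f"with caching: ${with_cache:.2f}, without: ${without_cache:.2f}")
```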

main_vs_sidecar.py: splits the human-driven main thread from spawned subagents (logged under nested */subagents/*.jsonl). It reports billed tokens, per-model mix, cache-hit rate, turns per agent (per session for the main thread, per subagent for the sidecar), and cost for each, plus the combined total.
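One way to picture the split is to classify log files by whether their path has a `subagents` directory component, then compute a cache-hit rate per group. The example paths are hypothetical, and defining the cache-hit rate as cache reads over all input tokens is an assumption rather than the script's confirmed formula.

```python
from pathlib import PurePosixPath

def is_subagent_log(path):
    """True when the log sits under a nested subagents/ directory."""
    return "subagents" in PurePosixPath(path).parts

def cache_hit_rate(usage):
    """Cache reads as a fraction of all input tokens (assumed definition)."""
    reads = usage.get("cache_read_input_tokens", 0)
    total_input = (
        reads
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("input_tokens", 0)
    )
    return reads / total_input if total_input else 0.0

# Hypothetical paths for illustration.
paths = [
    "projects/demo/session-a.jsonl",
    "projects/demo/session-a/subagents/agent-1.jsonl",
    "projects/demo/session-b.jsonl",
]
groups = {"main": [], "sidecar": []}
for p in paths:
    groups["sidecar" if is_subagent_log(p) else "main"].append(p)

rate = cache_hit_rate({
    "cache_read_input_tokens": 98,
    "cache_creation_input_tokens": 1,
    "input_tokens": 1,
})
print(groups, f"{rate:.0%}")
```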

reread_breakdown.py: per-activity cumulative input. It replays each session's context growth to show what each kind of context costs once it's re-read every turn, and reports unique vs re-read tokens per activity (reasoning is the biggest re-read line). The replay total is scaled to the measured billed input (exact); the per-activity split is a modeled estimate.
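The replay idea can be sketched with a toy model: every token added to the context is counted once as unique, then counted again as re-read on each later turn. This sketch assumes the context only grows (no compaction or truncation), and the function names and activity labels are illustrative, not taken from the script.

```python
from collections import defaultdict

def replay(turns):
    """turns: one list per turn of (activity, tokens) added to the context."""
    unique = defaultdict(int)   # tokens counted the first time they appear
    reread = defaultdict(int)   # tokens re-sent as input on later turns
    context = defaultdict(int)  # tokens currently in context, per activity
    for added in turns:
        # Everything already in context is re-sent on this turn.
        for activity, tokens in context.items():
            reread[activity] += tokens
        # New material joins the context for all following turns.
        for activity, tokens in added:
            unique[activity] += tokens
            context[activity] += tokens
    return dict(unique), dict(reread)

def scale_to_measured(reread, measured_total):
    """Rescale the modeled split so it sums to the measured billed input."""
    modeled_total = sum(reread.values())
    factor = measured_total / modeled_total if modeled_total else 0.0
    return {activity: tokens * factor for activity, tokens in reread.items()}

# Made-up three-turn session.
turns = [[("system", 100)], [("reasoning", 50)], [("tool_output", 30)]]
unique, reread = replay(turns)
scaled = scale_to_measured(reread, measured_total=500)
print(unique, reread, scaled)
```

Material added early is re-read on every later turn, which is why long-lived context dominates the re-read total even when it is a small share of unique tokens.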

Caveats

One person's month on one machine: directional, not a benchmark. Claude Code is dynamic, so your split will differ. That's the point: run it on yours.

A generation-time gap also includes the model reading its context...
