Tokenspeed – How fast is 10 tokens per second really?

tokenspeed — feel LLM tokens-per-second

How fast is 10 tokens per second really?

Modes: c code · t text · h think · a agent

Rate: 30 tok/s

Presets (key: tok/s): 1: 5 · 2: 10 · 3: 20 · 4: 30 · 5: 60 · 6: 100 · 7: 200 · 8: 400 · 9: 800

Keys: space pause · + / − adjust · 1–9 presets · c / t / h / a mode · < / > think length · u custom · n counter

Every local-LLM benchmark reports throughput: "47 tok/s on an M3," "180 tok/s on a 4090," "500 tok/s on Groq." Unless you've actually watched tokens stream at those rates, the numbers are hard to internalize. This is the rendering.

Four modes

code — syntax-highlighted pseudo-code, the most common thing you watch stream out of an LLM.

text — lorem ipsum prose, for the chat/answer case.

think — dim-italic reasoning sentences alternating with code, mimicking a reasoning model thinking out loud.

agent — alternating tool calls and code generation with processing pauses, simulating an AI coding agent.

What to try

Start at the default 30 and read along. Then hit 1 (5 tok/s, Raspberry-Pi-class local model), 5 (60 tok/s, typical hosted Claude or GPT), 7 (200 tok/s, Groq territory), and 9 (800 tok/s, Cerebras-class, where the bottleneck is your eyeballs).
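
To get a feel for these rates outside the page, pacing can be sketched as a fixed delay of 1/rate seconds between tokens. This is a minimal illustration, not the page's own implementation: the names `PRESETS`, `token_delay`, `emit`, and `stream` are hypothetical, and a real model emits tokens in bursts rather than at perfectly even intervals.

```python
import sys
import time

# Preset keys named in the text above (key -> tok/s), plus the default rate.
PRESETS = {1: 5, 5: 60, 7: 200, 9: 800}
DEFAULT_RATE = 30


def token_delay(tok_per_s):
    """Seconds to wait between tokens to hold a given rate."""
    if tok_per_s <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / tok_per_s


def emit(text):
    """Write one token to stdout and flush so it appears immediately."""
    sys.stdout.write(text)
    sys.stdout.flush()


def stream(tokens, tok_per_s=DEFAULT_RATE, write=emit, sleep=time.sleep):
    """Emit tokens one at a time, pausing after each to approximate the rate."""
    delay = token_delay(tok_per_s)
    for tok in tokens:
        write(tok)
        sleep(delay)
```

Calling `stream(["def", " ", "main", "(", ")", ":"], PRESETS[1])` prints one token every 0.2 s. The `write` and `sleep` parameters exist only so the pacing can be checked without actually waiting.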

Now switch between c and t at the same rate. The difference is striking, and intentional: code and prose pack different amounts of visible text into each token.

What counts as a token

This approximates BPE-style tokenization, not any vendor-specific encoder (tiktoken, Claude's tokenizer, etc.; those disagree in the details anyway).

Short words are often one token; longer identifiers split into chunks (processUserInput → process + User + Input); punctuation and operators usually count too.
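
The rules above can be sketched as a small heuristic splitter. This is an assumption-laden illustration, not the tool's actual counter and not real BPE: the function name `approx_tokens`, the regular expressions, and the `MAX_PIECE` length cap are all hypothetical choices made for the example.

```python
import re

# Hypothetical cap: sub-word pieces longer than this are cut into chunks.
MAX_PIECE = 8

# A run of ASCII letters/digits, or any single non-space symbol.
_LEXEME = re.compile(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]")
# Pieces inside an identifier: acronyms ("HTTP"), words ("User"), digit runs.
_PIECE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def approx_tokens(text):
    """Split text into rough token-like pieces (heuristic, not real BPE)."""
    tokens = []
    for lexeme in _LEXEME.findall(text):
        # Symbols have no letter/digit pieces, so they pass through whole.
        pieces = _PIECE.findall(lexeme) or [lexeme]
        for piece in pieces:
            for i in range(0, len(piece), MAX_PIECE):
                tokens.append(piece[i:i + MAX_PIECE])
    return tokens
```

With these rules, `approx_tokens("processUserInput")` returns `['process', 'User', 'Input']`, and `approx_tokens("x += 1;")` returns five tokens because every operator and punctuation character counts on its own.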

Code is more token-dense than prose, so the same tok/s can feel very different depending on what's streaming. The benchmark number is honest; the perceptual effect varies a lot by content type, which is the gap this tool exists to expose.

English prose averages ~1.3 tokens per word, so 30 tok/s ≈ 23 words/s.
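
The conversion is a single division, which makes it easy to apply to other rates. A minimal sketch, taking the ~1.3 tokens-per-word figure above as a rough average rather than a constant; the function names are hypothetical.

```python
# Rough average for English prose, as quoted above; real text varies.
TOKENS_PER_WORD = 1.3


def words_per_second(tok_per_s, tokens_per_word=TOKENS_PER_WORD):
    """Convert a token rate into an approximate word rate."""
    return tok_per_s / tokens_per_word


def seconds_to_stream(n_tokens, tok_per_s):
    """Time needed to stream n_tokens at a steady rate."""
    return n_tokens / tok_per_s
```

For example, `words_per_second(30)` is about 23.1, matching the figure above, and `seconds_to_stream(300, 30)` is 10.0 seconds.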
