Onhand: An AI tutor that lives on the page you're reading

phineas15001 pts1 comments

Onhand: a reading companion that points

Chrome Web Store live

Store · latest build

Apache 2.0

For Chromium

Built with OpenAI Codex

Onhand Free · Codex · provider keys

A reading companion that points. ☞

Onhand is a contextual AI for learning and research. Ask about the page you already have open. It highlights the line, leaves a note in the margin, and explains it in the side panel, where you are. No second window, no copy-paste.

Add to Chrome

Read the source

Realtime voice tutor · 1 min 36 sec demo

nlp.seas.harvard.edu/annotated-transformer/

The Annotated Transformer

Attention

Attention maps a query and key-value pairs to an output. The model compares the query with each key, then uses the resulting weights to average the values.

We call our particular attention Scaled Dot-Product Attention. Queries and keys have dimension d_k; values have dimension d_v.

Onhand

This paragraph names the Q/K/V setup: compare queries to keys, then average values with the resulting weights.

Figure: Scaled Dot-Product Attention (Q, K → MatMul → Scale → Mask → SoftMax → MatMul with V)

In the implementation, the attention scores are produced by multiplying the query matrix by the transposed key matrix and dividing by sqrt(d_k); softmax then normalizes them into weights.

scores = torch.matmul(query, key.transpose(-2, -1))
scores = scores / math.sqrt(query.size(-1))
p_attn = scores.softmax(dim=-1)
return torch.matmul(p_attn, value)

Softmax turns the scaled scores into attention weights, so the final output is a weighted mixture of the value vectors.
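The excerpt above depends on PyTorch and sits inside a larger function. As a rough illustration of the same three steps (dot products, divide by sqrt(d_k), softmax, then mix the values), here is a minimal standard-library sketch. The names `softmax` and `attention` and the toy inputs are assumptions made for this example; they are not taken from the page or from Onhand.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain lists (hypothetical helper).

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k
    values:  list of vectors of length d_v, one per key
    Returns (outputs, weights), one row per query.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        # Compatibility of this query with every key, divided by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted mixture of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(d_v)
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: the query matches the first key more closely than the second.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
outputs, weights = attention(queries, keys, values)
print(weights[0], outputs[0])
```

In this toy case the first key gets the larger weight, so the output leans toward the first value vector while still blending in some of the second.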

Explain scaled dot-product attention...

On this page · 2 highlights, 1 note

1 We call our particular attention
2 scores = torch.matmul(...)

10:55 PM · Onhand · Page-grounded

Explain scaled dot-product attention and why the scaling is needed; point me to the key code and equation.

Scaled dot-product attention compares every query with every key, divides the scores by sqrt(d_k), applies softmax, then uses those weights to mix the values.

Attention(Q, K, V) = softmax(QK^T / √d_k) V

Queries ask what each token is looking for.

Keys describe what each token can match.

The dot product gives the compatibility score.

Scaling keeps large dimensions from making softmax too sharp.

Softmax converts scores into weights.
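The scaling point can be made concrete. If the components of a query and a key are independent with mean 0 and variance 1, their dot product has variance d_k, so raw scores typically have magnitude around sqrt(d_k). The sketch below uses hand-picked illustrative scores at that magnitude for d_k = 64 (an assumption for the example, not measured data) and compares the largest softmax weight with and without the division.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


d_k = 64

# Illustrative raw dot products on the scale of sqrt(d_k) = 8 (assumed values).
raw_scores = [8.0, 0.0, -8.0]

# The same scores after dividing by sqrt(d_k), as in the formula above.
scaled_scores = [s / math.sqrt(d_k) for s in raw_scores]

raw_peak = max(softmax(raw_scores))
scaled_peak = max(softmax(scaled_scores))

# Without scaling, almost all the weight lands on one key;
# with scaling, the weights stay spread across the keys.
print(f"peak weight without scaling: {raw_peak:.4f}")
print(f"peak weight with scaling:    {scaled_peak:.4f}")
```

A softmax that is nearly one-hot has very small gradients, which is the usual motivation given for dividing by sqrt(d_k).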

Try the citations: each one jumps back to the highlighted page evidence.

The manicule, a small pointing hand in the margin, has meant "look here, this part matters" in books since the twelfth century. Onhand brings the gesture back, with the model holding the pen.

The pointing hand · a brief defense of marginalia

☞What it does

Four things, done in the margin.

Onhand stays out of the way until you ask. When you do, it does the four things a good study partner would do.

Points to the answer

Highlights the exact phrase or paragraph that answers your question, in gold, and scrolls the page to bring it into view. Block-level highlights for whole paragraphs, inline highlights for phrases.

background.js · highlight_text · scroll_to_annotation

Leaves a note in the margin

Drops a sentence or two next to your highlight, written in the article's own voice, so you can keep reading without losing your place.

show_note · pine-barred editorial callout

Replay artifacts

Every session is saved with its highlights, notes, transcript, and a snapshot of the page. Come back tomorrow to the exact state you left.

chrome.storage.local · sessions + screenshots

Voice tutor

Press Voice in the side panel and talk through a paper. The realtime model can see your highlights and the visible text.

★ gpt-realtime-2 · WebRTC · experimental

☞How it works

Three steps. No second window.

01 · OPEN

The page you're already reading.

Wikipedia, an arXiv paper, a Google Doc, a Substack. Anything Chromium can render. Onhand attaches to the active tab.

tab — active

arxiv.org/abs/1706.03762

Attention is all you need · pdf

02 · ASK

A question in the side panel.

Plain language. Onhand reads whatever it needs to ground the answer: the page, the selection, the visible headings.

side panel

Explain scaled dot-product attention and why we divide by √d_k.

03 · POINT

☞Onhand points at the answer.

Highlights the line, drops a margin note, answers in the side panel with citations back to the page. Then saves it all.

on the page

We call our particular attention "Scaled Dot-Product Attention"…

● ONHAND margin note added

☞Install

Onhand is on the Chrome Web Store.

Install the approved Chrome Web Store build in one click. Start with Onhand Free, sign in with OpenAI Codex, or bring a provider API key. The store serves the latest build, which updates through Chrome.

Chrome Web Store · live now

Add Onhand to Chrome.

The approved store build installs normally and updates through Chrome. Use the GitHub ZIP only if you prefer manual installs or need a specific release artifact.

Add to Chrome

Download latest ZIP

Release notes ↗

Build from source instead

terminal · build the extension

# Clone and build
$ git...
