William: a tiny poetry model in the browser · Akshit Kumar's index.html
21 Jun 2026
William is a tiny local language model I trained to write short poems. The model on this page is loaded by the browser and sampled locally, one token at a time. There is no server endpoint behind the button.
The title line is editable. William tokenizes it in the browser before generating the poem.
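To make "sampled locally, one token at a time" concrete, here is a minimal sketch of an autoregressive sampling loop. It is written in Python for readability and is not the page's actual code: `toy_forward`, the temperature value, the end-of-sequence id, and the 5-token vocabulary are all hypothetical stand-ins. Only the 256-token context window comes from the model description.

```python
import math
import random

CONTEXT_WINDOW = 256  # context length stated in the post


def sample_next(logits, temperature=0.8, rng=random):
    """Sample one token id from raw logits using a temperature-scaled softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(forward, prompt_ids, eos_id, max_new_tokens=64,
             temperature=0.8, rng=random):
    """Run the model, sample one token, append it, and repeat."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        window = ids[-CONTEXT_WINDOW:]  # keep only what fits in the context
        next_id = sample_next(forward(window), temperature, rng)
        if next_id == eos_id:  # stop at the (assumed) end-of-sequence token
            break
        ids.append(next_id)
    return ids


def toy_forward(ids):
    """Hypothetical stand-in model: strongly prefers (last id + 1) % 5."""
    logits = [0.0] * 5
    logits[(ids[-1] + 1) % 5] = 50.0
    return logits


if __name__ == "__main__":
    # The prompt ids stand in for the tokenized title line.
    print(generate(toy_forward, [0], eos_id=4))
```

In the browser, `forward` would correspond to one inference call on the downloaded model, and the prompt ids would come from tokenizing the editable title.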
William is a small decoder-only transformer: 6 layers, a hidden size of 384, 6 attention heads, and a 256-token context window. I trained it locally with MLX on Apple Silicon.
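The stated dimensions are enough for some back-of-the-envelope arithmetic. The sketch below derives the per-head dimension and a rough weight count. The vocabulary size, the 4x feed-forward width, and tied input/output embeddings are assumptions, since none of them is given here. Biases, layer norms, and positional parameters are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    n_layers: int = 6    # from the post
    d_model: int = 384   # from the post
    n_heads: int = 6     # from the post
    context: int = 256   # from the post
    mlp_ratio: int = 4   # ASSUMPTION: conventional 4x feed-forward width


def head_dim(cfg):
    """Each attention head works on an equal slice of the hidden size."""
    return cfg.d_model // cfg.n_heads


def approx_params(cfg, vocab_size):
    """Rough weight count; vocab_size is NOT stated in the post."""
    attn = 4 * cfg.d_model ** 2                 # Q, K, V, and output projections
    mlp = 2 * cfg.mlp_ratio * cfg.d_model ** 2  # up and down projections
    blocks = cfg.n_layers * (attn + mlp)
    embeddings = vocab_size * cfg.d_model       # ASSUMPTION: tied embeddings
    return blocks + embeddings


if __name__ == "__main__":
    cfg = Config()
    print(head_dim(cfg))                         # 64
    print(approx_params(cfg, vocab_size=8192))   # hypothetical vocabulary size
```

The transformer blocks alone come to roughly 10.6 million weights under these assumptions; the embedding table adds more depending on the real vocabulary.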
The training pipeline had two stages. First, the model learned general poem-shaped text from the biglam/gutenberg-poetry-corpus line corpus, after I filtered out Project Gutenberg boilerplate, headers, editorial apparatus, prose-like blocks, and non-English fragments. Then I fine-tuned it on title/body poem pairs from suayptalha/Poetry-Foundation-Poems, with extra filtering for rows that were too long or too prose-like for the short context window. I also ran prism-ml/Bonsai-8B-mlx-1bit locally as a grading model to help reject low-fitness fine-tuning rows and audit pretraining artifacts.
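The fine-tuning filter described above might look something like the following. This is a guess at the general shape, not the actual pipeline: the average-line-length heuristic, its 60-character threshold, and the `count_tokens` callable are hypothetical, and the whitespace tokenizer in the usage example is only a stand-in for the real one.

```python
def looks_prose_like(body, max_avg_line_chars=60):
    """Hypothetical heuristic: a long average line length suggests prose."""
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    if not lines:
        return True
    return sum(len(ln) for ln in lines) / len(lines) > max_avg_line_chars


def keep_row(title, body, count_tokens, max_tokens=256):
    """Keep a title/body pair only if it fits the context and reads like verse."""
    if not title.strip() or not body.strip():
        return False
    if count_tokens(title) + count_tokens(body) > max_tokens:
        return False  # too long for the 256-token context window
    return not looks_prose_like(body)


if __name__ == "__main__":
    def whitespace_count(text):
        # Stand-in tokenizer; the real one is not described in the post.
        return len(text.split())

    rows = [
        ("Dawn", "The light comes slow\nover the hill"),
        ("Essay", "word " * 30),
    ]
    kept = [r for r in rows if keep_row(r[0], r[1], whitespace_count)]
    print([title for title, _ in kept])
```

A grading model, as used here for low-fitness rows, would sit after a cheap filter like this so that only plausible candidates need to be scored.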
For this page, the MLX checkpoint was converted to ONNX and dynamically quantized to int8. The page downloads that static model file and runs it with ONNX Runtime Web in your browser; the model asset is around 14 MB.
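As an illustration of why int8 shrinks the file, here is a small sketch of symmetric per-tensor quantization: each float weight is mapped to an integer in [-127, 127] plus one shared scale, so a weight takes one byte instead of the four used by float32. This is a simplified illustration of the idea and not necessarily the exact scheme ONNX Runtime applies; the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> (int8 values, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp into the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(q, scale, worst)  # worst-case error stays within half a step
```

The trade-off is a small rounding error per weight, bounded by half the scale, in exchange for a file roughly a quarter the size of its float32 equivalent.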