Root cause: ollama pull stuck at 99% — the 2-year-old bug · GitHub
alvinttang/ollama-99-rootcause.md
Created<br>May 31, 2026 15:14
The 2-year-old bug behind "ollama pull stuck at 99%"
If you've used ollama for any length of time, you've probably hit this:
pulling 9b6d12fa8910... 99% ▕████████████████████▏ 6.9 GB
…and then it just sits there. Your bandwidth is fine. The server is fine. The TCP connection looks alive. But ollama is wedged. After 30 minutes you Ctrl-C, run ollama pull again, and it finishes the last 1% in 3 seconds.
That's issue #1736 — 124 comments, 82 reactions, open since 2023. The world's most popular local-LLM runner has a download bug that 5,000 people have hit and 0 people fixed.
I spent a weekend on it. The fix is a 5-line change. Here's the root cause.
The maintainer was right — and that's why it stayed unfixed
Two years of "is this a Cloudflare problem?", "is this a Range header bug?", "is this NAT timeout?" — until a maintainer (mxyng) wrote this comment:
Certain parts stall completely and zero data is received from the backend. The connection itself is still healthy so it doesn't trigger a retry.
That's exact and correct. R2 (Cloudflare's object storage) occasionally drops streams: TCP stays connected, no FIN, no RST, just no more bytes. From Go's net/http perspective everything is fine, so the Read() call sits there forever.
This is a server-side problem. Ollama can't fix R2. The maintainer concluded the only real solution was to fix the storage backend.
That conclusion is what kept the bug open for 2 years.
It's wrong, but not for the reason you'd guess.
The actual root cause is in the client
Ollama already has a watchdog. Look at server/download.go:
```go
g.Go(func() error {
	ticker := time.NewTicker(time.Second)
	for {
		select {
		case <-ticker.C:
			if part.Completed.Load() >= part.Size {
				return nil
			}

			part.lastUpdatedMu.Lock()
			lastUpdated := part.lastUpdated
			part.lastUpdatedMu.Unlock()

			if !lastUpdated.IsZero() && time.Since(lastUpdated) > 30*time.Second {
				// stall detected: fire errPartStalled
				part.lastUpdated = time.Time{} // reset to zero
				return errPartStalled
			}
		case <-ctx.Done():
			return ctx.Err()
		}
	}
})
```
A goroutine wakes up every second. If the part hasn't made progress in 30 seconds, it fires errPartStalled, which triggers a retry on a fresh connection. This is exactly the right defensive code for the bug mxyng described.
So why doesn't it work?
Look at the guard condition: !lastUpdated.IsZero() && time.Since(lastUpdated) > 30*time.Second.
lastUpdated is set to a real time by the Write() method — i.e., only after the first byte arrives. Before any byte arrives, lastUpdated is the zero time. The guard skips the stall check when lastUpdated.IsZero().
This is fine on a fresh connection — there's a brief delay before bytes start flowing, and you don't...