Optimization tales with CockroachDB: the slow logout
Philippe Gaultier
Dark/Light
Body of work
Tags
Resume
⏴ Back to all articles
Published on 2026-06-30. Last modified on 2026-06-30.
Optimization tales with CockroachDB: the slow logout
SQL Optimization CockroachDB
Table of contents
The context
Investigation
But why?
First optimization: conditional write
Second optimization: read committed
One last hurdle: does the row even exist?
Final results
Conclusion
Quick question: What do you do when there is downtime at work? Read the news, tidy your inbox... Hunt for slow SQL queries?
I'm in the last bucket. Emboldened by my recent success in speeding up the password reset flow, where too many rows were scanned, I stumbled upon a query that looked so simple, yet was very slow and did thousands of retries, for an endpoint that is very heavily used... This piqued my interest.
As always, the work is open-source!
This is part 2 of my optimization adventures with CockroachDB. See part 1.
The context
150K retries, 500 ms statement time:
Contrary to part 1, this time, rows scanned and CPU time are completely fine. But there are way too many retries.
This query (and many variations of it, all suffering from the same issue), are executed when logging out a user. A user is logged out by setting active to false on their session row in the sessions table:
Sql
UPDATE sessions SET active = false WHERE token = ?
This is such a simple query. It uses the right index. And yet it misbehaves. Time to investigate!
Investigation
CockroachDB gives us a nice warning on the same page:
Plaintext
Error Code: 40001
Error Message: TransactionRetryWithProtoRefreshError: ReadWithinUncertaintyIntervalError: read at time 1781712863.936725360,0 encountered previous write with future timestamp 1781712863.957518522,0 within uncertainty interval 't
If it looks like gibberish to you... Know that it did to me initially. Let's unpack it slowly:
TransactionRetryWithProtoRefreshError: the transaction was aborted by the server and the client was instructed to retry. We'll see why in a second. This is completely expected and normal behavior in CockroachDB in the default isolation level (Serializable).
ReadWithinUncertaintyIntervalError: read [...] encountered previous write: We tried to read the row containing the session token, in order to write to it, but we encounter a value with a higher (meaning: more recent) timestamp. So we conservatively have to restart from the top: re-read the fresh row data.
But why?
In a nutshell: we do an unconditional write to the row. It might not look like it because there is a WHERE clause. But this WHERE clause is actually only used to find the one row to update (using the session token). Once we have found the row, we write active = false to it every time. Even if active is already false! We do not check active at all. So two or more concurrent writes will all compete on the same row.
Due to the implicit transaction wrapping our update, using the default isolation level SERIALIZABLE (the strictest), we fall victim to transaction contention.
Quoting the docs:
By default under SERIALIZABLE isolation, transactions that operate on the same index key values (specifically, that operate on the same column family for a given index key) are strictly serialized. To maintain this isolation, SERIALIZABLE transactions refresh their reads at commit time to verify that the values they read were not subsequently updated by other, concurrent transactions. If read refreshing is unsuccessful, then the transaction must be retried.
But this is wasted work, because the first write that succeeds is enough: once we have marked the row as active = false, no code ever toggles active back to true, this is a final state. So all subsequent writes to this row should be no-ops, instead of retrying a number of times, and finally succeeding, having achieved nothing!
And then is heard no more: it is a tale told by an idiot, full of sound and fury, signifying nothing.
Macbeth, Act 5, Scene 5
First optimization: conditional write
So, let's make the write conditional: we'll only write to the row if active is true:
Sql
UPDATE sessions SET active = false WHERE active = true AND token = ?
There is no semantic change, and yet, this means way fewer writes (up to 1, now) and fewer retries. Why? Because transaction conflicts in CockroachDB happen in two main cases, write-write and write-read:
CockroachDB's transactions allow the following types of conflicts that involve running into a write intent:
Write-write, where two transactions create write intents or acquire a lock on the same key.
Write-read, when a read encounters an existing write intent with a timestamp less than its own.
We now avoid the write-write scenario, yay! We still have the other case (write-read) that can happen:
Transaction A starts
Transaction B starts
Transaction A writes to the row
Transaction A commits
Transaction B reads from the...