Do we fear the serializable isolation level more than we fear subtle bugs? | by Evgeniy Ivanov | YDB.tech blogSitemapOpen in appSign up<br>Sign in
Medium Logo
Get app<br>Write
Search
Sign up<br>Sign in
YDB.tech blog
YDB is an open-source Distributed SQL Database that combines high availability and scalability and supports strict consistency and ACID transactions
Do we fear the serializable isolation level more than we fear subtle bugs?
Evgeniy Ivanov
9 min read·<br>Jul 2, 2024
Listen
Share
We don’t understand how applications are affected by lower isolation levels.<br>Maybe READ COMMITTED is good enough or maybe people don’t know how dirty their data actually is…<br>Andy Pavlo at SIGMOD 2017
Press enter or click to view image in full size
All you need is ACIDHighlights<br>Database transactions imply ACID properties, where “I” stands for isolation and concurrency control.<br>The (serializable) isolation property ensures that the result of concurrently executed transactions is the same as if they had been executed in some serial order.<br>Maintaining the serializable isolation level is not free in terms of performance.<br>Many DBMSs provide weaker isolation levels for performance reasons, leaving it up to the application developer to choose the proper one. Moreover, a weaker isolation level is often the default in monolithic databases, e.g., “read committed” in PostgreSQL and MySQL. Meanwhile, stronger defaults prevail in distributed databases: “repeatable read” in YugabyteDB and TiDB and “serializable” in CockroachDB and [YDB]YDB.<br>Weaker isolation levels might cause subtle concurrency bugs. These bugs might introduce security vulnerabilities.<br>Millions of dollars were stolen, particularly from BTC exchanges, because of concurrency bugs in database-related application logic. We will provide details on multiple cases in the next sections.<br>In this post, we try to answer two important questions:<br>Do weaker isolation levels cause concurrency bugs in real applications often enough?<br>Is the performance penalty of the serializable isolation level reasonable, or is it overrated?<br>After answering these questions, we concluded that using weaker isolation levels by default is a form of premature optimization and evil. Consider using serializable isolation as the default unless your DBMS is CockroachDB or YDB, where serializable isolation is already the default.<br>Isolation levels subtleties<br>Suppose a table with a single column named “color” contains strings “black” or “white”. One user wants to change all “white” colors to “black”, while another user concurrently tries to change “black” to “white”. In other words, there are two concurrent transactions:<br>--- Transaction 1 Transaction 2<br>UPDATE t SET color = 'black' UPDATE t SET color = 'white'<br>WHERE color = 'white'; WHERE color = 'black';What would be the result of these two transactions? Intuitively, all colors should either become black or white. But in database practice, the proper answer is “it depends on isolation level.”<br>Usually, when we say “transaction,” we suppose that the transaction meets the ACID safety properties:<br>A tomicity: either all parts of the transaction are committed or all parts are aborted. Martin Kleppmann suggests calling this property “Abortability” because it reflects the meaning more accurately and avoids confusion between atomic commit/abort and atomic visibility.<br>C onsistency: historically added for a better-sounding acronym and rather application-specific than DBMS-specific.<br>I solation: concurrently executed transactions are isolated from each other. The results of transaction execution should look like the transactions have been executed serially, one by one.<br>D urability: committed data is never lost.<br>While “Isolation” is supposed to originally mean “serializability”, there are weaker [isolation levels](https://en.wikipedia.org/wiki/Isolation_(database_systems) introduced as a trade-off between performance and safety:<br>Read uncommitted.<br>Read committed.<br>Repeatable read.<br>Serializable is the default isolation level at least since the SQL:1999 standard, including its recent version SQL:2023 (ISO/IEC 9075:2023). It is also the default isolation level in CockroachDB and YDB. However, many database vendors use weaker isolation levels by default, in particular:<br>“Read committed” in PostgreSQL and Oracle.<br>“Repeatable read” in MySQL/InnoDB (there is a subtlety, see below) in YugabyteDB.<br>Moreover, database vendors provide their own confusing naming conventions. For example, according the documentation “repeatable read” in MySQL/InnoDB provides its guarantees only to read-only transactions. That is why Hermitage notes that “repeatable read” in MySQL/InnoDB is rather “read committed” (“monotonic atomic view”). And Oracle’s “serializable” is actually not serializable, but “repeatable read (snapshot isolation)” (application developers have easy ways to work around this). For details, check this slightly old post or its newer 2022 revision and Hermitage’s page dedicated to Oracle....