Which ANSI SQL Isolation Level is Required to be ACID Compliant (or Does MongoDB Actually Support…

5 min read Original article ↗

Adam Prout

Many databases make loud claims about supporting ACID transactions, especially some NoSQL systems which traditionally only supported weaker transactions. This raises the obvious question: How do we tell if a system is actually ACID when ACID doesn’t have a precise formal definition? One approach is public testing along the lines of Jepsen, but this testing is not completely exhaustive (its not an official certification of ACID, but Jepsen does find many bugs). In this post, I want to talk about something a bit more specific that I think we can easily test for, namely which ANSI SQL isolation level does a database need to support to call itself ACID? For example, if a system supports snapshot isolation level and not serializable, can it claim to support ACID transactions?

The original properties for fault-tolerant databases that formed the early foundations of ACID were introduced by the legendary Jim Gray (a Turing award winner) and other early database researchers and practitioners in the 1970s and early 1980s. This is well before isolation levels were standardized by ANSI in SQL-92 in 1992. A paper by Haerder and Reuter in 1983 was the first to put many of Grays idea’s together and coin the acronym ACID: Principles of transaction-oriented database recovery. The paper describes isolated transactions (the I in ACID) as ensuring events within concurrent transactions are hidden from each other. The authors don’t precisely define “hidden” in the paper which is more focused on the techniques for building databases with atomic and durable transactions. Gray in his earlier 1981 paper on System R (predating the ACID paper by a few years) defines a consistent transaction as one that “occurs as though it had executed on a system which sequentially executes only one transaction at a time”. This is a good high level description of what would eventually be called the ANSI “serializable” isolation level. This statement by Gray is our first hint as to which isolation level ACID requires. The ANSI SQL isolation level definitions themselves make no mention of ACID. Isolation levels are more precisely defined then ACID, but are still open to some interpretation. I’ll defer to the best RDMS summary paper of all time for how ACID relates to isolation levels: “Architecture of a Database System” authored by Hellerstein, Stonebraker (another Turing award winner) and Hamilton. The paper is a little aged at this point but is still one of the best ways to learn about SQL database design and implementation. It claims that the “I” in ACID (Isolation) is only really achieved by serializable isolation level (sections 6.1 and 6.2). So, any database that doesn’t support ANSI serializable isn’t ACID compliant. In practice, no mainstream database runs with serializable transactions by default (its deadlock or abort prone under high concurrency). Some databases have different definitions of serializable, but all mainstream SQL databases do support it (Postgres, MySQL, Oracle, SQL Server, etc.). I suppose it’s unsurprising that ACID transactions require the strongest ANSI Isolation level. Its difficult to argue with the opinions of two Turing award winners.

Back to the question I posed earlier; what about databases that only support snapshot isolation (SI) level? SI executes a transaction on a consistent snapshot of the database and will abort it if its involved in a write-write conflict (i.e., if two transactions try to write to the same row concurrently). The problem with snapshot isolation level is that it has some reasonably edge case concurrency anomalies that serializable doesn’t have. The most well-known of these is write skew (which can sometimes even involve read-only transactions). For this reason, I would argue snapshot isolation level is not enough to claim ACID because its weaker then serializable. The I in ACID is violated by write skew.

So, databases like MongoDB that support snapshot isolation level but not serializable are not ACID compliant. SI is a good isolation level, but it doesn’t fully isolate concurrent transactions (the I in ACID) from each other. Transactions using weaker isolation levels (read committed, repeatable read, snapshot) are good for improving concurrency, but allow anomalies the app developer needs to be aware of. Mongo is not the only party guilty of miss using ACID — Couchbase also claims ACID with even weaker transaction support (they appear to support monotonic atomic view which is a bit stronger then read committed, but weaker then repeatable read). I get why database vendors miss use ACID. The average user of a databases has a fuzzy definition of what it means (not losing committed data, transactions being all or nothing, etc) and the vendors are “close enough” to ACID that customers won’t call them out on their not entirely accurate claims. My suggestion is that Mongo/others should be more transparent with their customers that they don’t have serializable isolation level so they don’t have ACID transactions (maybe they should try to make AC*D transactions a new acronym?).