A database is an ordered collection of data on which read/write operations can be performed. A database system that handles large volumes of data is generally not confined to a single computing device or even a single data center. Instead, a large database system is typically divided into shards, some of which may be located in one computing device or data center and others in another computing device or data center.
In a database system, certain properties, e.g., consistency, concurrency, atomicity, and durability, are generally desired. Consistency ensures that one client (e.g., a person or computing device) accessing data has the same view of the data as another client accessing the same data at approximately the same time. Concurrency ensures that multiple clients can access the database system at the same time to read/write data. Atomicity ensures that a transaction succeeds only when all actions of the transaction succeed, preventing a partial-state scenario in which some actions succeed while others fail. Durability ensures that changes to the database persist once the transaction is committed. These properties are difficult to guarantee in a database system that has data stored in different shards.
One way existing systems implement consistency on a database system that has data stored in different shards is by using a locking mechanism. The locking mechanism acquires locks on the database rows across the different computing devices in order to perform writes on data corresponding to those database rows. Any subsequent read on those database rows can occur only after the locks have been released, increasing the latency for those subsequent read transactions. Thus, the locking mechanism for implementing consistency involves a tradeoff between consistency and latency for read transactions.
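The consistency/latency tradeoff described above can be illustrated with a minimal sketch. It assumes a toy in-memory store in which every row has its own lock; the names LockedRowStore, write_many, read, and demo are hypothetical, and the sleep merely stands in for the time a write spends coordinating across computing devices. It is not a description of any particular database system.

```python
import threading
import time


class LockedRowStore:
    """Toy in-memory rows, each guarded by its own lock (hypothetical)."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._locks = {key: threading.Lock() for key in self._rows}

    def write_many(self, updates, hold_seconds=0.0, locked=None):
        # Acquire locks in sorted key order so two writers cannot deadlock.
        keys = sorted(updates)
        for key in keys:
            self._locks[key].acquire()
        try:
            if locked is not None:
                locked.set()  # signal that every row lock is now held
            # Assumption: stands in for cross-device coordination time.
            time.sleep(hold_seconds)
            for key in keys:
                self._rows[key] = updates[key]
        finally:
            for key in reversed(keys):
                self._locks[key].release()

    def read(self, key):
        # A read blocks until any writer holding the row lock releases it.
        with self._locks[key]:
            return self._rows[key]


def demo(hold_seconds=0.2):
    """Return the value a blocked reader sees and how long it waited."""
    store = LockedRowStore({"a": 1, "b": 2})
    locked = threading.Event()
    writer = threading.Thread(
        target=store.write_many,
        args=({"a": 10, "b": 20}, hold_seconds, locked),
    )
    writer.start()
    locked.wait()  # start the read only once the writer holds the locks
    begin = time.monotonic()
    value = store.read("a")
    waited = time.monotonic() - begin
    writer.join()
    return value, waited
```

In this sketch the reader never observes a partially applied write, because it cannot obtain the row lock until every row has been updated; the cost is that the read is delayed for roughly as long as the writer holds the locks.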
Some database systems utilize a write-ahead log (“WAL”) to provide atomicity and durability. Such database systems log each action of a transaction to the WAL and execute the actions serially. For example, for a transaction having three actions, the database system would write the first action (e.g., updating A to A′) to the WAL and then perform that action, followed by the second action and finally the third action in a serial fashion. If the third action fails, the database system can recover by replaying actions from the WAL. Using a WAL, however, has drawbacks. For example, read requests must consult the WAL before consulting the database system. Moreover, efficiently distributing a WAL is non-trivial.
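The log-then-apply sequence and the replay-based recovery can be sketched as follows. This is a simplified illustration under stated assumptions: the WAL is a single local file of JSON lines, each action is a key/value update, and the names WalStore, write, recover, and demo, as well as the keys B and C, are hypothetical. A production WAL would typically also record transaction boundaries so that recovery can decide whether to redo or discard a partially logged transaction; that is omitted here.

```python
import json
import os
import tempfile


class WalStore:
    """Toy key/value store that logs each action before applying it."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}

    def _log(self, key, value):
        # Append the action and force it to disk before it is applied.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())

    def write(self, key, value, fail_before_apply=False):
        self._log(key, value)
        if fail_before_apply:
            # Assumption: simulates a failure after logging, before applying.
            raise RuntimeError("simulated failure after logging")
        self.data[key] = value

    def recover(self):
        # Replay every logged action in order; updates are idempotent here.
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, encoding="utf-8") as wal:
            for line in wal:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]


def demo():
    """Fail on the third action, then rebuild state by replaying the WAL."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        store = WalStore(path)
        store.write("A", "A'")
        store.write("B", "B'")
        try:
            store.write("C", "C'", fail_before_apply=True)
        except RuntimeError:
            pass
        before = dict(store.data)  # third action is logged but not applied
        restarted = WalStore(path)  # models the system after a restart
        restarted.recover()
        return before, dict(restarted.data)
```

Because each action reaches the log before it is applied, the restarted store in this sketch can reconstruct all three updates even though the third was never applied by the original store.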