The present invention relates generally to transaction processing and, more specifically, to supporting high-throughput concurrent transactions without locks.
Transactions are units of change to a database. Transactions arise in real-world situations, such as when a person purchases items at a supermarket checkout, or when a person transfers money between bank accounts. Database management systems support transactions by guaranteeing certain fundamental properties: atomicity (the transaction executes in its entirety and cannot leave partial results); consistency (transactions are rejected if their updates would violate integrity constraints); isolation (transactions operate in a way that appears independent of other concurrent transactions); and durability (the effects of a committed transaction are permanent).
Many users may submit transactions concurrently to a database. If transactions operate on disjoint data, these transactions can proceed safely since there is no interaction between them. However, if two concurrent transactions access a common data item, and at least one of them is writing that data item, then an interaction is possible. The consequences of such an interaction can be serious, including the creation of a database state that could not have arisen had the transactions been executed in some serial order. The well-accepted definition of transaction schedule correctness, known as “serializability,” requires that the database state be equivalent to one that would have resulted from some serial execution. Therefore, database management systems must somehow control the accesses made by transactions to avoid such undesirable interactions between transactions.
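By way of non-limiting illustration, the condition under which two concurrent transactions may interact can be sketched as a test on their read and write sets. The names `Access` and `may_interact` are hypothetical and are introduced here only for illustration; they do not appear elsewhere in this description.

```python
from dataclasses import dataclass, field


@dataclass
class Access:
    """Data items touched by one transaction (hypothetical helper type)."""
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def may_interact(a: Access, b: Access) -> bool:
    """Return True if a and b share a data item that at least one of them writes."""
    # Write-write or write-read overlap in either direction is a potential conflict.
    # Read-read overlap is harmless and is deliberately not checked.
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


# Disjoint data: the transactions can proceed independently.
t1 = Access(reads={"x"}, writes={"x"})
t2 = Access(reads={"y"}, writes={"y"})
disjoint = may_interact(t1, t2)

# Common item "x" written by one transaction and read by the other.
t3 = Access(reads={"x"})
shared = may_interact(t1, t3)
```

Under these assumptions, two transactions that only read a common item are not flagged, which mirrors the requirement that at least one of the accesses be a write.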
There are many well-known concurrency control algorithms in the literature. The simplest method is to run transactions one at a time, but such an algorithm performs poorly because no parallelism is possible. The two-phase locking (2PL) approach locks data items as they are read and written, and forces a lock request to wait if another transaction holds a conflicting lock on the same item. Optimistic concurrency control methods proceed without locking, but record an inventory of data items read and written. A check is made at transaction commit time to see if there may have been any conflicting operations made by recently committed transactions. If so, the transaction is aborted and restarted. In situations where the conflict probability is high, many transactions will be aborted. Yet another concurrency control method relies on timestamps. Data items have associated timestamps, and transactions are allowed to read and write data items only if the timestamp on the item is no later than the timestamp of the transaction. A transaction that violates this requirement is aborted and restarted with a new timestamp. A variant of timestamp-based concurrency control keeps multiple versions of each data item, so that transactions can access older versions of the data items and thus abort less often. Each of these methods has drawbacks, including delays caused by locks and wasted work caused by aborted transactions.
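As a non-limiting sketch of the optimistic approach described above, the commit-time check can be modeled as follows. This is a simplified, single-threaded illustration of conventional optimistic validation, not of the present invention. The class names `OptimisticStore` and `Txn`, the commit sequence counter, and the unbounded commit log are assumptions made for illustration; a practical system would also make validation and installation of writes atomic and would prune the log.

```python
class OptimisticStore:
    """Toy key-value store with commit-time validation (illustrative only)."""

    def __init__(self):
        self.data = {}
        self.commit_seq = 0    # number of transactions committed so far
        self.commit_log = []   # (commit_seq, frozenset of keys written)

    def begin(self):
        return Txn(self, self.commit_seq)


class Txn:
    def __init__(self, store, start_seq):
        self.store = store
        self.start_seq = start_seq  # commits up to this point were visible at start
        self.read_set = set()       # inventory of data items read
        self.writes = {}            # buffered writes, installed only on commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_set.add(key)
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        """Return True on commit, False if the transaction must be restarted."""
        for seq, written in self.store.commit_log:
            # A transaction that committed after we started and wrote
            # something we read invalidates our reads.
            if seq > self.start_seq and written & self.read_set:
                return False
        self.store.commit_seq += 1
        self.store.data.update(self.writes)
        self.store.commit_log.append(
            (self.store.commit_seq, frozenset(self.writes))
        )
        return True


store = OptimisticStore()
a, b = store.begin(), store.begin()
a.write("balance", (a.read("balance") or 0) + 100)
b.write("balance", (b.read("balance") or 0) + 50)
a_ok = a.commit()   # first committer passes validation
b_ok = b.commit()   # b read "balance" before a's commit, so b is aborted

retry = store.begin()
retry.write("balance", (retry.read("balance") or 0) + 50)
retry_ok = retry.commit()
```

In this sketch all of the work done by the second transaction is discarded and repeated, which illustrates the wasted work that aborted transactions cause when conflicts are frequent.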