When a database system executes multiple transactions concurrently, the transactions may interfere with each other and produce an incorrect result. To isolate transactions from each other and prevent such interference, database systems implement some form of concurrency control mechanism. There are several different levels of isolation; for instance, the American National Standards Institute (ANSI)/International Organization for Standardization (ISO) Structured Query Language (SQL) standard defines four isolation levels: serializable, repeatable read, read committed, and read uncommitted. When a set of transactions runs under the highest isolation level, serializable, the database system ensures that the result is the same as would be obtained if the transactions ran serially, one at a time, in some order. As a result, application developers do not need to be concerned that inconsistencies may creep into the database because transactions execute concurrently. Lower isolation levels prevent certain types of interference, but allow other types.
Conventionally, strict two-phase locking (S2PL) and various enhancements such as escrow locking and multigranularity locking have been used for concurrency control. However, S2PL and related techniques have several drawbacks: they can be expensive because of the cost of maintaining extra data structures storing locks and they may cause some transactions to block unnecessarily, thereby reducing the overall throughput of the system. Under high concurrency, the lock manager itself may become a bottleneck that limits the total throughput of the system. The overhead of locking can be a significant factor in main-memory database systems where the cost of accessing records is low.
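The blocking behavior described above can be illustrated with a minimal sketch of a lock table with shared ("S") and exclusive ("X") modes. The class name `LockTable`, its methods, and the choice to return `False` rather than actually wait are assumptions made for illustration only; a real lock manager would also queue waiters and detect deadlocks.

```python
class LockTable:
    """Hypothetical, non-blocking lock table for illustration only."""

    def __init__(self):
        # key -> (mode, set of transaction ids holding the lock)
        self.locks = {}

    def acquire(self, txn, key, mode):
        """Return True if the lock is granted, False if the caller would block."""
        held = self.locks.get(key)
        if held is None:
            self.locks[key] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # The sole holder may re-acquire or upgrade from S to X.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.locks[key] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            # Shared locks are compatible with each other.
            holders.add(txn)
            return True
        # Any other combination conflicts, so the requester would wait.
        return False

    def release_all(self, txn):
        """Under strict 2PL, locks are released only when the transaction ends."""
        for key in list(self.locks):
            _, holders = self.locks[key]
            holders.discard(txn)
            if not holders:
                del self.locks[key]


table = LockTable()
table.acquire("T1", "x", "X")            # T1 writes x
blocked = not table.acquire("T2", "x", "S")  # T2's read of x must wait
table.release_all("T1")                  # T1 commits
granted = table.acquire("T2", "x", "S")  # now T2 can proceed
```

In this sketch the reader T2 cannot proceed until the writer T1 finishes, and every access pays for a lookup in the shared table, which is the kind of overhead the paragraph attributes to locking.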
Snapshot isolation (SI) is an alternative approach to concurrency control that takes advantage of multiple versions of each data item. A transaction T running under SI sees the database state produced by all transactions that committed before T started, but sees no effects of transactions that overlap with T. This means that SI does not suffer from inconsistent reads. Transaction T will commit successfully only if none of T's updates conflicts with an update made by a concurrent transaction since T's start. In a database management system (DBMS) using SI for concurrency control, read-only transactions are not delayed by the writes of concurrent update transactions, nor do read-only transactions delay update transactions.
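The two rules in this paragraph, snapshot reads and commit-time conflict checking, can be sketched as follows. This is a minimal single-threaded illustration under stated assumptions: the names `SnapshotStore` and `Txn`, the integer logical clock, and the first-committer-wins check are illustrative choices, not a description of any particular system.

```python
class SnapshotStore:
    """Hypothetical multi-version store: each key maps to committed versions."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), ascending
        self.clock = 0      # logical timestamp of the latest commit

    def begin(self):
        return Txn(self, self.clock)


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts  # the snapshot this transaction reads from
        self.writes = {}          # private, uncommitted updates

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Return the newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.start_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Fail if a concurrent transaction already committed a newer
        # version of any key this transaction wrote.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append(
                (self.store.clock, value))
        return True


store = SnapshotStore()
setup = store.begin()
setup.write("x", 1)
setup.commit()

t1 = store.begin()
t2 = store.begin()
t2.write("x", 2)
t2.commit()              # succeeds: no conflicting concurrent update
seen_by_t1 = t1.read("x")  # still the value from t1's snapshot
t1.write("x", 3)
t1_committed = t1.commit()  # fails: t2 updated x after t1 started
```

Note that reads never wait and never take locks in this sketch; only writers are checked, and only at commit time.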
However, conventional SI allows some non-serializable executions and therefore does not guarantee serializability. In particular, transactions that produce the correct result when run in isolation may, when run concurrently under SI, produce an incorrect result. The specific anomaly that may occur is known in the literature as “write skew”: two concurrent transactions each read data that the other then updates, so that the combined outcome violates a constraint that either transaction alone would have preserved. Depending on the types of transactions received by a given database system and the application(s) that execute against the database, write skew can be handled statically, up front, in the application. However, such proactive, proprietary application-level management is costly in terms of time, know-how, and the additional custom software needed to manage transactions at the front end.
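A small worked example may make write skew concrete. The sketch below assumes a toy snapshot-isolation store (the names `Store`, `Txn`, and `go_off_call`, and the on-call scenario itself, are hypothetical and chosen only for illustration). The intended constraint is that at least one of two doctors stays on call.

```python
class Store:
    """Toy multi-version store, for illustration only."""

    def __init__(self, initial):
        self.clock = 0
        self.versions = {k: [(0, v)] for k, v in initial.items()}

    def latest(self):
        return {k: hist[-1][1] for k, hist in self.versions.items()}


class Txn:
    def __init__(self, store):
        self.store, self.start_ts, self.writes = store, store.clock, {}

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Newest version committed at or before this transaction's snapshot.
        return next(v for ts, v in reversed(self.store.versions[key])
                    if ts <= self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Only write-write conflicts are checked under plain SI.
        for key in self.writes:
            if self.store.versions[key][-1][0] > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions[key].append((self.store.clock, value))
        return True


def go_off_call(txn, doctor):
    # Application rule: leave only if both doctors are currently on call.
    if txn.read("alice") + txn.read("bob") >= 2:
        txn.write(doctor, 0)


def run_concurrently():
    store = Store({"alice": 1, "bob": 1})
    t1, t2 = Txn(store), Txn(store)   # both start from the same snapshot
    go_off_call(t1, "alice")
    go_off_call(t2, "bob")
    t1.commit()
    t2.commit()                       # write sets are disjoint, so both commit
    return store.latest()


def run_serially():
    store = Store({"alice": 1, "bob": 1})
    t1 = Txn(store)
    go_off_call(t1, "alice")
    t1.commit()
    t2 = Txn(store)                   # starts after t1, so it sees alice == 0
    go_off_call(t2, "bob")
    t2.commit()
    return store.latest()
```

In the concurrent run both transactions pass the check against the same snapshot and write different keys, so neither is rejected and no doctor remains on call. In the serial run the second transaction sees the first one's update and leaves its doctor on call.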
One recent algorithm for achieving serializable SI concurrency control maintains two Boolean flags in each transaction object. For every transaction T, the flags indicate whether there is a read/write (rw)-dependency from a concurrent transaction to T, and whether there is an rw-dependency from T to a concurrent transaction. However, this algorithm requires a lock manager not only to maintain standard WRITE locks, but also to maintain snapshot isolation READ locks, which introduces potentially significant overhead and may not suit all types of data, applications, and/or databases. The overhead associated with such locking and checking can limit overall throughput in systems with large amounts of data and high transaction concurrency. Accordingly, more streamlined and flexible implementations for achieving serializable snapshot isolation are desired.
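The two-flag bookkeeping described above might be sketched as follows. The names `TxnState`, `in_conflict`, `out_conflict`, and `record_rw_dependency` are hypothetical. The rule that a transaction with both flags set is treated as a candidate for abort is an assumption drawn from the general idea of detecting a transaction that sits in the middle of two rw-dependencies; the paragraph itself does not state how the flags are used, and the lock-manager machinery that would detect the dependencies is omitted here.

```python
from dataclasses import dataclass


@dataclass
class TxnState:
    """Hypothetical per-transaction object holding the two Boolean flags."""
    name: str
    in_conflict: bool = False   # an rw-dependency from a concurrent txn to this one
    out_conflict: bool = False  # an rw-dependency from this txn to a concurrent one

    def is_abort_candidate(self):
        # Assumption for illustration: both flags set marks the transaction
        # as potentially part of a non-serializable execution.
        return self.in_conflict and self.out_conflict


def record_rw_dependency(reader, writer):
    """Record that `reader` read a version that concurrent `writer` overwrote."""
    reader.out_conflict = True   # dependency leaves the reader
    writer.in_conflict = True    # dependency arrives at the writer


t1, t2, t3 = TxnState("T1"), TxnState("T2"), TxnState("T3")
record_rw_dependency(t1, t2)   # T1 --rw--> T2
record_rw_dependency(t2, t3)   # T2 --rw--> T3
candidates = [t.name for t in (t1, t2, t3) if t.is_abort_candidate()]
```

Here only T2 has both an incoming and an outgoing rw-dependency. The flags themselves are cheap; the cost the paragraph points to lies in the READ locks needed to discover the dependencies in the first place.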
The above-described deficiencies of today's transaction concurrency control techniques are merely intended to provide an overview of some of the problems of conventional systems, and are not intended to be exhaustive. Other problems with conventional systems and corresponding benefits of the various non-limiting embodiments described herein may become further apparent upon review of the following description.