It is an object of a database system to allow many users to use the same information at the same time, while making it seem that each user has exclusive access to all information. The database system should provide this service with minimal loss of performance (latency) and maximal transaction throughput. The service is generally provided by concurrency control mechanisms, but these mechanisms have problems, including coordinating conflicting access to shared resources in a distributed environment, ensuring serial ordering and preventing deadlocks in a distributed environment, and reducing the communication and other overhead required to achieve these ends.
A number of researchers have published taxonomies of concurrency control mechanisms (CCMs), to assist in classification and analysis. The general consensus divides CCMs at a high level into “pessimistic” concurrency control (PCC) and “optimistic” concurrency control (OCC).
Pessimistic schemes control concurrency by preventing invalid use of resources. When one transaction attempts to use a resource in a way that could possibly invalidate the way another transaction has used the resource, PCC schemes cause the requesting transaction to wait until the resource is available for use without potential conflict.
The advantage of PCC is that it reduces the chance that a transaction will have to start over from scratch. Two disadvantages of PCC are that (1) there is an increased chance of unnecessary waiting, and (2) there needs to be a mechanism to detect deadlocks, or cycles of transactions all waiting for each other. In general, PCC works best in environments with a higher likelihood of transaction conflict, and where it is more costly to restart transactions.
Optimistic schemes control concurrency by detecting invalid use after the fact. They optimize the case where conflict is rare. The basic idea is to divide a transaction's lifetime into three phases: read, validate and publish. During the read phase, a transaction acquires resources without regard to conflict or validity, but it maintains a record of the set of resources it has used (a ReadSet or RS) and the set of resources it has modified (a WriteSet or WS). During the validation phase, the OCC examines the RS of the transaction and decides whether the current state of those resources has since changed. If the RS has changed, then the optimistic assumptions of the transaction were proved to have been wrong, and the system aborts the transaction. Otherwise, the system publishes the WS, committing the transaction's changes.
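The read, validate and publish phases can be illustrated with a minimal single-threaded sketch. It is not taken from any particular system: the Store and Transaction classes, and the use of a per-resource version counter to detect changes to the ReadSet, are assumptions made for illustration, and validation plus publication are assumed to run atomically.

```python
class Store:
    """Hypothetical in-memory store that tracks a version counter per key."""

    def __init__(self):
        self.data = {}     # key -> committed value
        self.version = {}  # key -> number of committed writes to that key


class Transaction:
    """Illustrative optimistic transaction with a ReadSet and a WriteSet."""

    def __init__(self, store):
        self.store = store
        self.read_set = {}   # key -> version observed at first read
        self.write_set = {}  # key -> value to publish on commit

    def read(self, key):
        # Read phase: no conflict checks, but remember what was seen.
        if key in self.write_set:
            return self.write_set[key]
        self.read_set.setdefault(key, self.store.version.get(key, 0))
        return self.store.data.get(key)

    def write(self, key, value):
        # Writes stay private until the publish phase.
        self.write_set[key] = value

    def commit(self):
        # Validate phase: abort if any resource in the ReadSet has changed.
        for key, seen in self.read_set.items():
            if self.store.version.get(key, 0) != seen:
                return False
        # Publish phase: make the WriteSet visible and bump versions.
        for key, value in self.write_set.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True
```

In this sketch, if two transactions both read the same key and then both write it, the first to commit succeeds and the second fails validation and must restart.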
The advantages of OCC schemes are that they (1) avoid having a writer wait for a reader in most cases, thereby improving latency and throughput, and (2) avoid the need to implement deadlock detection. The disadvantages are that (1) there is an increased chance of unnecessary restarts and of “starvation” (a condition where a transaction is continually restarted without making progress), (2) validation in a distributed environment is difficult and can lead to deadlocks, and (3) in order to validate a correct serializable order in a distributed environment, validation must occur in two phases—local then global—which slows things down considerably. In general, OCC works best in environments in which there are many more readers than writers, where the likelihood of conflict is low, and the cost of restarting transactions that do experience conflict is acceptable.
Within the general categories of PCC and OCC, there are several major implementation techniques, including: locking, time stamping, multi-versioning, and serialization graph algorithms.
The most common locking scheme is called “strict two phase locking” (2PL). In 2PL schemes, a transaction cannot access or use a resource unless it first acquires a lock. Acquiring a lock gives the transaction permission to use a resource in a given way, for a given period of time. If a transaction cannot acquire a lock, it must wait, or give up. Locks come in a variety of types, each lock granting permission for a different kind of use. Different types of locks may be compatible or incompatible as applied to the same resource. In general, two transactions can both acquire read locks on a given record, but cannot both acquire write locks on the same record. Lock-based schemes provide a conflict table, which clarifies which lock types are compatible. In strict 2PL schemes, transactions hold their locks until they complete. Releasing a lock before completion can improve throughput in some situations, but opens up the possibility of a cascaded abort (where a transaction that previously committed must be rolled back).
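A conflict table of the kind described above can be sketched as a simple lookup. The two lock modes, the LockManager class and its method names are hypothetical, and the sketch reports a refusal rather than queuing the requester; releasing all locks at once models the strict variant, in which locks are held until the transaction completes.

```python
# Conflict table: (mode already held, mode requested) -> compatible?
COMPATIBLE = {
    ("read", "read"): True,
    ("read", "write"): False,
    ("write", "read"): False,
    ("write", "write"): False,
}


class LockManager:
    """Illustrative lock table; a real manager would also queue waiters."""

    def __init__(self):
        self.held = {}  # resource -> list of (transaction, mode)

    def try_acquire(self, txn, resource, mode):
        holders = self.held.setdefault(resource, [])
        for other, other_mode in holders:
            # A transaction never conflicts with its own locks.
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False  # caller must wait or give up
        holders.append((txn, mode))
        return True

    def release_all(self, txn):
        # Strict 2PL: every lock is released together at completion.
        for resource in self.held:
            self.held[resource] = [
                (t, m) for t, m in self.held[resource] if t != txn
            ]
```

Two readers can share a record under this table, while a writer is refused until both readers have completed.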
Lock-based schemes have a variety of disadvantages. First, every attempt to use a resource must first acquire a lock. Most of the time, these locks will prove to be unnecessary; yet acquiring them takes time and uses up memory. Second, in situations where information is cached or replicated at multiple points in a computationally distributed environment, it can be challenging to coordinate locking all the replicas. Third, in a distributed environment where information resources can be physically relocated during transactions, it can be difficult to coordinate accessing the information in its new location with the locks in its old location.
An alternative to lock-based mechanisms is called time stamping (TS). The idea is to serialize transactions in the order in which they start. Time stamp mechanisms build on a “wound wait” (WW) scheme. In TS/WW schemes, when an earlier transaction requests a resource held by a later transaction, the system “wounds” the later transaction, so that the earlier one can proceed. Conversely, when a later transaction requests a resource held by an earlier transaction, the system causes the later transaction to “wait” for the completion of the earlier transaction.
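The wound/wait decision reduces to a single comparison of start time stamps. The function name and the assumption that time stamps are unique integers, with smaller values meaning earlier starts, are illustrative only.

```python
def resolve_conflict(requester_ts, holder_ts):
    """Decide the outcome when a requester wants a resource another holds.

    Assumes unique start time stamps; a smaller value means an earlier start.
    """
    if requester_ts < holder_ts:
        # Earlier transaction requests from a later one: wound the holder.
        return "wound"
    # Later transaction requests from an earlier one: the requester waits.
    return "wait"
```

Because a transaction only ever waits for one that started earlier, a cycle of waiting transactions cannot form under this rule.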
The advantages of TS/WW systems are that they (1) are deadlock-free, (2) avoid the overhead of lock acquisition, and (3) can make local decisions about concurrency control that will be as correct in a global distributed environment as they are in a local central environment. The disadvantages are that (1) by insisting on serializing in start order, they abort otherwise serializable transaction histories, reducing throughput and opening up the possibility of starvation, (2) they are subject to cascaded aborts (a major performance problem) when a later transaction commits before it can be wounded, (3) they have an additional disk space and I/O cost in having to stamp records with the start time of their writer, and (4) comparing time stamps in a distributed environment can be costly with unsynchronized clocks.
Multi-versioning concurrency control (MVCC) utilizes cloned copies of a requested resource. Different copies could be given to different transactions to resolve some types of resource conflicts without waiting. When a writer modifies a resource in MVCC, the system clones a new version of the resource and brands it as belonging to the writer. When a reader requests the same resource, it can be given an appropriate version of the resource. Many systems have built upon the original MVCC scheme. These variations fall roughly into two groups. One group tries to minimize the number of versions, in order to keep down disk storage and I/O requirements. Another group of variations tries to minimize conflicts (maximize throughput) by keeping as many versions as necessary to prevent conflicts.
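How an “appropriate version” is chosen varies between MVCC schemes. The sketch below assumes one common policy, in which a reader receives the newest version installed at or before the reader's own time stamp. The VersionedResource class, its method names and the integer time stamps are hypothetical.

```python
import bisect


class VersionedResource:
    """Illustrative version chain for one resource, ordered by time stamp."""

    def __init__(self):
        self.commit_ts = []  # time stamps, kept sorted ascending
        self.values = []     # values[i] was installed at commit_ts[i]

    def install(self, ts, value):
        # A writer adds a new version instead of overwriting the old one.
        i = bisect.bisect_right(self.commit_ts, ts)
        self.commit_ts.insert(i, ts)
        self.values.insert(i, value)

    def read_as_of(self, ts):
        # Assumed policy: newest version whose time stamp is <= ts.
        i = bisect.bisect_right(self.commit_ts, ts)
        return self.values[i - 1] if i else None
```

A reader with time stamp 15 would see the version installed at 10 even after a writer installs a newer version at 20, so neither party waits for the other.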
In general, the advantages of MVCC schemes are that they (1) allow readers and writers to access the same resources concurrently, without waiting, in most cases, (2) avoid lock overhead much of the time, and (3) avoid the problems of cascaded aborts. The disadvantages are that they (1) require significantly more disk storage and I/O time, and (2) present challenges in efficiently selecting the appropriate version for a given request.
If transactions executed in serial order, concurrency conflicts would never occur. Each such transaction would be the only transaction executing on the system at a given time, and would have exclusive use of the system's resources. A new transaction would see the results of previous transactions, plus its own changes; and would never see the results of transactions that had not yet started. In the real world, transactions execute concurrently, accessing and modifying resources during the same periods of time. Yet sometimes, the concurrent execution of multiple transactions in real-world-time can be equivalent to a serial execution order in virtual-database-time.
Serialization graph algorithms (SGAs) control the concurrent operation of temporally overlapping transactions by computing an equivalent serial ordering. SGAs try to ‘untangle’ a convoluted sequence of operations by multiple transactions into a single cohesive thread of execution. SGAs function by creating a serialization graph. The nodes in the graph correspond to transactions in the system. The arcs of the graph correspond to ordering relationships between transactions in the equivalent serial order. As arcs are added to the graph, the algorithms look for cycles. If there are no cycles, then the transactions have an equivalent serial order and consistency is assured. If a serialization cycle is found, however, then consistency would be compromised if all transactions in the cycle were allowed to commit. In this case, the SGA restores consistency by aborting one or more of the transactions forming the cycle.
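The cycle check at the core of an SGA can be sketched as a reachability test performed before each arc is added. The SerializationGraph class and its method names are hypothetical, and the sketch only reports that an arc would close a cycle; choosing which transaction to abort is left to the caller.

```python
class SerializationGraph:
    """Illustrative serialization graph with cycle detection on insert."""

    def __init__(self):
        self.edges = {}  # transaction -> set of transactions ordered after it

    def _reachable(self, start, target):
        # Iterative depth-first search from start looking for target.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def add_order(self, before, after):
        """Record that 'before' precedes 'after' in the serial order.

        Returns False, leaving the graph unchanged, if the new arc would
        close a cycle, meaning no equivalent serial order would exist.
        """
        if before == after or self._reachable(after, before):
            return False
        self.edges.setdefault(before, set()).add(after)
        return True
```

For example, after recording T1 before T2 and T2 before T3, an attempt to record T3 before T1 is rejected, which signals that one of the three transactions would have to be aborted.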
SGAs can be combined with other mechanisms such as time stamps or multi-versioning (MV-SGA). MV-SGAs, in particular, have many advantages over traditional CCMs. Read-only transactions can operate without read locks and without ever being rolled back. Read-write conflicts can often be resolved without waits, by establishing ordering relationships. Some write-write conflicts, between “pure” writes that do not read the affected data resource (e.g., INSERTs into a relational database table) or between arithmetically commutative operations (e.g., addition/subtraction), can be avoided as well.
Thus, an effective technique for controlling concurrency and ensuring the serializability of database transactions that does not excessively impede overall performance is needed.