In typical database systems, users store, update and retrieve information by submitting commands to a database application. To be correctly processed, the commands must comply with the database language that is supported by the database application. One popular database language is known as Structured Query Language (SQL).
A logical unit of work that is comprised of one or more database language statements is referred to as a transaction. When multiple users share a database system, it is impractical to allow only one transaction to execute at a time. However, when many transactions are allowed to execute at the same time, it is possible for the transactions to interact in undesirable ways.
To ensure predictable and reproducible results, techniques have been developed which protect concurrently executing transactions from interacting with one another in uncontrolled ways. Transactions are designed to make database changes in such a way that the database, as a whole, moves from one consistent state to another. As long as the database is consistent when the transaction completes, it is acceptable for the database to temporarily become inconsistent from the viewpoint of a single transaction during the execution of the transaction.
Not only must transactions leave the database in a consistent state upon completion, but transactions must also see the database in a consistent state when they begin. This condition is difficult to satisfy because there may be concurrently executing transactions, each of which may be causing a temporary inconsistency visible only within that single transaction. Consequently, transactions must not be able to see the changes made by concurrently executing transactions until those transactions commit (i.e. terminate and make their changes permanent).
To describe fully consistent transaction behavior when transactions execute concurrently, database researchers have defined a transaction isolation level called "serializability". In the serializable isolation level, transactions must execute in such a way that they appear to be executed one at a time ("serially"), rather than concurrently. Serializable mode prevents interactions between transactions that would preclude them from having the effect on the database as if they had executed in some serial order. In other words, concurrent transactions executing in serializable mode are only permitted to make database changes they could have made if the transactions had been scheduled to execute one after another, in some specific order, rather than concurrently. This mode ensures that transactions move the database from one consistent state to another consistent state by preventing potentially harmful interactions between concurrently executing transactions.
While the degree of isolation between transactions provided by serializability mode seems generally desirable, running many transactions in this mode can seriously compromise the transaction processing throughput of a database system. Complete isolation of concurrently running transactions could mean, for example, that one transaction could not do any inserts into a searchable collection of elements (a "table") that was being queried by another transaction. In short, real-world considerations usually make it necessary to provide isolation levels that compromise between perfect transaction isolation and database performance.
The ANSI/ISO SQL standard (SQL92) defines several levels of transaction isolation with differing degrees of impact on transaction processing throughput. These isolation levels are defined in terms of "phenomena" that must be prevented between concurrently executing transactions in order to achieve the required isolation. A greater degree of concurrency and better performance can be achieved using lower levels of transaction isolation, with the tradeoff that concurrent transactions may sometimes interact in specific ways.
SQL92 defines three "phenomena" that concurrently executing transactions may experience and four levels of isolation that provide increasing protection against these "phenomena". The phenomena include:
1) Dirty read: a transaction reads data that has been written by a transaction that has not yet committed. PA1 2) Non-repeatable (or "fuzzy") read: a transaction re-reads data it has previously read, and finds that another committed transaction has modified or deleted it. PA1 3) Phantom read: a transaction searches the database once. It repeats the same exact search and returns a larger set of data elements. PA1 Dirty write: only one transaction can hold a write lock on a specific data item. To preclude dirty writes, write locks must be held until a transaction commits. Virtually all database systems always prevent dirty writes, in order to support transaction rollback and prevent totally unpredictable and chaotic results. PA1 Dirty read: a transaction cannot acquire a read lock to read a data item locked for write by another transaction. This rule prevents a query from reading uncommitted data. Relaxing this constraint permits a transaction to see uncommitted data, as READ UNCOMMITTED would allow.
Although it is not explicitly required by SQL92, virtually all database products also preclude "dirty writes". A dirty write occurs when a transaction modifies a data item previously changed by another transaction before the transaction that made the first change commits.
The four levels of isolation in SQL92 are defined in terms of which of these "phenomena" a transaction is permitted to experience, as shown Table 1.
TABLE 1 ______________________________________ Non- Dirty Repeatable Phantom Read Read Read ______________________________________ READ Possible Possible Possible UNCOMMITTED READ Not Possible Possible COMMITTED Possible REPEATABLE Not Not Possible READ Possible Possible SERIALIZABLE Not Not Not Possible Possible Possible ______________________________________
While SQL92 defines the isolation levels that a database system should provide, it does not dictate any particular mechanism for implementing those isolation levels. Many database systems use read and write locks to provide transaction isolation. In systems that use read and write locks to provide transaction isolation, a transaction acquires a read lock on data items it intends to read, and write locks on data items it intends to write. Transactions can also acquire read or write locks, called "predicate locks," on a set of data items defined by a search or query condition.
The locks of two transactions on a given data item conflict if either or both are write locks. When a transaction attempts to acquire a lock that conflicts with one held by another transaction, the second transaction must wait until the conflicting lock is released. Generally, once a transaction acquires a lock, it holds the lock until the end (commit or rollback) of the transaction. By allowing a transaction to release a read lock before the end of the transaction, a system can provide more throughput by reducing contention between readers and writers. However, releasing locks before the end of a transaction exposes the transaction to various forms of inconsistency and possible data corruption.
Read-locking systems provide the various isolation levels defined in SQL92 by precluding the "phenomena" as follows:
Repeatable read: a transaction cannot acquire a write lock to change a data item read by another active transaction. This rule prevents a transaction from changing data that another transaction has read and may read again. Relaxing this constraint (by releasing a read lock before the reading transaction commits) means that a transaction can see data change between two reads, or see only some of the changes made by a committed transaction. READ COMMITTED isolation allows non-repeatable reads.
Phantom read: a transaction cannot insert new rows that satisfy an existing predicate read lock held by another transaction. Relaxing this constraint means that if a transaction issues the same query twice, it can see additional rows in the second query result. REPEATABLE READ allows phantom reads.
By preventing all the above "phenomena", read-locking systems provide the SQL92 isolation level SERIALIZABLE.
When isolation levels are implemented in the manner described above, read locks block attempts to write, and write locks block attempts to read, even when the blocked action would not create a prohibited interaction. By blocking actions that do not need to be blocked to provide the requisite isolation levels, the throughput of database systems is unnecessarily diminished.
Based on the foregoing, it is clearly desirable to provide a database system that implements the isolation levels specified in SQL92. It is further desirable to implement the isolation levels in a manner that reduces the number of conditions that result in blocked transactions.