The present invention is directed to resource-management systems and in particular to the locking strategies that they employ.
A resource-management system is typically implemented in a computer, including its various types of storage apparatus, programmed with appropriate software. A common type of resource-managment system, namely, a database-management system ("DBMS"), can be thought of as operating in two phases, the first of which may be called an design phase and the second of which may be called an execution phase.
FIG. 1 represents the design phase. In this phase, the DBMS 10 provides a database designer with ways of defining the "structure" of the data, i.e., the manner in which their storage is organized, and of defining transactions in which the data thus stored will be accessed by ultimate users.
The first function, performed in what we can call a "metadata manager" 12, typically responds to structure-defining (data-definition) commands entered by the designer in a high-level database language such as SQL. A relational DBMS, for instance, may accept a data-definition instruction of the following form:
CREATE TABLE INCOME PA1 NAME CHAR (20), PA1 INC DECIMAL (10,2) PA1 SSN CHAR (9); PA1 CREATE INDEX ON INCOME (INC); PA1 START TRANSACTION PA1 SELECT INC PA1 FROM INCOME PA1 WHERE SSN="123456789" PA1 COMMIT WORK;
Such a statement may establish that the database will include a relation, or table, called "income," in which each tuple, or record, includes three attributes, or fields, including a "NAME" field in the form of twenty characters, an "INC" (income) field of ten decimal digits with a decimal point two places from the right, and an "SSN" (social security number) field in the form of nine characters.
The database designer may use similar statements to define other tables that the database will include. He may further define one or more indexes, whose maintenance he knows will be valuable in the use of the database. For instance, he may employ a command such as:
to create an index of the records ordered by the values of the respective records' income fields.
The result of such commands is to cause entries reflecting them in a database catalog 14 that the DBMS creates and maintains for use in its access operations. Note that none of the foregoing activities provides actual database contents; attributes have been named, but no values of those attributes have necessarily been entered.
We will assume for present purposes that it is the end users who supply the database contents by various manipulating transactions. However, it is the database designer who defines the types of transaction routines that the end users invoke for this purpose. (Actually, a separate application programmer may perform this task. From here on, we will refer collectively to the various personnel, other than the end users, who operate on the database as the "definer.") That is, the definer generates transaction routines, which the DBMS's run-time system will perform in response to requests from end users. To produce these routines, the definer employs a group of DBMS modules to which we will refer as a "query compiler" 16.
The definer may write the transaction definition in a general-purpose language such as PL/1, but he will typically embed in the PL/1 program instructions written in the high-level database language, such as SQL, to which the DBMS responds. This can be done in PL/1, for instance, by preceding, say, SQL commands with "EXEC SQL," which is a signal that the overall PL/1 source code will have to be submitted to a precompiler before PL/1 compilation so as to strip out the SQL statements and replace them with appropriate subroutine calls. The stripped-out SQL statements would be presented to the query compiler, which would compile them into instructions that call upon the services of various operative modules within the DBMS's run-time supervisor.
A transaction-defining statement in the high-level database language may be in the form:
When the routine specified by such a statement runs, it searches the database for a record in which the SSN (Social Security Number) field is 123 45 6789 and fetches the value in the income field of that record.
To compile this command, the query compiler 16 consults the database catalog, which contains the definer's definitions of the data organization, such as that "INCOME" is a relation and INC and SSN are two of its attributes. It also determines the best manner in which to search for the indicated information, and it employs the index information in order to do this. If, for instance, the definer has required that an index ordered by social-security number be provided, then the query compiler produces a transaction routine that accesses the record by way of that index. In such a situation, the SSN attribute is the "key" by which the system finds the desired record (or records).
It should be emphasized at this point that, although we depict the query compiler as operating in the design phase rather than in the execution phase, many of what will be described below as features of the query compiler are only implicit in most DBMS implementations until the execution phase is in progress. In particular, the typical query compiler places in the transaction routine 17 calls to operation subroutines chosen from a subroutine set resident only during the execution phase and commonly called by most transaction routines. Since the contents of these subroutines are all part of the translation from transaction definition to implementation instructions, we consider them features of the query compiler, and the remaining discussion will not distinguish between the functions that the query compiler performs directly during the design phase and those that its design-phase actions perform only indirectly, by calling existing subroutines resident during the execution phase.
Among the capabilities of many DBMSs is that of maintaining what is known as "serializability." In writing transaction definitions, the database definer is defining what will happen when an end user submits a request to the system. One resulting transaction routine, for instance, might be invoked by a user at an automatic teller machine to transfer money from one account to another. Another might be invoked by a bank executive from a management-information-system terminal to request the total of all account balances for a given branch of the bank. In each case, the end user submits his transaction request to a run-time supervisor 18 (FIG. 2), which calls the specified transaction routine. The transaction routine obtains records from a storage medium 19, such as a magnetic disk or cached copies of its contents, through the operation of a buffer manager 20. The actual central-processor time involved in each of these transactions may be very small, but the time required between the beginning and end of the transaction may be considerable, since time is usually required to await data from storage facilities, inputs from the human user, etc. In the operation of most large databases, it is therefore important that the central processor be freed to perform operations of other transactions between the individual operations of any single transaction. But this interleaving can cause problems if steps are not taken to prevent them.
For example, the ATM user's transfer of money from one account to another may actually be implemented in a number of operations, which may be interspersed with inputs from the user. The transfer involves the separate steps of removing money from one account and adding it to another. Similarly, the bank executive's transaction of obtaining account totals actually includes a number of separate operations, each of which involves reading an account balance and adding it to a running total. If these operations are interleaved, the account-totaling operation may copy the balance from the ATM user's first account before the transfer and from the second account after the transfer, and this would indicate to the executive that the bank total is greater than it actually is.
This would not result, of course, if the two transactions occurred serially. One of the functions of most large DBMSs is therefore to perform transactions in such a way that concurrently performed sets of them are serializable, i.e., that their results are the same as that of a series of non-concurrent transactions, without requiring actual serial transaction ordering. To this end, the query compiler usually causes the transaction routine to include certain operations that invoke the services of a DBMS module known as a "lock manager" 21, which is active in the execution phase depicted in FIG. 2 (and actually in the design phase, too, for purposes not relevant here) and maintains a lock table 22, whose contents indicate which "resources" are currently involved in transactions in such a manner that certain operations on them by other transactions must be postponed until the previous transactions have been completed. That is, if a transaction performed by such a routine includes an access to a certain resource, it will also request that the lock manager post a lock in the lock table identifying the designated resource as one to which access is restricted. (We use the more-general term resources instead of records for reasons that will become apparent. Until those reasons do become apparent, however, little harm results from reading "records" for "resources.")
When an end user invokes a transaction routine in the execution phase, that routine will request that the lock manager post a lock on the required resource. The lock request includes a "resource ID," which typically is, say, a hashed or otherwise encoded version of the key value, and an indication of the requested lock's "mode." (As will be explained below, locks can differ in their degrees of restrictiveness.) In response, the lock manager will consult the lock table to determine whether any lock has been posted for that resource ID. If so, it will consult its "compatibility matrix" to determine whether the requested lock is compatible with the already-posted lock, and the transaction that has requested the lock will be at least temporarily prevented from proceeding if the lock modes are not compatible. Otherwise, the lock manager will produce a positive compatibility indication and (except in certain "instant lock" cases that will be described below) enter a lock in the lock table, thereby restricting other transactions' access to the locked resource.
Of particular interest in the present context is what is known as "range locking." The serializability of some types of transactions is not assured by simply locking the several records to which they require access. An example of such a transaction is one that includes a scan operation, which accesses all records in which a certain attribute's values are within a specified range.
For instance, one transaction may be to find the total income within an income range. A concurrent transaction may be to insert records for a group of new taxpayers who first have reportable income within a given period. It is desirable that the computed total reflect either all or none of the new taxpayers, but individual record locking does not insure this result. If the insertion transaction inserts one record in a part of this range that the totaling transaction has already searched and another in a part that it has not, then neither transaction will encounter the other's locks, and the total-computing transaction's result will be erroneous. This results from the "phantom" records that were inserted into a range after that range's original occupants were locked.
To remedy this problem, database management systems must lock entities other than just records. For example, a DBMS may lock a whole file when a scan operation is performed in it. But such an expedient can produce a considerable reduction in the system's achievable concurrency, i.e., in the degree to which operations of separate transactions can be interleaved. Alternatively, therefore, DBMSs sometimes employ range locking, which treats not only individual key values but also key-value ranges as resources to be listed in the lock table. Therefore, when a transaction routine includes an operation directed to all records within a certain target key range, it causes the lock manager to post locks directed not only to the individual key values to which the transaction will obtain access but also to a set of lockable ranges that covers the target range. And, if another operation is directed to a specific key value, that transaction not only requests a lock on that key value but also has the lock manager check for locks on any lockable range into which that key value falls. This latter function is performed by additionally identifying any such range resource and requesting a lock on it.
The routine for inserting records of new taxpayers, for instance, would identify all ranges into which the records that it intends to insert fall, and it would check the lock manager to determine whether there are locks on these ranges. Since the total-computing transaction will have acquired locks on ranges into which the new taxpayer records would otherwise have been inserted, the record-insertion transaction is postponed until the total-computation transaction has been completed, and serializability is thereby maintained.
Note that the use of ranges as lockable resources does not change the lock manager. A conventional lock manager is quite simple. It does not have to "know" what kind of resource a resource ID represents. It needs to "know" only the compatibility matrix and the resource IDs against which it has already posted locks.
We digress at this point to note that locking a key value or range thereof is not the same as locking the record or records that the key value or range designates. The record identified by the name-attribute value John Doe may be the same as that identified by the social-security-number-attribute value 123 45 6789. But the key-value locking performed by an operation that uses John Doe as its key value and derives the resource ID from that value when it acquires a lock on it does not by itself restrict access to that record by an operation that requests a lock on the social-security number. Typically, of course, the operation will also perform further locking, such as record-ID locking, which will cause access to that record from other paths to be restricted. Indeed, one of the keys may in essence be the record ID. But we are not concerned with that here. Here we are concerned only with locking on the basis of a given key or range thereof, so we will refer to key-value or range locking, which may or may not be accompanied by or equivalent to record locking.
Although it is better than locking the whole file, locking ranges of key values can itself reduce concurrency significantly. To lessen this concurrency reduction, DBMSs employ more than one mode of lock, each lock mode differing from the others in restrictiveness. This restrictiveness variation is useful even if the DBMS does not use range locking. For example, a lock acquired by a transaction as a result of an operation that only reads records does not need to prevent other transactions from reading those same records, but a lock resulting from a write operation does. In recognition of this fact, a simple, two-mode locking system may employ lock-table entries that include an indication not only of whether a lock is in place but also of whether the lock is a share-mode ("S") lock, requested for read operations, or an exclusive-mode ("X") lock, requested by record-modifying operations. A transaction requesting an S lock will be deterred only by X locks on the target resource, not by S locks, while transactions that request X locks will be deterred by locks of both types.
Database systems that employ range locking often obtain additional concurrency by means of "multi-granularity locking," or MGL, which conventionally employs five lock modes, as FIGS. 3, 4, and 5 illustrate. FIG. 3 is a lock-mode table, which indicates the types of locks acquired by transactions in accordance with one application of the MGL scheme for the different types of database-access operations of which they may be comprised. In addition to the simple "covering" lock modes S and X, there are three "intention" lock modes, designated IS, IX, and SIX, whose purpose is to indicate, with respect to a key range, that a further, covering lock will be acquired on a key value that falls within that range. (Actually, the SIX mode, as will be seen below, is both a covering lock and an intention lock.)
FIG. 4 is an exemplary lock table that will be employed to explain intention locking. In the example, we will assume that one of the attributes in a relation defined for a given database is labeled "INC," for "income." That is, if the relation is thought of as a table, each row of the II table represents a taxpayer, and one of the columns represents those taxpayers' incomes. Let us further assume that the database designer has required (not necessarily in principle but nearly unavoidably in practice) that an index be maintained of incomes and that he has further specified that the DBMS is to treat certain predetermined ranges of incomes as lockable resources. The reason for this may be that a purpose of the database is to support studies of characteristics of various predetermined income ranges.
Now, suppose that a user calls for performance of a transaction routine for totalling all incomes within the $30,000-$70,000 income range and that the database designer has specified two predetermined ranges of $30,000 to $50,000 and $50,000 to $70,000. The DBMS's run-time supervisor responds to this request by running the requested transaction routine and assigning this instance of that routine's operation the transaction name T.sub.1. Since transaction T.sub.1 will need to read all records within the two ranges but not to update any of them, it requests S locks on the two ranges, as the first row of the FIG. 4 table indicates. The S lock is a covering lock, since it implicitly locks each key value in the range. But the transaction does not explicitly place a separate lock on each key value--i.e., each income level--that it finds, since the query compiler will have so designed any transaction routine that similarly searches for records by income as to cause the lock manager to check the income range into which the indicated income falls.
The "read scan" entry in the FIG. 3 mode table reflects this lock-mode selection. It shows that, upon a scan-type read operation, i.e., one which requests all records within a given range, a lock of the S type is acquired on the range or ranges involved but not on the individual key values.
As FIG. 5 indicates, lock mode S is compatible with only two lock modes, IS and S. The latter compatibility is exemplified by a concurrent transaction ("T.sub.2 "). Transaction T.sub.2 requires access to the S-locked ranges, but only to read them, so it requests S-mode locks of those ranges. The lock manager, which implements the compatibility table of FIG. 5, informs T.sub.2 that the requested lock mode is compatible with the existing (S-mode) locks on those ranges in its lock table. It therefore posts the lock, as the FIG. 4 lock table indicates in the T.sub.2 row.
To observe compatibility of the S mode with the IS mode, let us consider a transaction ("T.sub.3 ") whose purpose is to fetch, say, the identity of the taxpayer whose income is $39,547. This involves a "singleton read" operation, i.e., one that targets an individual key value, not a range, although the key value may fall within a range. As FIG. 3 indicates, the query compiler requires such an operation's transaction to request not only an S lock on the actual key value to which this operation is directed but also an IS lock on any range in which that key values falls. Accordingly, as FIG. 1 shows, T.sub.3 acquires an S lock on income value $39,547 and an IS lock on income range $30,000 to $50,000. The lock manager grants this lock, even though locks have already been acquired on that income range, because the lock manager's compatibility matrix (FIG. 5) indicates compatibility between the IS and S locks.
The IS lock is not a covering lock: the operation that caused it is not directed to the range against which the IS lock is posted. Instead, it is an intention lock: it indicates the intention to acquire a lock on a different resource, namely, a key value, that the range includes (or, for some resources, otherwise overlaps). Its purpose is to prevent another transaction from acquiring a covering lock on the range that would be inconsistent with the first transaction's access to the included (or overlapped) resource.
The difference between the IS and S locks becomes apparent when one considers a new transaction T.sub.4, whose purpose is to correct the income value for the taxpayer whose income is $41,290. This is a "singleton update," and FIG. 3 indicates that the involved range and key value must be locked in modes IX and X, respectively. FIG. 4 shows that no previous transaction has obtained a lock on the key value of $41,290. But that value falls within the $30,000 to $50,000 range, on which previous transactions have already acquired locks, and the mode, S, of two of those locks is incompatible with the IX-mode lock that transaction T.sub.4 requests on that range. Transaction T.sub.4 therefore cannot proceed until the S-lock-holding transactions T.sub.1 and T.sub.2 have ended, either by aborting or by running to completion, i.e., in database parlance, by "committing."
Now, let us suppose that transactions T.sub.1 and T.sub.2 commit. A committing transaction notifies the lock manager, which responds by removing the transaction's entries. FIG. 4 uses checkmarks to represent such removals for transactions T.sub.1 and T.sub.2. The only lock remaining on the $30,000 to $50,000 range after transactions T.sub.1 and T.sub.2 commit is therefore the IS-mode lock acquired by transaction T.sub.3. Remember that transaction T.sub.3 requested only an IS-mode lock on the range, since it did not need to read all of the range but only needed to indicate that it would be reading something within the range. Thus, as FIG. 3 indicates, the IS mode is compatible with the IX mode requested by transaction T.sub.4 on the $30,000 to $50,000 range. After transactions T.sub.1 and T.sub.2 have committed, therefore, transaction T.sub.4 can acquire the locks that it requires before it will proceed.
Note that transaction T.sub.4 acquires a different, X-mode lock on the particular key value, $41,290, to which its singleton update operation is directed. This mode is incompatible with the S-mode lock acquired by T.sub.3, but there is no interference, because the two transactions' S- and X-mode locks are directed to different resources. If T.sub.4 's update operation were instead directed to the income value $39,547, on which transaction T.sub.3 has acquired a lock, the requested locks would not have been granted to T.sub.4 ; if this were not so, T.sub.4 would be allowed, contrary to the serializability requirement, to modify a record that transaction T.sub.3 is reading.
The remaining lock mode in FIG. 5, SIX, has a compatibility combination that is the intersection of those of the S and IX modes: it is compatible with IS, with which S and IX are compatible, but it is not compatible with IX or S, since IX is not compatible with the latter and S is not compatible with the former. This is the mode requested by an "update scan" operation, which scans all records within a target range and updates any that meet a predetermined criterion.
For instance, if the records have as an additional attribute the name of the taxpayer's state, a transaction may be to correct a reporting anomaly by adding a fixed amount to the income field of each Kentucky taxpayer whose income falls within an income range of $35,000 through $60,000. Although the particular requested range is not among the resources that the database has been arranged to lock, the required serializability can be insured by locking the two illustrated ranges, which together include the requested range. This operation is in essence both a read scan and a series of singleton updates. The query compiler therefore causes such transaction routines to request that the ranges involved be locked in the SIX mode, which both prevents all transactions prevented by the S-mode lock and prevents all transactions prevented by the IX-mode lock. As FIG. 3 shows, such transactions additionally request X-mode locks on the individual key values that are in fact updated.
To complete our discussion of this particular type of range locking, we comment on two further details. The first is that the IS-mode range lock in the first row in FIG. 3 is only implicit in some implementations: in those implementations, the transaction does not actually request this lock and the lock manager does not post it. As reference to the compatibility matrix of FIG. 5 reveals, the only locking mode with which mode IS is incompatible is the X mode, which, as FIG. 3 shows, is ordinarily requested only for individual key values, not for ranges. Therefore, so long as the query compiler does not produce transaction routines that request X locks on ranges, a singleton read transaction does not need to acquire a range lock. But some MGL systems may save lock overhead by, say, locking the range in the X mode for insert and delete operations and thereby avoiding the need to acquire a key-value lock. In such arrangements, the IS lock mode would have to be explicit.
Incidentally, if the use of the X mode is to be restricted, as it is in FIG. 3, to key values rather than ranges, then it can be replaced with the SIX mode, which is incompatible with the only other mode used for key values, namely, the S mode.
The second detail is that, although FIG. 3 primarily describes the action of the query compiler and the routines that it generates, while FIG. 5 characterizes the lock manager, certain aspects of FIG. 3 can instead be implemented by the lock manager. Specifically, the first range-lock entries in FIG. 3 for singleton updates, inserts, and deletes are all IX. This is the lock mode that a transaction requests if that same transaction has not previously obtained an S-mode or SIX-mode lock on the same resource. If it has obtained such a lock, on the other hand, it acquires an SIX-mode lock instead of an IX-mode lock. It can be the transaction routine's responsibility to make this substitution, but one usually would incorporate this function in the lock manager instead.
It can be appreciated that range locking of this type eliminates the phantom problem yet affords considerable concurrency. As the last example indicates, however, the use of such definer-specified ranges is somewhat inflexible. Specifically, the target income range mentioned above for the Kentucky anomaly did not fit the predetermined ranges very closely, and locking enough predetermined ranges to cover the target range resulted in less concurrency than would have been possible if there had been a closer fit. Additionally, the approach to MGL just described requires considerable locking overhead, since it requires that two resources, namely, a key range and an individual key value, be separately locked for most operations.
A system that deals with these problems is the ARIES/KVL system described in Mohan, "ARIES/KVL: A Key-Value Locking Method for Concurrency Control of Multiaction Transactions Operating on B-TREE Indexes," Proc. Very Large Databases Conference, Brisbane, Australia (August 1990). This system dynamically re-defines key-value ranges in accordance with the current population of key values. Specifically, the system maintains a key-value-ordered index, and the possible lockable ranges are the ranges between each pair of successive key values that currently exist in the index. That is, if the existing key values are k.sub.1, k.sub.2, . . . , k.sub.i, . . . such that k.sub.i &lt;k.sub.i+1, then the ranges are the disjoint semi-open intervals (k.sub.i, k.sub.i+1 ], and each such range is identified by the upper bounding key value. Equivalently, of course, the range could be identified instead by the lower attribute value, in which case the range would be open at the top rather than at the bottom. Since locking granularity is finer if the ranges are disjoint, it is preferable for one end of the interval to be open. A lock-request resource identifier in this system identifies both a key value and the ranges that the key value represents. This enables a transaction to lock with a single lock request both a range and the key value that represents it.
FIG. 6 lists the lock modes required in the ARIES/KVL system for the various operations that a transaction might include. Unlike FIG. 3, FIG. 6 includes only one column for the first four operations listed, because an ARIES/KVL operation does not separately lock key values and key ranges. The locks listed in FIG. 6 are roughly the more restrictive of the range and key locks required by the system of FIG. 3 for the same operations. FIG. 6 does include a second column for insert and delete operations, but this does not indicate that key values and ranges are locked separately for these operations. Instead, it represents a separate lock on what will be described below as the "next" key value/range. This is necessitated by the fact that inserting and deleting records in a KVL system divides or extends existing key ranges.
Suppose that a record having a key value of k.sub.i is to be deleted. Clearly, one transaction's deletion of a record that has been written, read, or inserted by a second uncommitted transaction potentially compromises the serializability of those transactions. Therefore, an operation that deletes k.sub.i should not be permitted if another transaction has any lock on k.sub.i, regardless of the mode of that lock. A deleting transaction accordingly tests for any such lock by requesting an X-mode lock on k.sub.i. FIG. 6 includes the "instant" notation to indicate that the transaction need only request a compatibility indication and not actually acquire the lock, for a reason that will presently become apparent.
As FIG. 6 shows, the deleting transaction requests a lock not only on the targeted key value/range but also on the "next" key value/range, i.e., on key value/range k.sub.i+1, which includes the range previously represented by the deleted key value k.sub.i. Since that range has now been modified, no access to it should be permitted. The lock acquired on range k.sub.i+1 is therefore an X-mode lock.
For an insertion, on the other hand, the requested locks are not so restrictive. Let us suppose, for instance, that an inserting transaction is to insert a new key value k.sub.i ' between k.sub.i and k.sub.i+1. This will reduce the width of the range previously represented by key value k.sub.i+1, so range k.sub.i+1 must be checked for locks. For this purpose, ARIES/KVL uses an instant lock in the IX mode if the inserting transaction has not previously locked that key/range. Clearly, if that range had been, say, scanned by another uncommitted transaction, as indicated by an S, X, or SIX lock, that range should not be modified by inserting a new record into it. Testing by means of an IX-mode lock prevents this. However, there is no reason why the k.sub.i ' record cannot be inserted by one transaction just because another uncommitted transaction has previously inserted the k.sub.i+1 record, as indicated by a previously existing IX-mode lock. Since the requested IX-mode lock is compatible with an IX-mode lock, such an insert in front of another insert can occur. (In front of and behind are defined by the direction in which key-valued ranges extend: a range extends in front of the key value that represents it.) The inserting transaction requests only an instant lock on k.sub.i+1 because there is no reason why one transaction's insertion of k.sub.i ' should prevent another transaction's access to k.sub.i+1.
It is now apparent why the current-key-value/range lock required by a deleting transaction can be instant: subsequent insertion of a record having the deleted key value is prevented by the delete operation's next-key-value/range lock, which is incompatible with the next-key-value/range lock requested by the insert transaction.
Remember that the lock manager tests for compatibility only with existing locks of other transactions; if an inserting transaction has itself previously acquired an S, SIX, or X lock on the next key value k.sub.i+1, the lock manager does not produce an incompatibility indication even though the IX lock requested for that key is incompatible with those modes. Yet, because an insertion divides the range previously identified by k.sub.i+1, on which the requesting transaction has acquired such locks, provision must be made to insure that the entire previous range remains locked at least as restrictively after the insertion as it was before. Instead of IX, therefore, the lock on inserted key value k.sub.i is in mode X if the locking transaction has previously acquired an S-, SIX-, or X-mode lock on the next key value k.sub.i+1.
As was stated above, the ARIES/KVL system differs from the previously described MGL system. Instead of employing a fixed set of lockable ranges that is independent of the current population of existing key values, it dynamically so redefines the lockable ranges as key values are added and removed so that the currently existing key values bound the currently lockable ranges. This contributes to concurrency, because it tends to result in ranges whose granularity more nearly equals the required granularity.