1. Field of the Invention
The present invention relates to resource management systems, and more particularly to a method and system for reverse scanning of index key value ranges of records in resource management systems.
2. Description of the Related Art
A resource-management system is typically implemented using a computer or a network of computers and a server, having storage capability and appropriate software. A database management system is one type of resource-management system.
A typical database management system includes both database files and index files. The database files store data in the rows and columns of tables stored on data pages. In such a table, the rows may correspond to individual records while the columns of the table represent attributes of the records. For example, in a customer information table of a database management system, each row might represent a different customer, while each column represents different attributes of the customers such as the name of each customer, the amount owed by each customer and the cash receipts received from each customer.
Instead of providing for direct sorting and searching of the records in the tables, the database management system relies on the index files, which contain information or pointers about the location of the records in the tables stored in the database files. The index can be searched and sorted (scanned) much more rapidly than can the database files. An index is scanned through transactions in which criteria are stipulated for selecting records from a table. These criteria include keys, which are the attributes by which the database finds the desired record or records using the index. The actions of a transaction that cause changes to recoverable data objects are recorded in a log. Each log record is assigned a unique log sequence number (LSN) when the record is written to the log.
All data is stored in tables on a set of data pages that are separate from the indexes. All of the indexes for a table contain only the key values and record identifiers (RIDs) of records containing these key values. A RID consist of a data page ID concatenated with a sequence number unique within that range. One common type of index is a B-tree having N levels of nodes or pages. The starting node at the top of the tree is called the root node and defines the interval of key values that the B-tree index covers. In the successive lower levels of nodes before the lowest level of nodes, this key value interval is broken up into key value sub-intervals. Finally, the leaf nodes or pages in the lowest level of the tree contain the individual key values within the interval, together with the associated record (row) identifications (RIDs) that enable the records having those key values as attributes to be located in the tables of the database files. The leaf pages of an index contain entries (keys) each of which is conceptually a {key-value, RID} pair where the RID is treated as if it were an extra key field. A non-unique index is one that may contain more than one key with the same key value. In contrast, a unique index cannot contain more than one key with the same key value. Keys are maintained in an ascending collating order on all key fields, including the RID. Leaf pages alone contain next- and previous-page pointers so that ascending and descending range scans can be supported. Non-leaf pages contain child page pointers. FIG. 17 shows a non-unique B-tree index.
Where a non-unique index contains duplicate instances of a key value, the key value is stored only once on each leaf page. This single value is followed by as many RIDs as would fit on that pagexe2x80x94this is called a cluster of duplicates. On leaf page 3 of the B-tree of FIG. 17, the key value P is followed by the cluster of duplicates 12 and 13. This cluster of duplicates 12 and 13 is notionally considered to be two keys: key  less than P, 12 greater than  and key  less than P, 13 greater than .
Even if the actual data processor time required is very small, some transactions may require a considerable amount of time as data must be retrieved from storage facilities and must be input by the user requesting the transaction. Accordingly, it is important that the database management system permit the data processor to interleave different transactions. A set of transactions are interleaved when some transactions of the set are performed between separated operations of other transactions in the set as multiple users or application programs share common resources. This, however, may lead to problems, as the results that a transaction yields may depend on the way in which the transactions are interleaved, and may change if the transaction is re-executed for any reason.
Consider two transactions regarding the above-described customer information table. The two transactions have operations that are being interleaved. A payment has been received from a customer A and a first record-adjustment transaction is decreasing the accounts receivable attribute and increasing the cash receipts attribute for A by the amount of the payment. Concurrently, a second asset-totaling transaction is calculating the total assets of the company including both total accounts receivable and total cash receipts. Both of these transactions are performed though a number of interleaved operations. Depending on how the operations of the transactions are interleaved, the second transaction may sum the accounts receivable attribute column before the amount paid is removed from the row corresponding to A and may sum the cash receipts column after the amount paid has been added to such row, resulting in the amount aid being summed twice and the total assets of the company being overstated by the amount paid.
The degree to which the results provided by a transaction may differ depending on the manner in which different transactions are interleaved depends on the isolation level of the system. Interleaving different transactions may impact on the results returned by a transaction in the following ways:
Lost updates: Transactions T1 and T2 both read data from the same record (row) and both update the same attribute (column) of such record. If T1 updates the attribute, and then T2 subsequently updates the same attribute based on T2""s read of the record prior to T1 updating the attribute, then T1""s update of the attribute will be lost.
Access to uncommitted data: Transaction T1 updates a value in a database and Transaction T2 reads that value before T1 commits. Subsequently, T1 rolls back. T2""s calculations are based on data that is no longer recorded and is presumably invalid.
Nonrepeatable reads: Transaction T1 reads a row, then performs other operations. After T1 has read the row, Transaction T2 changes the row. If T1 subsequently tries to read this row again, a different result will be returned then from the first time.
Phantom Read: Transaction T1 reads a set of rows based on some search criterion and then performs other operations. After T1 has read the set of rows, Transaction T2 updates existing rows or inserts new rows. T1 subsequently repeats the search and returns results that differ from those originally returned.
Different isolation levels block some or all of these potential problems. Two such different isolation levels are cursor stability (CS) and repeatable read (RR). RR locks all rows an application references within a transaction. No other transactions can update, delete or insert a row that would affect the results of the transaction that requested and received the lock. Accordingly, at RR isolation levels none of the above-described problems can arise. In contrast, at CS isolation levels any row accessed by a transaction is locked while the cursor is positioned on that row. This lock remains in effect until the next row is fetched. However, if any data in the row is changed, then this lock is held until the change is committed. Accordingly, at both the CS and RR isolation levels, only committed data is returned to a specific transaction unless the specific transaction itself has added the data; Further, if all transactions are run at RR isolation levels, then their concurrent executions are serializable in the sense that the same results follow whether the transactions are executed concurrently or serially. However, this is not the case with the CS isolation level as both nonrepeatable reads and phantom reads are possible at this isolation level.
Locks and latches are used to synchronize concurrent activities. Latches are used to guarantee physical consistency of data while locks are used to assure logical consistency of data. Locks are typically invoked by transactions, while latches are invoked by processes-in a single transaction there may be multiple processes. Accordingly, latches are usually held for a much shorter period of time than locks. Acquiring a latch is much cheaper than acquiring the lock as the latch control information is always in virtual memory in a fixed place, which is accessible given the name. In contrast, lock storage is dynamically managed and more instructions are required to acquire and release locks.
Lock requests may be conditional or unconditional. A conditional request means that the requestor is unwilling to wait if the lock is not immediately grantable when the request is processed. An unconditional request means the requestor is willing to wait until the lock becomes grantable. Locks may be held for different durations. An unconditional request for an instant duration lock means that the lock is not to be actually granted but the lock manager has to delay returning the lock call with the success status until the lock becomes grantable. In contrast, longer duration locks are released sometime after they are acquired and typically long before the transaction terminates. Commit duration locks are released only when the transaction terminates.
To provide the desired isolation level while at the same time permitting the concurrent execution of transactions, database management systems may include a lock manager module. The lock manager module maintains a lock table that indicates what resources are being accessed by different transactions. The type of lock maintained will depend on the type of transaction. Thus, the above-described asset-totaling transaction that determines the total assets of a company by totaling the accounts receivable column and the cash receipts column among others will request a share lock S from the lock manager to lock both of these columns as well as the columns of other tables pertaining to the company""s assets. This share lock does not prevent other transactions from accessing the locked resources, but does prevent the locked resources from being changed by, for example, insertion or deletion operations.
In contrast, the above-described record-adjustment transaction that adjusts the accounts receivable and cash receipts attributes of the record for customer A will request an exclusive lock X on at least those attributes of the record that are being changed. While a share lock S is compatible with other share locks, an exclusive lock X is incompatible with both other share locks and other exclusive locks. Accordingly, where the asset-totaling transaction has requested and received a share lock S on all of the attributes in the accounts receivable and cash receipts columns of the customer information table, the record-adjustment transaction""s request for an exclusive lock on the accounts receivable and cash receipts attributes of customer A will be denied, and the updating transaction will have to wait until the asset-totaling transaction releases the share lock on these attributes before being granted the exclusive lock X.
To perform the above-described asset-totaling transaction while maintaining RR, it is not sufficient to simply lock each of the attributes of the account receivable and cash receipts columns individually. For example, consider a case in which new customers are being added to the database. It is preferable that the asset-totaling transaction reflect either all of these new customers, or none of these new customers. Say, however, that the asset-totaling transaction places only individual share locks on the individual attributes in the accounts receivable and cash receipts columns of the customer information table. While the asset-totaling transaction is executing, a record-adding transaction is interleaved in which new customers are added to the customer information table, thereby adding new rows to the table, which new rows include values for accounts receivable and cash receipts. The insertion of these new accounts receivable and cash receipts values will not be impeded by the share lock S on the preexisting values in these columns, and, depending on how these two transactions are interleaved, some of the new accounts receivable and cash receipts entries may be included in the total assets calculated by the asset-totaling transaction, while other cash receipts and accounts receivable values added by the record-adding transaction are not. If the asset-totaling transaction subsequently recalculates the total assets, then it may return a different value, violating RR protocol.
To preserve the RR isolation level, database management system lock entities other than just records. For example, some database management systems may lock an entire file when a range is being scanned in that file. However, this solution significantly reduces the system concurrency. Alternatively, some database management systems employ range locking in which key value ranges are also listed as resources in the lock table. Accordingly, when a transaction includes a range scanning operation, the transaction will request a lock against this key value range before executing the operation. For example, before the above-described asset-totaling transaction scans all of the accounts receivable values, this transaction will request a lock against the entire column containing the accounts receivable values. The record-adding transaction must wait for this lock to be released before adding accounts receivable information regarding the new customer to the customer information table:
In order to lock key-value ranges, database management systems may rely on boundary key locking. Say an accounts receivable read transaction requests the RIDs of all records having accounts receivable attributes that exceed $1000. Before this forward scan starts traversing the index tree, it will not know what locks to obtain as it will not know what key values exist that exceed $1000. Accordingly, the locking for this forward scan must be postponed until a key value exceeding $1000 is found or it is determined that no customer owes more than $1000. As key values over $1000 are determined, share locks on these key values and a lock on the end of the file (EOF) for this index are requested conditionally, while a latch is held on each leaf page involved in the scan. Locks may not be immediately available as some of the key values located may be in an uncommitted state. If so, then the latch on the leaf page containing the key value is released and the lock is requested unconditionally. As each key having a key value over $1000 is eventually locked, the key value of such key is together with the key""s RIDs.
Say locks are obtained on the key values $1200 and $1450 as well as on the EOF for this index. With boundary key locking, these locks are really range locks covering disjoint ranges that are open at their lower ends and closed at their upper ends. The lock on the $1200 covers the range ($1000, $1200]; the lock on the $1450 covers the range ($1200, $1450]; and the lock on the EOF covers the range ($1450, EOF]. Accordingly, if, before this forward scan transaction commits, an insert transaction seeks to insert a key value exceeding $1000, then such key value will fall into one of these three locked ranges and cannot be inserted until the forward range scan commits and the locks are released. Specifically, say the insert transaction seeks to insert the key value $1300. Then the insert transaction will request and be denied an instant exclusive lock on the next key value, $1450, and will have to wait until the forward range scan commits and the locks are released. In contrast, say that before the forward range scan locates the key values exceeding $1000, a delete transaction has deleted the key value $1300, but has not committed. Then such delete transaction will have placed an exclusive lock of commit duration on the next key value $1450, which lock will prevent the forward scan transaction from obtaining a lock on the key value 1450, and thereby delay the forward scan transaction until the delete transaction commits.
No additional searching is required to locate the boundary key or upper bound for a forward scan as the forward scan of the index continues until either a value larger than the stop key (if there is one) or the EOF is reached. Accordingly, boundary key locking may be readily implemented as a means of range locking in forward scans. Sometimes, however, it will be advantageous for a database management system to permit reverse scans to be conducted of a reverse range defined by a selected upper boundary. With conventional database management systems it is necessary to do table scans, thereby losing the advantages associated with scanning an index. Alternatively, the index could be scanned in a forward direction and the results output to a temporary table. This temporary table could then be put in the opposite order and the results returned from that. While permitting the index to be used, this alternative approach is highly inefficient in terms of the number of steps required.
In order to permit reverse scans to be conducted at a selected isolation level, such isolation level must be maintained relative to forward transactions as well as relative to other reverse transactions. Further, such reverse scanning should be generally compatible with forward transactions and should be implemented as efficiently as possible. Thus, an improved database management system that permits reverse range scanning and integrates such scanning with the existing capabilities of database management system is desirable.
An object of one aspect of the present invention is to provide an improved scanning method in a data processing system.
In accordance with an aspect of the invention, there is provided a method executed by a data processor for scanning a reverse range. The scan is conducted in an index for a table having an upper end and a lower end. The reverse range has a start key value for defining the reverse range and the index has a set of keys representing a set of records in record attributes in the table. Each key in the set of keys has a RID designated in a record in the table and a key value corresponding to an attribute of the record in the table. The method comprises the steps of searching the index for a start key, selecting an upper bound of the reverse range, and, depending on whether the upper bound is the first key in the index, determining the contents of the reverse range. In the step of searching the index for a start key, the start key is selected to be a lowest key in the index having a start key value if the start key value is exclusive, and is selected to be a highest key in the index having the start key value if the start key value is inclusive. In the step of selecting an upper bound of the reverse range, the way in which the upper bound is selected depends on whether the start key is inclusive or exclusive, and on where the start key is or is not found relative to the index. Specifically, if the start key is in the index and is inclusive, then the upper end of the index is selected to be the upper bound if the start key value is a highest key value in the index, otherwise, the upper bound is selected to be the next higher key after the start key in the index. If the start key is in the index and is exclusive, then the start key is selected to be the upper bound. If the start key is not in the index and the start key value is lower than the lowest key value in the index, then the upper bound is selected to be a first key in the index. If the start key is not in the index and the lowest key value in the index is less than the start key value, then the upper bound is selected to be the upper end of the table if the start key value is higher than the highest key value in the index, otherwise the upper bound is selected to be the lowest key in the index having a key value exceeding the start key value. If the upper bound is the first key in the index, then the method indicates that the index does not contain any key values within the reverse range. On the other hand, if the upper bound is not the first key in the index, then each key below the upper bound in the index is fetched. Preferably, the method further comprises restricting access to the upper bound during scanning to preserve the selected isolation level.
In accordance with another aspect of the invention, there is provided a computer-program product for an application program for scanning a reverse range. The scan is conducted in an index for a table having an upper end and a lower end. The reverse range has a start key value for defining the reverse range and the index has a set of keys representing a set of records in record attributes in the table. Each key in the set of keys has a RID designated in a record in the table and a key value corresponding to an attribute of the record in the table. The method comprises the steps of searching the index for a start key, selecting an upper bound of the reverse range, and, depending on whether the upper bound is the first key in the index, determining the contents of the reverse range. In the step of searching the index for a start key, the start key is selected to be a lowest key in the index having a start key value if the start key value is exclusive, and is selected to be a highest key in the index having the start key value if the start key value is inclusive. In the step of selecting an upper bound of the reverse range, the way in which the upper bound is selected depends on whether the start key is inclusive or exclusive, and on where the start key is or is not found relative to the index. Specifically, if the start key is in the index and is inclusive, then the upper end of the index is selected to be the upper bound if the start key value is a highest key value in the index, otherwise, the upper bound is selected to be the next higher key after the start key in the index. If the start key is in the index and is exclusive, then the start key is selected to be the upper bound. If the start key is not in the index and the start key value is lower than the lowest key value in the index, then the upper bound is selected to be a first key in the index. If the start key is not in the index and the lowest key value in the index is less than the start key value, then the upper bound is selected to be the upper end of the table if the start key value is higher than the highest key value in the index, otherwise the upper bound is selected to be the lowest key in the index having a key value exceeding the start key value. If the upper bound is the first key in the index, then the method indicates that the index does not contain any key values within the reverse range. On the other hand, if the upper bound is not the first key in the index, then each key below the upper bound in the index is fetched.
In accordance with a further aspect of the present invention, there is provided a data processing system for scanning a reverse range. The scan is conducted in an index for a table having an upper end and a lower end. The reverse range has a start key value for defining the reverse range and the index has a set of keys representing a set of records in record attributes in the table. Each key in the set of keys has a RID designated in a record in the table and a key value corresponding to an attribute of the record in the table. The method comprises the steps of searching the index for a start key, selecting an upper bound of the reverse range, and, depending on whether the upper bound is the first key in the index, determining the contents of the reverse range. In the step of searching the index for a start key, the start key is selected to be a lowest key in the index having a start key value if the start key value is exclusive, and is selected to be a highest key in the index having the start key value if the start key value is inclusive. In the step of selecting an upper bound of the reverse range, the way in which the upper bound is selected depends on whether the start key is inclusive or exclusive, and on where the start key is or is not found relative to the index. Specifically, if the start key is in the index and is inclusive, then the upper end of the index is selected to be the upper bound if the start key value is a highest key value in the index, otherwise, the upper bound is selected to be the next higher key after the start key in the index. If the start key is in the index and is exclusive, then the start key is selected to be the upper bound. If the start key is not in the index and the start key value is lower than the lowest key value in the index, then the upper bound is selected to be a first key in the index. If the start key is not in the index and the lowest key value in the index is less than the start key value, then the upper bound is selected to be the upper end of the table if the start key value is higher than the highest key value in the index, otherwise the upper bound is selected to be the lowest key in the index having a key value exceeding the start key value. If the upper bound is the first key in the index, then the method indicates that the index does not contain any key values within the reverse range. On the other hand, if the upper bound is not the first key in the index, then each key below the upper bound in the index is fetched.