In computer systems that are capable of running multiple processes concurrently, the possibility exists for data to be accessed by more than one process at the same time. If not controlled, concurrent accesses to the same data may compromise data integrity. Thus, various mechanisms have been developed to ensure data integrity.
One such mechanism requires each process to obtain a lock on a particular block of data before accessing any data within the block. As used herein, the term "block" refers to any set of data, regardless of whether each item in the set of data corresponds to a particular table, row or object. When a block is locked, only the locking process can access the data within the block. Once the process holding the lock completes, the process releases the lock and another process can access the data by obtaining the lock.
When implementing a locking mechanism, the granularity of the locks has an effect on how the system functions. As a general rule, the smaller the amount of data covered by each individual lock, the less likely that the system will experience "false conflicts". A false conflict may occur, for example, when a first process is holding a lock on a table in order to update row A of the table, while a second process requests a lock in order to update row B of the table and is prevented from doing so because the entire table is locked by the first process. Consequently, most applications running on a multitasking computer system, such as a database application, favor fine-grain locking (e.g., locking individual rows) over coarse-grain locking (e.g., locking entire tables).
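The false conflict described above can be illustrated with a minimal sketch. The function names and the string values for the granularity are hypothetical and chosen only for illustration; the sketch assumes that two updates conflict exactly when they need the same lock.

```python
def lock_key(table, row, granularity):
    """Return the resource that must be locked in order to update `row`."""
    # Under table-level locking every row of the table maps to the same lock.
    return (table,) if granularity == "table" else (table, row)


def conflicts(update_a, update_b, granularity):
    """Two updates conflict when they need the same lock."""
    return lock_key(*update_a, granularity) == lock_key(*update_b, granularity)


first = ("T", "A")   # first process updates row A of table T
second = ("T", "B")  # second process updates row B of table T

print(conflicts(first, second, "table"))  # coarse-grain: the rows differ, yet the locks collide
print(conflicts(first, second, "row"))    # fine-grain: no collision
```

Under table-level locking the two updates collide although they touch different rows, which is the false conflict; under row-level locking they do not.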
However, in many multiprocessing computer systems, only a finite number of locks is available. In such systems, it is possible to run out of locks when multiple processes are performing many updates using fine-grain locks. To prevent running out of locks, many such systems include a lock granularity escalation mechanism that converts fine-grain locks to coarser-grain locks when the number of locks available within the system is low. For example, a process may hold row locks on each of five rows of a particular table. During the lock escalation process, the process would obtain a single lock on the entire table and release the five locks it previously possessed. The released locks increase the number of locks available in the pool of locks used by the various processes in the system. Unfortunately, lock escalation to coarser-grain locks also increases the likelihood of false conflicts, as described above.
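The five-row example can be sketched as follows. This is a toy model, not the mechanism of any particular system: the class `LockPool`, its `capacity` and `low_water_mark` parameters, and the rule that escalation triggers when the free-lock count reaches the low-water mark are all assumptions made for illustration, and conflict checking between processes is omitted.

```python
from collections import defaultdict


class LockPool:
    """Toy lock manager with a fixed number of locks (hypothetical API)."""

    def __init__(self, capacity, low_water_mark):
        self.capacity = capacity
        self.low_water_mark = low_water_mark
        self.row_locks = defaultdict(set)  # (process, table) -> locked row ids
        self.table_locks = {}              # table -> process holding the table lock

    def locks_in_use(self):
        return sum(len(rows) for rows in self.row_locks.values()) + len(self.table_locks)

    def available(self):
        return self.capacity - self.locks_in_use()

    def lock_row(self, process, table, row):
        if self.table_locks.get(table) == process:
            return  # the row is already covered by the process's table lock
        self.row_locks[(process, table)].add(row)
        if self.available() <= self.low_water_mark:
            self.escalate(process, table)

    def escalate(self, process, table):
        """Trade all of a process's row locks on `table` for one table lock."""
        self.table_locks[table] = process     # obtain the coarse lock first
        del self.row_locks[(process, table)]  # then release the row locks


pool = LockPool(capacity=8, low_water_mark=3)
for row in range(5):
    pool.lock_row("P1", "T", row)
print(pool.table_locks, pool.available())
```

With these assumed numbers, the fifth row lock leaves three locks free, which triggers escalation; afterwards the process holds one table lock and seven locks are free again.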
A technique similar to lock escalation, referred to as lock de-escalation, has been used in a lock manager (the "VMS lock manager") developed by Digital Equipment Corporation and described in Adaptive Locking Strategies in a Multi-Node Data Sharing Environment, A.M. Joshi, Digital Equipment Corporation, Database Engineering Group, which is incorporated herein by reference. In the VMS lock manager, a set of objects may be organized into a tree-structured granularity hierarchy. When a process requires access to a resource, the process obtains a "strong lock" at the root of a resource hierarchy. This lock implicitly covers all objects that are descendants of the root object. When there is a conflict at the root of the hierarchy, the process downgrades its lock on the root to a "weak" lock, and obtains a "strong" lock at the next level in the hierarchy tree. This process can continue until either there is no lock conflict or the "strong" lock is at the leaf level of the hierarchy.
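The de-escalation walk described above can be sketched as follows. This is only an illustration of the described behavior and is not the VMS lock manager's interface: resources are modeled as tuples naming the path from the root to the resource, and the function names are hypothetical.

```python
def covers(node, target):
    """A strong lock on `node` implicitly covers every descendant of `node`."""
    return target[:len(node)] == node


def de_escalate(holder_target, requester_target):
    """Return (weak_locks, strong_lock) for the holder after de-escalation.

    The holder starts with a strong lock at the root. While that strong lock
    conflicts with the requester's target and is not yet at the leaf level,
    it is downgraded to a weak lock and a strong lock is taken one level down
    on the path to the holder's own target.
    """
    depth = 1
    weak_locks = []
    strong_lock = holder_target[:depth]
    while covers(strong_lock, requester_target) and strong_lock != holder_target:
        weak_locks.append(strong_lock)       # downgrade this level to a weak lock
        depth += 1
        strong_lock = holder_target[:depth]  # strong lock at the next level
    return weak_locks, strong_lock


holder = ("table", "rows 0-5", "row 0")
print(de_escalate(holder, ("table", "rows 6-11", "row 7")))
print(de_escalate(holder, ("table", "rows 0-5", "row 1")))
```

In the first call the conflict disappears after a single downgrade, so the strong lock stops at the intermediate level; in the second call the strong lock is pushed all the way to the leaf.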
The above referenced escalation and de-escalation techniques are known for tree-structured granularity hierarchies and are typically implemented in conjunction with a locking technique known as "range locking". In range locking, the tree-structured hierarchy works much like a B-tree type index. A distinguishing characteristic of a known range locking scheme (whether or not escalation or de-escalation techniques are employed) is that ranges of resources are locked together in contiguous sets. For example, a range locking scheme would lock all the records in a table when acquiring the broadest scope of lock, whereas the narrowest scope of lock would be a lock on a single record. In between, groups of arbitrary sizes of records may be locked as a "range" with a single lock, as opposed to more fine-grained locks on individual records. In other words, the first ten records of the table may be locked, or the last ten records, or a set of ten contiguous records somewhere in between the first and last records of the table.
FIG. 1 depicts a table 100, wherein the table comprises two columns, a row ID column 102 and a data column 104. Also depicted in FIG. 1 is a tree-type lock 150. The tree-type lock 150 comprises three levels of locks in the hierarchy. The first range 152 represents a coarse-grain lock on all of the records in the table, specifically records identified by row IDs 0 through 11. The second and third ranges, 154 and 156 respectively, represent a finer granularity of lock (but not the finest) which locks row IDs 0 through 5 and 6 through 11 respectively. The finest granularity locks are the row level locks, e.g., locks 158 and 160.
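For readers without the figure at hand, the tree-type lock 150 can be modeled with a small data structure. The class and function names below are hypothetical. Row-level locks are created for every row and left without a reference numeral, because the text does not say which rows locks 158 and 160 correspond to.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LockNode:
    first_row: int
    last_row: int
    numeral: Optional[int] = None  # FIG. 1 reference numeral, where one is given
    children: List["LockNode"] = field(default_factory=list)

    def covers(self, row_id: int) -> bool:
        return self.first_row <= row_id <= self.last_row


def build_fig1_tree() -> LockNode:
    def rows(lo, hi):
        return [LockNode(r, r) for r in range(lo, hi + 1)]

    return LockNode(0, 11, 152, [
        LockNode(0, 5, 154, rows(0, 5)),
        LockNode(6, 11, 156, rows(6, 11)),
    ])


def covering_locks(node: LockNode, row_id: int) -> List[LockNode]:
    """Every lock, coarsest first, whose range contains `row_id`."""
    if not node.covers(row_id):
        return []
    found = [node]
    for child in node.children:
        found += covering_locks(child, row_id)
    return found


tree = build_fig1_tree()
print([(n.first_row, n.last_row) for n in covering_locks(tree, 7)])
# prints [(0, 11), (6, 11), (7, 7)]
```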
A problem with range locking techniques, whether or not they employ lock escalation, lock de-escalation, or both, is that a particular resource on a disk may get a high amount of activity relative to other resources. The resource is, consequently, called a "hotspot". For example, numerous transactions might read and then write to the hotspot frequently. When multiple processes are continually accessing the same hotspot on the disk (and there may be many such hotspots), there may be significant contention for the same set of resources, and the propensity for false conflicts and deadlocks will be high, especially in a range locking escalation or de-escalation scheme.
For example, assume that the records with row IDs 0 and 2 contain data that is frequently updated. A first process that updates the record with row ID 0 might lock range 154, which will lock records with row IDs 0, 1, 2, 3, 4 and 5 in table 100, even though the first process is only updating the record associated with row ID 0. A second, concurrent process may request to lock the record identified by row ID 1. A false conflict results because the first process was implicitly holding a lock on row IDs 0 through 5. In a lock de-escalation scheme, the lock on range 154 would be downgraded to an individual, explicit lock on row ID 0. Consequently, the second process can now obtain a lock on row ID 1.
Because the records associated with row IDs 0 and 2 are hotspots, the need to perform this same downgrade operation may occur nearly every time a lock is obtained on range 154. Performing such downgrades increases the overhead associated with updating records. Thus, it may ultimately be more efficient to forgo any use of range locks rather than suffer the penalty associated with having multiple hotspots covered by the same range lock.
The locality of reference for hotspots is often high, which means that the resources in a hotspot are all in relatively close proximity to each other. In such a circumstance, a tree-type locking escalation or de-escalation scheme is inefficient. Locks will be escalated or de-escalated continually, and the issue of false conflicts and deadlocks will have to be addressed at the further expense of system resources and processing time. Further, a tree-type locking hierarchy usually necessitates a number of different lock modes at a number of levels of granularity. For example, a lock is typically taken at each level of the tree from the root node (e.g., the coarse-grain lock range 152) all the way down to the desired granularity (e.g., range 154 and finally lock 158), which requires multiple locks to isolate a single resource.
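The cost of isolating a single resource in the tree of FIG. 1 can be made concrete with a short sketch. The `LEVELS` layout and the function name are assumptions for illustration; the row ranges follow the description of ranges 152, 154 and 156.

```python
# One list of (first_row, last_row) ranges per level of the tree, root first.
LEVELS = [
    [(0, 11)],                    # coarse-grain range 152
    [(0, 5), (6, 11)],            # ranges 154 and 156
    [(r, r) for r in range(12)],  # row-level locks
]


def locks_for_row(row_id, levels=LEVELS):
    """Locks taken, root first, to isolate one row in the tree scheme."""
    return [
        next(rng for rng in level if rng[0] <= row_id <= rng[1])
        for level in levels
    ]


path = locks_for_row(0)
print(path, len(path))  # prints [(0, 11), (0, 5), (0, 0)] 3
```

In this three-level tree, three locks are needed to isolate one row.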
Thus, there is a need in the art for a more efficient lock escalation and de-escalation technique.