In computer systems that are capable of running multiple processes concurrently, the possibility exists for resources in the system to be accessed by more than one process at the same time. If not controlled, multiple concurrent accesses to the same resource may compromise the integrity of that resource. As used herein, the term "resource" refers to any object that can be accessed in the system. Examples of resources are files, shared memory regions, database tables or memory blocks.
To illustrate the possible problems inherent in concurrent access to shared resources, consider a multi-process database system. Process 1 begins a transaction that writes a value A to a data object, but prior to the completion of the entire transaction by Process 1, Process 2 performs a read on that data object. Process 1 then performs a further write that places a value B into the data object, and thereafter completes ("commits") the transaction. In this situation, the final version of the data object actually contains value B, not the value A read by Process 2. However, because Process 2 was allowed to read the data object before Process 1 had completed its writes, Process 2 erroneously believes that the data object contains value A. This "dirty read" has introduced an inconsistency into the system with respect to the value of the concurrently accessed object, thereby compromising the integrity of the data on the system.
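The interleaving described above can be replayed in a few lines. The following is only an illustrative sketch, not part of any described system: the `DataObject` class and `run_dirty_read_scenario` function are hypothetical names, and the two "processes" are simulated as sequential steps in one thread so that the order of operations is fixed.

```python
class DataObject:
    """A shared object with no concurrency control (hypothetical)."""

    def __init__(self, value=None):
        self.value = value


def run_dirty_read_scenario():
    """Replay the interleaving from the text and return (value read, final value)."""
    obj = DataObject()

    # Process 1: first write of its transaction (not yet committed).
    obj.value = "A"

    # Process 2: reads before Process 1 commits, so it sees the intermediate value.
    seen_by_process_2 = obj.value

    # Process 1: further write, after which the transaction commits.
    obj.value = "B"

    return seen_by_process_2, obj.value


seen, final = run_dirty_read_scenario()
print(seen, final)
```

Process 2 is left holding value A even though the committed value is B, which is the inconsistency the text calls a dirty read.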
To address situations such as this, various mechanisms are available to ensure the integrity of objects accessed by multiple processes. In particular, many database management systems (DBMS) utilize locking mechanisms to manage and coordinate access to shared objects on the system. In such systems, each process may be required to obtain a lock on an object before accessing that object. The type and parameters of the lock determine the scope of the access rights granted to the obtaining process. The appropriate grant of locks to objects ensures compatible access to the objects by concurrent processes.
To correct the dirty read example illustrated above, Process 1 may employ such a locking mechanism to exclusively lock the data object prior to making any writes. Because Process 1 holds the exclusive lock, Process 2 is blocked from accessing the same data object for the duration of Process 1's activities. Once Process 1 completes its work, the lock on the data object is released and Process 2 can thereafter access the final version of the data object, preventing the dirty read described above.
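A minimal sketch of this exclusive-lock remedy follows. It assumes threads stand in for processes and uses Python's standard `threading.Lock` as the exclusive lock; `LockedObject`, `run_locked_scenario`, and the event used to order the two threads are hypothetical names introduced only for illustration.

```python
import threading


class LockedObject:
    """A shared object guarded by one exclusive lock (hypothetical)."""

    def __init__(self):
        self.value = None
        self.lock = threading.Lock()


def run_locked_scenario():
    """Return the value the reader sees when the writer locks the object first."""
    obj = LockedObject()
    writer_holds_lock = threading.Event()
    seen = []

    def process_1():
        with obj.lock:               # exclusive lock taken before any write
            writer_holds_lock.set()  # let the reader start trying only now
            obj.value = "A"          # intermediate write, invisible to others
            obj.value = "B"          # final write
        # Leaving the "with" block releases the lock (the commit point here).

    def process_2():
        writer_holds_lock.wait()     # ensure the read is attempted after locking
        with obj.lock:               # blocks until Process 1 releases the lock
            seen.append(obj.value)

    t1 = threading.Thread(target=process_1)
    t2 = threading.Thread(target=process_2)
    t2.start()
    t1.start()
    t1.join()
    t2.join()
    return seen[0]


value_read = run_locked_scenario()
print(value_read)
```

Because the reader cannot obtain the lock until the writer has finished both writes, it can only observe the final value B.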
When implementing a locking mechanism, the granularity of the locks has an effect upon the performance of the system. As a general rule, the smaller the unit of resource covered by each individual lock, the less likely that the system will experience "false conflicts." A false conflict may occur, for example, when a first process holds a lock on an entire database table in order to update only row A of the table, while a second process seeks to concurrently update row B of the same table. Although the first process is only interested in updating row A, it has locked the entire table, thus preventing any other processes from updating other rows in the table. Hence, the second process would be blocked from performing its modification to row B. This type of false conflict can dramatically reduce the efficiency and concurrency of the overall system, and could be prevented by forcing the first process to obtain a much finer lock covering only the specific row it seeks to modify. Consequently, many multi-tasking systems favor finer-grain locking over coarser-grain locking to minimize false conflicts.
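The difference between the two granularities can be made concrete with a small sketch. It assumes a table with one table-wide lock and one lock per row; the `Table` class and `second_update_blocked` function are hypothetical, and a non-blocking acquire is used so the example reports a conflict rather than waiting on it.

```python
import threading


class Table:
    """A table with one coarse lock and one fine lock per row (hypothetical)."""

    def __init__(self, row_ids):
        self.table_lock = threading.Lock()
        self.row_locks = {row_id: threading.Lock() for row_id in row_ids}


def second_update_blocked(table, fine_grained):
    """Return True if an update to row B is blocked while row A is being updated."""
    # Under coarse locking both updates need the same table-wide lock.
    lock_for_a = table.row_locks["A"] if fine_grained else table.table_lock
    lock_for_b = table.row_locks["B"] if fine_grained else table.table_lock

    with lock_for_a:  # the first process is in the middle of updating row A
        # The second process tries to lock what it needs, without waiting.
        acquired = lock_for_b.acquire(blocking=False)
        if acquired:
            lock_for_b.release()
        return not acquired


table = Table(["A", "B"])
coarse_conflict = second_update_blocked(table, fine_grained=False)
fine_conflict = second_update_blocked(table, fine_grained=True)
print(coarse_conflict, fine_conflict)
```

With the table-wide lock the second update is refused even though it touches a different row, which is the false conflict; with per-row locks it proceeds.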
However, in many systems, there are only a finite number of locks available. Each open lock consumes a given amount of system memory and resources, and because of limited system resources, there is typically a practical limit to the number of locks that can be active at any one time. In such systems, maintaining an increased number of concurrently open finer-grained locks may not be feasible because of the increased system and memory requirements. Thus, coarser locks are necessarily employed, resulting in an increased risk of false conflicts.
A further problem may occur, for example, in a distributed system where multiple distributed nodes concurrently access shared resources across the network. In many systems, if a process abnormally terminates while holding a lock on a resource, the lock is automatically de-allocated by the system. If the dead process made uncommitted changes while holding the lock, then the absence of an open lock on the resource (i.e., because of abnormal process death) may allow other processes to access an invalid resource value, introducing data inconsistencies into the system. Thus, the system should possess a mechanism to protect against such data inconsistencies, by allowing other nodes and processes in the distributed system to recognize and identify shared resources that may be in an inconsistent state because of abnormal process termination.
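The hazard can be sketched as follows. This is a simplified single-machine model, not a distributed implementation: `Resource`, `acquire`, and `on_process_death` are hypothetical names, and process death is simulated by a direct function call.

```python
class Resource:
    """A shared resource with a committed value and a working value (hypothetical)."""

    def __init__(self, value):
        self.committed = value  # last value that was actually committed
        self.value = value      # value visible to whoever accesses the resource
        self.holder = None      # None means no lock is currently open


def acquire(resource, pid):
    """Grant the lock if nobody holds it; return True on success."""
    if resource.holder is not None:
        return False
    resource.holder = pid
    return True


def on_process_death(resource, pid):
    """Model a system that silently de-allocates a dead process's lock."""
    if resource.holder == pid:
        resource.holder = None  # no trace remains that the holder died mid-change


res = Resource(100)
acquire(res, "p1")
res.value = 150                 # uncommitted change made by p1
on_process_death(res, "p1")     # p1 terminates abnormally

got_lock = acquire(res, "p2")   # succeeds: the resource simply looks unlocked
value_seen = res.value          # p2 reads the uncommitted value
print(got_lock, value_seen, res.committed)
```

Nothing tells the second process that the value it reads was never committed, which is why some record of the abnormal termination is needed.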
One approach to address this problem is to employ a distributed lock manager to statically map and allocate open locks to cover all system objects at node startup time. The key to this approach is that each node opens a lock on all shared resources during node startup, and these locks do not release until node shutdown. In such a system, each of the distributed nodes holds locks on all the resources, and will notice and react to state changes when attempts are made to access a lock that is in an abnormal state because of a node or process death. In this manner, the distributed system is protected from processes or nodes that abnormally die while holding locks on system resources.
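A rough sketch of this static approach is given below. It is a single-table simplification under stated assumptions, not the behavior of any particular distributed lock manager: `StaticLockTable`, `LockState`, and `ResourceNeedsRecovery` are hypothetical names, and one in-memory table stands in for the locks that every node would keep open.

```python
from enum import Enum


class LockState(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"  # the holder died while holding the lock


class ResourceNeedsRecovery(Exception):
    """Raised when a lock is found in an abnormal state (hypothetical)."""


class StaticLockTable:
    """Locks for every resource are opened at startup and kept until shutdown."""

    def __init__(self, resource_names):
        # One permanently open lock per resource, created up front.
        self.state = {name: LockState.NORMAL for name in resource_names}
        self.holder = {name: None for name in resource_names}

    def acquire(self, name, pid):
        if self.state[name] is LockState.ABNORMAL:
            # The open lock remembers the failure, so the accessor can react.
            raise ResourceNeedsRecovery(name)
        if self.holder[name] is not None:
            return False
        self.holder[name] = pid
        return True

    def release(self, name, pid):
        if self.holder[name] == pid:
            self.holder[name] = None

    def on_process_death(self, pid):
        # The lock stays open; only its state changes to record the death.
        for name, holder in self.holder.items():
            if holder == pid:
                self.holder[name] = None
                self.state[name] = LockState.ABNORMAL


table = StaticLockTable(["table_1", "table_2"])
table.acquire("table_1", "p1")
table.on_process_death("p1")

try:
    table.acquire("table_1", "p2")
    detected = False
except ResourceNeedsRecovery:
    detected = True
print(detected)
```

Note that the table holds one entry per resource for its whole lifetime, whether or not the resource is in use.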
The problem with this approach is that it is normally impractical to employ fine-granularity locking. If fine-granularity locks are employed, then a large number of locks must be actively maintained at all times, resulting in prohibitively high consumption of system resources and memory. Thus, the scope of each lock is typically made relatively coarse to reduce the overall number of locks that must be maintained. However, the use of coarse-granularity locks increases the occurrence of false conflicts, leading to greater contention over shared system resources.
Therefore, there is a need for a system and method for locking in a computer system that allow fine-granularity locking while protecting against data inconsistencies resulting from abnormal lock termination. There is a further need for a system and method for locking in a distributed system that allow fine-granularity locking while providing efficient handling of abnormal node and process deaths.