The present invention relates to information systems, and more specifically, to an approach for providing access to resources in information systems.
In information systems, processes require access to resources to perform work. As used herein, the term “resource” refers to any object that can be accessed in an information system. In a hardware context, examples of resources include, but are not limited to, printers, disk drives and memory. In a software context, examples of resources include, but are not limited to, data items and routines. In information systems it is often desirable to allow only one process at a time to access a particular resource to maintain consistency. For example, if a second process is allowed to read from a particular resource while a first process is updating the particular resource, then the second process may read an intermediate value of the particular resource that is different from the final value of the particular resource after the first process has completed its updates. As a result, the second process may erroneously believe that, at the time the second process read from the particular resource, the particular resource reflected all of the changes made by the first process. In this situation, an inconsistency has been introduced into the information system with respect to the value of the concurrently accessed particular resource, thereby compromising the integrity of the data on the information system.
To address situations such as this, various mechanisms are used to ensure the integrity of resources that can be accessed by multiple processes. One type of mechanism used to coordinate access to shared resources is referred to as a lock. In its most general form, a lock is data that indicates that a particular process has been granted certain rights with respect to a resource. There are many types of locks. The type and parameters of a lock determine the scope of the access rights granted to the process that holds the lock. For example, shared locks may be shared on the same resource by many processes, while exclusive locks are only granted to a single process and prevent other locks from being granted on the same resource.
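The distinction between shared and exclusive locks can be illustrated with a minimal sketch. The names LockMode and compatible are hypothetical and are introduced here only for illustration; the sketch assumes just the two lock types named above.

```python
from enum import Enum


class LockMode(Enum):
    # Only the two lock types named in the text; real systems may define more.
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


def compatible(requested, held_modes):
    """Return True if a lock of mode `requested` can coexist with all held locks."""
    if requested is LockMode.EXCLUSIVE:
        # An exclusive lock is only granted when no other lock is held.
        return len(held_modes) == 0
    # A shared lock can coexist with any number of other shared locks.
    return all(mode is LockMode.SHARED for mode in held_modes)
```

Under these assumptions, a shared request is compatible with any number of held shared locks, while an exclusive request is compatible only with an empty set of held locks.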
In the previous example, the first process would request and be granted a lock on the particular resource prior to updating the particular resource. While the first process holds the lock on the particular resource, the first process may make updates to the particular resource, but the second process may not access the particular resource. Once the first process has completed updating the particular resource, the lock is released. The second process can then access the particular resource and see all of the updates made to the particular resource by the first process.
Locks are typically allocated and managed by an entity referred to as a lock manager. Lock managers are responsible for maintaining data that specifies the status of locks for a set of resources. Lock managers sometimes also maintain a lock request queue that specifies pending lock requests for resources managed by the lock managers. In distributed computing systems, lock managers and their assigned resources may reside on different nodes, or may be located on the same node. As used herein, the term “node” refers to any type of computing entity. Thus, a single computer may either be a single node or support multiple nodes.
One of the problems with locking mechanisms is that when a particular lock manager fails, or the node on which the particular lock manager resides fails, the lock data maintained by the particular lock manager can be corrupted or lost. Specifically, the data that specifies the status of locks for the particular resources assigned to the particular lock manager and the lock request queue data can be corrupted or lost.
One of the consequences of this type of failure is that it is not immediately known which resources associated with the lost lock data were locked by processes at the time of the failure. As a result, no new locks can be granted on the resources until the lock data is re-built, which can cause undesirable processing delays. These resources are sometimes referred to as “frozen” resources.
FIG. 1 is a block diagram of a distributed computing arrangement 100 on which a conventional locking mechanism is employed. Arrangement 100 includes three nodes, identified as NODE1, NODE2, NODE3 and a resource R1. A lock manager LM resides on NODE2 and is responsible for managing access to resource R1. Specifically, lock manager LM grants locks to processes that require access to resource R1. Lock manager LM may also be responsible for managing access to other resources (not illustrated).
Lock manager LM generates and maintains lock data 102 on NODE2. Lock data 102 contains data that specifies what processes, if any, have been granted locks on resource R1. Lock data 102 also contains data that specifies any pending lock requests for resource R1. When a change in lock status for resource R1 occurs, for example when a new lock is granted or when an existing lock is released, lock manager LM causes lock data 102 to be updated to reflect the change in lock status.
Suppose process P1 requires access to resource R1. Process P1 requests a lock on resource R1 from lock manager LM. Lock manager LM determines whether any other processes currently hold locks on resource R1 that are incompatible with the lock requested by process P1. If no other processes currently hold locks on resource R1 that are incompatible with the lock request from process P1 and if there are no other pending lock requests for resource R1, then lock manager LM grants a lock on resource R1 to process P1 by updating lock data 102 and notifies process P1 that a lock on resource R1 has been granted to process P1. Once process P1 receives the lock grant notification, process P1 can access and update resource R1. When process P1 no longer requires access to resource R1, process P1 notifies lock manager LM accordingly. Lock manager LM then releases the lock on resource R1 held by process P1 by updating lock data 102.
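The request, grant and release sequence described above can be sketched as follows. This is a simplified, single-threaded illustration and not the conventional lock manager itself: the class name LockManager, its methods and the string lock modes are hypothetical. The sketch also assumes that a request that cannot be granted is placed in the lock request queue and that queued requests are granted in order when a release makes them compatible, which the description above does not specify.

```python
from collections import deque


class LockManager:
    """Hypothetical in-memory stand-in for lock manager LM and lock data 102."""

    def __init__(self):
        self.holders = {}  # resource -> {process: mode}
        self.pending = {}  # resource -> deque of (process, mode)

    def _compatible(self, resource, mode):
        held = self.holders.get(resource, {})
        if mode == "exclusive":
            return not held
        return all(m == "shared" for m in held.values())

    def request(self, process, resource, mode):
        """Grant the lock (True) or queue the request (False)."""
        queue = self.pending.setdefault(resource, deque())
        # Grant only if nothing incompatible is held and nothing is already waiting.
        if not queue and self._compatible(resource, mode):
            self.holders.setdefault(resource, {})[process] = mode
            return True
        queue.append((process, mode))
        return False

    def release(self, process, resource):
        """Release a lock; return the processes granted from the queue as a result."""
        self.holders.get(resource, {}).pop(process, None)
        queue = self.pending.get(resource, deque())
        granted = []
        # Assumption: queued requests are served first-in, first-out.
        while queue and self._compatible(resource, queue[0][1]):
            waiter, mode = queue.popleft()
            self.holders.setdefault(resource, {})[waiter] = mode
            granted.append(waiter)
        return granted
```

For example, if process P1 holds an exclusive lock on resource R1, a request from process P2 is queued, and the release by process P1 causes the lock to be granted to process P2.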
Suppose that, while the lock on resource R1 is granted to process P1, lock data 102 is lost or otherwise becomes unusable, for example by a failure of lock manager LM or a failure of NODE2. In this situation, the resource management duties previously performed by lock manager LM are assigned to another lock manager (not illustrated) and lock data 102 must be re-generated. Information about which processes held locks on resource R1 is usually obtained from the processes themselves. For example, this may be accomplished by broadcasting a message to all processes that a failure has occurred and requesting processes that held locks on resources managed by lock manager LM to inform the new lock manager about the locks they held. In addition, the new lock manager will need to know which processes had pending requests for a lock on resource R1 in a lock request queue of lock data 102. If the lock requests cannot be determined from the processes, then processes will have to re-submit lock requests on resource R1 to the new lock manager. During the time that lock data 102 is being re-generated, resource R1 is frozen, meaning that no new locks are granted on resource R1.
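The recovery procedure described above can be sketched as follows, under stated assumptions. The class RecoveringLockManager and its methods are hypothetical names, the broadcast to processes is replaced by direct method calls, and the sketch assumes that all managed resources remain frozen until every report has been collected.

```python
class RecoveringLockManager:
    """Hypothetical new lock manager that re-generates lost lock data."""

    def __init__(self, resources):
        # Every resource of the failed lock manager starts out frozen.
        self.frozen = set(resources)
        self.holders = {resource: {} for resource in resources}

    def report_held_lock(self, process, resource, mode):
        """Called by a surviving process in response to the failure broadcast."""
        self.holders[resource][process] = mode

    def finish_recovery(self):
        """Unfreeze all resources once the lock data has been re-generated."""
        self.frozen.clear()

    def is_frozen(self, resource):
        # While a resource is frozen, no new locks are granted on it.
        return resource in self.frozen
```

The sketch makes the cost of the conventional approach visible: every resource stays frozen for the full duration of the rebuild, regardless of whether any process actually held a lock on it.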
In situations where lock manager LM was responsible for managing access to a large number of resources, re-generating lock data 102 can require a large amount of time and system resources. Moreover, all resources managed by lock manager LM must be frozen during recovery since it is not known whether any processes held locks on those resources. This can have a significant adverse effect on other processes that require access to any resources that were managed by lock manager LM. Furthermore, in situations where a determination cannot be made as to which processes held locks on resource R1 at the time of the failure, the processes that held locks on resource R1 will have to re-submit a request for a lock on resource R1 to the new lock manager and the changes made by those processes will be lost.
Therefore, based on the need to control access to resources by processes and the limitations in the prior approaches, an approach for controlling access to resources that does not suffer from limitations inherent in conventional locking approaches is highly desirable.
According to one aspect of the invention, a method is provided for managing access to a particular resource. According to the method, a request is received for a lock on the particular resource from a first process residing on a first node. Lock data is generated on a second node that indicates that a lock on the particular resource has been granted to the first process. In addition, duplicate lock data is generated on a third node that indicates that the lock on the particular resource has been granted to the first process.
According to another aspect of the invention, a computer system is provided that comprises a resource and a locking mechanism for managing access to the resource. The locking mechanism is configured to generate, in response to a lock request from a first process, lock data that indicates that a lock has been granted on the resource to the first process. The locking mechanism is also configured to generate duplicate lock data that indicates that the lock has been granted on the resource to the first process.
According to another aspect of the invention, a distributed computing system is provided that comprises a first node, a second node communicatively coupled to the first node and a locking mechanism residing on the first node. The locking mechanism is configured to manage access to a particular resource by granting locks on the particular resource. The locking mechanism is further configured to generate lock data indicative of locks granted on the particular resource. The locking mechanism is also configured to cause duplicate lock data to be generated on the second node, wherein the duplicate lock data is indicative of locks granted on the particular resource.
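The generation of lock data together with duplicate lock data can be sketched as follows. This is a minimal, in-process illustration only: the classes Node and DuplicatingLockManager are hypothetical, the two nodes are modeled as ordinary objects rather than networked machines, and the sketch assumes that the duplicate is written at the same time as the original, which the aspects above do not specify.

```python
class Node:
    """Hypothetical stand-in for a node that stores lock data."""

    def __init__(self, name):
        self.name = name
        self.lock_data = {}  # resource -> set of processes holding a lock


class DuplicatingLockManager:
    """Hypothetical locking mechanism that keeps a duplicate of its lock data."""

    def __init__(self, primary, backup):
        self.primary = primary  # node holding the lock data
        self.backup = backup    # node holding the duplicate lock data

    def grant(self, process, resource):
        # Record the grant in the lock data and in the duplicate lock data.
        for node in (self.primary, self.backup):
            node.lock_data.setdefault(resource, set()).add(process)

    def release(self, process, resource):
        # Remove the grant from both copies so they stay consistent.
        for node in (self.primary, self.backup):
            node.lock_data.get(resource, set()).discard(process)
```

In this sketch, if the lock data on the primary node is lost after a lock on resource R1 is granted to process P1, the duplicate lock data on the backup node still indicates that process P1 holds the lock.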