The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Multiple processes running on multi-processing systems may access shared resources, such as disk blocks. Some of these shared resources may be accessed by only one process at a time, while others may be accessed concurrently by multiple processes. Consequently, “synchronization mechanisms” have been developed to control access by multiple processes to shared resources. A synchronization mechanism grants locks to processes. A lock grants its holder the right to access a particular resource in a particular way. Once a lock is granted to a process, the process holds or owns the lock until the lock is relinquished, revoked, or otherwise terminated. Locks are represented by data structures such as semaphores, read/write latches, and condition variables. There are many types of locks. Some types of locks allow a shared resource to be accessed by many processes concurrently (e.g., a shared read lock), while other types of locks prevent any other lock from being granted on the same resource (e.g., an exclusive write lock).
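By way of illustration only, the distinction between shared and exclusive locks can be sketched as a small compatibility check. The names `LockMode`, `compatible`, and `SimpleLockTable` are hypothetical and are not part of any approach described herein; the sketch assumes only two lock modes and a single node.

```python
from enum import Enum


class LockMode(Enum):
    SHARED = "shared"        # e.g., a shared read lock
    EXCLUSIVE = "exclusive"  # e.g., an exclusive write lock


def compatible(held, requested):
    """Two locks on the same resource coexist only if both are shared."""
    return held is LockMode.SHARED and requested is LockMode.SHARED


class SimpleLockTable:
    """Hypothetical single-node lock table: resource -> list of (process, mode)."""

    def __init__(self):
        self._held = {}

    def try_acquire(self, process, resource, mode):
        holders = self._held.setdefault(resource, [])
        # Grant only if the request is compatible with every lock already held.
        if all(compatible(held_mode, mode) for _, held_mode in holders):
            holders.append((process, mode))
            return True
        return False

    def release(self, process, resource):
        holders = self._held.get(resource, [])
        self._held[resource] = [(p, m) for p, m in holders if p != process]
```

In this sketch, any number of processes may hold a shared lock on the same disk block, but an exclusive request is refused until every other holder has released its lock.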
The entity responsible for granting locks is referred to herein as a lock manager. In a single node multi-processing system, a lock manager is typically a software component executed and invoked by processes on the node accessing a shared resource.
In contrast to a single node system, a multi-node system consists of a network of computing devices or “nodes”, each of which may be a multi-processing system. Each of the nodes can access a set of shared resources. Multi-node systems use synchronization mechanisms, referred to as global synchronization mechanisms, to control access to the set of shared resources by nodes in the multi-node system.
A global synchronization mechanism includes a global lock manager that is responsible for issuing locks to processes on the multi-node system. In order for a node to access a shared resource, it is granted a “global lock” by a global lock manager. A global lock is a lock that can be granted by a global lock manager on a node in a multi-node system to one or more processes on another node to coordinate access to the shared resources among the processes executing on any node in a multi-node system.
A type of global lock manager is a distributed lock manager, which comprises local lock managers that are distributed on the nodes of a multi-node system, with one or more of the local lock managers running on each node in a multi-node system. Each local lock manager is responsible for coordinating the global locks for processes on the local lock manager's node. A local lock manager is referred to as the local lock manager with respect to the node on which it resides; the node and a process running on the node are referred to as a local node and local process with respect to the local lock manager and the node.
A local lock manager residing on a node issues global locks to lock managers on the other nodes and to processes running on the same node as the local lock manager. A process needing a global lock on a resource managed by a non-local lock manager requests the global lock from its local lock manager. If a local lock manager already holds a compatible global lock, the local lock manager issues a global lock to the local process. If the local lock manager does not hold a compatible global lock, the local lock manager first obtains one from the non-local lock manager. Once obtained, the local lock manager issues the global lock to the local process.
For convenience of expression, the global locks issued by local lock managers to local processes are referred to herein as local locks. Thus, a local lock manager obtains a global lock from another lock manager and issues compatible local locks to local processes.
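The relationship between global locks and local locks can be illustrated with a minimal sketch. The classes `RemoteLockManager` and `LocalLockManager` are hypothetical, the remote manager is assumed to always grant a request, and conflicts among local processes are deliberately ignored to keep the example short.

```python
class RemoteLockManager:
    """Stand-in for a non-local lock manager; counts inter-node requests."""

    def __init__(self):
        self.requests = 0

    def request_global_lock(self, resource, mode):
        self.requests += 1
        return True  # assumption: the request is always granted


class LocalLockManager:
    def __init__(self, remote):
        self.remote = remote
        self.global_locks = {}  # resource -> mode of the global lock held by this node
        self.local_locks = []   # (process, resource, mode) issued to local processes

    @staticmethod
    def _covers(held, requested):
        # An exclusive global lock covers any request;
        # a shared global lock covers shared requests only.
        return held == "exclusive" or (held == "shared" and requested == "shared")

    def acquire(self, process, resource, mode):
        held = self.global_locks.get(resource)
        if held is None or not self._covers(held, mode):
            # No compatible global lock yet: obtain one from the non-local manager.
            self.remote.request_global_lock(resource, mode)
            self.global_locks[resource] = mode
        # Issue the local lock to the local process.
        self.local_locks.append((process, resource, mode))
```

Under these assumptions, two local processes requesting shared access to the same resource cause only one request to the non-local lock manager; the second process is served from the global lock the node already holds.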
Also, for convenience of expression, nodes are described herein as performing actions and as being the object of actions. However, this is just a convenient way of expressing that one or more processes on a node are performing an action or are the object of an action. For example, a lock manager requesting, obtaining, and issuing a global lock or local lock may be described as a node requesting, obtaining, and issuing a global lock or local lock.
Global locks can be more expensive to acquire than local locks. This is because acquiring a global lock may entail inter-node communication and interaction between a local lock manager and a lock manager on another node.
Such interaction can entail a particularly expensive form of an operation referred to as a ping. A ping occurs when the version of a resource that resides in the cache of one server must be supplied to the cache of a different server. Thus, a ping occurs when, after a node A modifies resource x in its cache, another node B requires resource x.
Cache Fusion
One way of performing a ping is referred to as cache fusion. In cache fusion, cached copies of a resource are transferred between nodes to speed up locking mechanisms. FIG. 1 is a block diagram that illustrates a multi-node system 101 and a cache fusion protocol for requesting and transferring cached resources, according to an embodiment of the invention. Nodes in system 101 may communicate directly with each other or via a network, such as a LAN or the Internet. The cache fusion protocol for acquiring a global lock on a shared resource begins when a requesting node 104 requests a lock on a particular shared resource (step 112) from a master node 102 where the lock manager for the particular shared resource resides.
Master node 102 receives the request and determines whether any other node holds an incompatible lock on the shared resource. In the simple case where no node holds an incompatible lock on the shared resource, the master node grants the lock directly to the requesting node. If the master node itself holds an incompatible lock on the shared resource, then the master node will eventually grant a lock on the shared resource directly to the requesting node. Otherwise, another node (i.e., a holding node 106) holds an incompatible lock on the shared resource.
The master node sends a message to holding node 106 (step 114) indicating that requesting node 104 requests a lock on the particular shared resource for which holding node 106 holds the lock. Holding node 106 grants the lock and may send a copy of the shared resource directly to requesting node 104 (step 116). In some cases, even a compatible lock held by holding node 106 on the shared resource (e.g., requesting node 104 requests a shared lock on the resource while holding node 106 holds a shared lock on the resource) will trigger an interconnect message from holding node 106 to requesting node 104 because holding node 106 may hold a dirty, or modified, version of the shared resource.
Lastly, once requesting node 104 receives the shared resource and the lock, requesting node 104 notifies master node 102 (step 118) that requesting node 104 has the lock on the shared resource. Therefore, each request for a lock on a shared resource may cause four inter-node messages to be generated. Techniques are thus needed to reduce the cost of acquiring global locks.
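The message count can be made concrete with a short sketch that enumerates the inter-node messages of the protocol described above. The function `cache_fusion_messages` is hypothetical. The numeric step labels are taken from the description of FIG. 1, while the "grant" label for the direct-grant case is an assumption, since that reply is not numbered in the description.

```python
def cache_fusion_messages(requester, master, holder=None):
    """List the messages for one lock request as (step, sender, receiver) tuples.

    `holder` is the node holding an incompatible lock, or None if there is none.
    """
    messages = [(112, requester, master)]            # requester asks the master for the lock
    if holder is None or holder == master:
        # Simple case: the master grants the lock directly (assumed to be one reply).
        messages.append(("grant", master, requester))
    else:
        messages.append((114, master, holder))       # master forwards the request
        messages.append((116, holder, requester))    # holder grants lock, may ship the resource
        messages.append((118, requester, master))    # requester confirms it holds the lock
    # A message whose sender and receiver are the same node never crosses the interconnect.
    return [m for m in messages if m[1] != m[2]]
```

With distinct requesting, master, and holding nodes, the sketch yields the four inter-node messages noted above; when no other node holds an incompatible lock, it yields two under the stated assumption.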
One technique to reduce the cost of acquiring global locks is to use a “mastering technique” that assigns a master node to a subset of shared resources based on patterns of access to shared resources. (A master node for a shared resource governs access by other nodes to the shared resource.) For example, if most of the accesses to a portion of shared data are performed by a particular node, that node is assigned as the master node for that portion of the shared data. This reduces the messaging overhead between nodes because fewer global locks have to be acquired: the particular node, which performs most of the accesses to the portion of shared data, needs only local locks for its future accesses to that portion. However, mastering does not eliminate the cost of executing more instructions to acquire a global lock.
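As a rough illustration of the mastering technique, the following sketch picks as master the node that performs the most accesses and counts how many accesses still need a global lock. The functions `choose_master` and `global_lock_requests` are hypothetical, and the access log is assumed to be a non-empty sequence of node names, one entry per access.

```python
from collections import Counter


def choose_master(access_log):
    """Return the node that accounts for the most accesses in a non-empty log."""
    counts = Counter(access_log)
    node, _ = counts.most_common(1)[0]
    return node


def global_lock_requests(access_log, master):
    """Count accesses that come from a node other than the master.

    Accesses by the master itself are assumed to need only local locks.
    """
    return sum(1 for node in access_log if node != master)
```

For a log in which node A performs eight of ten accesses, assigning A as master leaves two accesses that require a global lock, whereas assigning node B as master would leave eight.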
Another technique to reduce the cost of acquiring global locks is to use coarse-grain locking. In this scheme, locks are acquired at a higher level of granularity, such as a table or file, instead of a finer level of granularity, such as a row or a disk block. When a lock is acquired at the higher level of granularity, it is implicitly granted for levels of shared data at a finer level of granularity. For example, if a global lock is acquired for an entire table, individual global locks for the rows or blocks for the table are implied and do not have to be acquired, avoiding the cost of obtaining a global lock for each row and block.
The advantage of this technique is that it does not depend on the assignment of a master node. A significant disadvantage, however, is that this technique can lead to false contention. Specifically, if a node needs to modify a row in a table that has been locked by another node in a conflicting mode, the other node must relinquish its lock on the table, even though the two nodes may be accessing different rows or even different blocks.
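False contention under coarse-grain locking can be shown with a minimal conflict check. The function `conflicts` and the tuple layout `(table, row, mode)` are hypothetical, and only shared and exclusive modes are assumed.

```python
def conflicts(held, requested, granularity):
    """Report whether two lock requests conflict at the given granularity.

    held, requested: (table, row, mode) tuples with mode "shared" or "exclusive".
    granularity: "table" for coarse-grain locking, "row" for fine-grain locking.
    """
    held_table, held_row, held_mode = held
    req_table, req_row, req_mode = requested
    if held_mode == "shared" and req_mode == "shared":
        return False                 # shared locks never conflict with each other
    if held_table != req_table:
        return False                 # different tables never conflict
    if granularity == "table":
        return True                  # the whole table is the unit of locking
    return held_row == req_row       # row-level: conflict only on the same row
```

Two nodes that modify different rows of the same table conflict when `granularity` is `"table"` but not when it is `"row"`, which is the false contention described above.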
Another technique to reduce the cost of acquiring global locks is to use hierarchical locking. In this scheme, locks are first acquired at a higher level in the hierarchy, such as a table. If a global lock is acquired at a higher level in the hierarchy, global locks are implicitly granted at the lower level of the hierarchy. When another node subsequently needs to access data in the lower level of the hierarchy, such as a row or a block, in a conflicting mode, the first node de-escalates its lock and acquires locks at the lower level in the hierarchy.
The disadvantage of this technique is that the cost of obtaining a global lock is inflated and shifted to the requesting node whose lock request triggers the de-escalation. To honor the request, work is performed to acquire global locks for all the shared data at the lower level of the hierarchy. This work is performed despite the requesting node having requested a lock on only a small portion of the shared data.
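The cost of de-escalation can be sketched as follows. The class `HierarchicalHolder` is hypothetical; it assumes a two-level hierarchy (a table and its rows) and assumes that, on de-escalation, the holding node acquires a lock on every row other than the one requested by the other node.

```python
class HierarchicalHolder:
    """Hypothetical node that initially holds a single table-level lock."""

    def __init__(self, table, rows):
        self.table = table
        self.rows = list(rows)
        self.table_lock = True   # the table-level lock implies locks on all rows
        self.row_locks = set()

    def de_escalate(self, requested_row):
        """Trade the table lock for row locks and return how many were acquired.

        Assumption: every row except the requested one is locked individually.
        """
        self.table_lock = False
        acquired = 0
        for row in self.rows:
            if row != requested_row:
                self.row_locks.add(row)
                acquired += 1
        return acquired
```

Under these assumptions, a single conflicting request for one row of a 1,000-row table triggers 999 lock acquisitions at the lower level, even though the requesting node asked for only one row.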
As the foregoing shows, techniques are needed that reduce the cost of acquiring global locks while avoiding the pitfalls attendant to the techniques described above.