Database servers use resources while executing transactions. Even though resources may be shared between database servers, many resources may not be accessed in certain ways by more than one process at any given time. For example, resources such as data blocks of a storage medium or tables stored on a storage medium may be concurrently accessed in some ways (e.g. read) by multiple processes, but accessed in other ways (e.g. written to) by only one process at a time. Consequently, mechanisms have been developed which control access to resources.
One such mechanism is referred to as a lock. A lock is a data structure that indicates that a particular process has been granted certain rights with respect to a resource. There are many types of locks. Some types of locks may be shared on the same resource by many processes, while other types of locks prevent any other locks from being granted on the same resource.
The entity responsible for granting locks on resources is referred to as a lock manager. In a single node database system, a lock manager will typically consist of one or more processes on the node. In a multiple-node system, such as a multi-processing machine or a local area network, a lock manager may include processes distributed over numerous nodes. A lock manager that includes components that reside on two or more nodes is referred to as a distributed lock manager.
FIG. 1 is a block diagram of a multiple-node computer system 100. Each node is executing an instance of a database server and a portion of a distributed lock management system 132. Specifically, the illustrated system includes three nodes 102, 112 and 122 on which reside database servers 104, 114 and 124, respectively, and lock manager units 106, 116 and 126, respectively. Database servers 104, 114 and 124 have access to the same database 120. The database 120 resides on a disk 118 that contains multiple blocks of data. Disk 118 generally represents one or more persistent storage devices that may be on any number of machines, including but not limited to the machines that contain nodes 102, 112 and 122.
A communication mechanism allows processes on nodes 102, 112, and 122 to communicate with each other and with the disks that contain portions of database 120. The specific communication mechanism between the nodes and disk 118 will vary based on the nature of system 100. For example, if the nodes 102, 112 and 122 correspond to workstations on a network, the communication mechanism will be different than if the nodes 102, 112 and 122 correspond to clusters of processors and memory within a multi-processing machine.
Before any of database servers 104, 114 and 124 can access a resource shared with the other database servers, it must obtain the appropriate lock on the resource from the distributed lock management system 132. Such a resource may be, for example, one or more blocks of disk 118 on which data from database 120 is stored.
Lock management system 132 stores data structures that indicate the locks held by database servers 104, 114 and 124 on the resources shared by the database servers. If one database server requests a lock on a resource while another database server has a lock on the resource, then the distributed lock management system 132 must determine whether the requested lock is consistent with the granted lock. If the requested lock is not consistent with the granted lock, then the requester must wait until the database server holding the granted lock releases the granted lock.
According to one approach, lock management system 132 maintains one master resource object for every resource managed by lock management system 132, and includes one lock manager unit for each node that contains a database server. The master resource object for a particular resource stores, among other things, an indication of all locks that have been granted on or requested for the particular resource. The master resource object for each resource resides within only one of the lock manager units 106, 116 and 126.
The node on which a lock manager unit resides is referred to as the “master node” (or simply “master”) of the resources whose master resource objects are managed by that lock manager unit. Thus, if the master resource object for a resource R1 is managed by lock manager unit 106, then node 102 is the master of resource R1.
Changing the master of a lock resource from one node to another is referred to as “remastering” the lock resource. Various techniques have been developed for efficiently performing remastering operations. Such techniques are described, for example, in the following documents, the entire contents of which are incorporated herein:
U.S. Pat. No. 6,272,491, entitled “Method And System For Mastering Locks In A Multiple Server Database System”;
U.S. Pat. No. 6,529,906, entitled “Techniques For DLM Optimization With Re-Mastering Events”;
U.S. patent application Ser. No. 09/967,856, entitled “Techniques For Adding A Master In A Distributed Database Without Suspending Database Operations At Extant Master Sites”; and
U.S. patent application Ser. No. 10/832,527, entitled “Techniques for Multiple Window Resource Remastering Among Nodes of a Cluster”.
There are a variety of reasons that it may be useful to remaster a resource. For example, it costs significantly less for processes to interact with a lock management unit that resides on their same local node, than to interact with lock management units on other nodes. Therefore, one reason to remaster a resource is to establish the master of the resource to be the node that most frequently requests locks on the resource.
The access patterns of individual resources are often just a reflection of the access patterns of the larger sets of related resources to which the individual resources belong. In the context of a database server, for example, the access patterns of individual disk blocks may just be a reflection of the access patterns of the tables whose data is stored on those disk blocks.
For example, consider a database system where ten database servers are used to access a shared database. Assume further that the users that have permission to access a particular table of the database usually connect to the database through a particular database server. In this case, it is likely that the data blocks that store the rows of that particular table will be primarily accessed by the node on which that particular database server is running. Under these circumstances, it would be inefficient to have mastery of those data blocks spread evenly across all ten of the database servers.
To avoid such inefficiency, the particular node may be assigned to be the master of all data blocks of that particular table. Under these circumstances, lock operations involving that table would not incur the overhead of inter-node communication, as long as all access to the table was made through that particular database server.
Unfortunately, access patterns are not always simple. For example, assume that the users that have permission to access the particular table of the database usually connect to the database through three of the ten database servers. Under these circumstances, it would be inefficient to have those data blocks mastered evenly across all ten of the database servers. However, it may also be inefficient to have all of those data blocks mastered at only one of the three database servers. The greater the percentage of accesses that are made through the two database servers that are not the master, the greater the inefficiency of restricting mastery to the one database server.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.