Field of the Invention
The present invention relates generally to data processing environments, and more particularly to on demand locking of retained resources in a distributed shared disk cluster data processing environment.
Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added to or removed from data files, information may be retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art. See, e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Part I (especially Chapters 1-4), Addison Wesley, 2000.
In recent years, users have demanded that database systems be continuously available, with no downtime, as they are frequently running applications that are critical to business operations. Shared Disk Cluster systems are distributed database systems introduced to provide the increased reliability and scalability sought by customers. A Shared Disk Cluster database system is a system that has a cluster of two or more database servers having shared access to a database on disk storage. The term “cluster” refers to the fact that these systems involve a plurality of networked server nodes that are clustered together to function as a single system. Each node in the cluster usually contains its own CPU and memory, and all nodes in the cluster communicate with each other, typically through private interconnects. “Shared disk” refers to the fact that two or more database servers share access to the same disk image of the database. Shared Disk Cluster database systems provide for transparent, continuous availability of the applications running on the cluster with instantaneous failover amongst servers in the cluster. When one server is down (e.g., for upgrading the CPU), the applications are able to continue to operate against the shared data using the remaining machines in the cluster, so that a continuously available solution is provided. Shared Disk Cluster systems also enable users to address scalability problems by simply adding machines to the cluster, without the major data restructuring and associated system downtime that are common in prior SMP (symmetric multiprocessor) environments. Such SMP environments provide fast performance by making multiple CPUs available to complete individual processes simultaneously (multiprocessing).
In any database system, distributed or otherwise, data can be organized and accessed as “pages”. When data is brought from the disk into main memory, the “page” is the basic unit of access. Within a page, the data can be present as “rows”. In a transactional system that uses row-level locking, multiple transactions can be active on a single page at any point in time, each accessing a subset of the rows within the page.
In a distributed system such as a shared disk cluster, transactional locks or logical locks are used for transactional consistency. These locks can be page-level locks, in which the entire page is locked; row-level locks, in which a particular row in a page is locked; or higher-level locks, such as table locks, which are used to lock the entire table. These locks are held for a relatively long duration, e.g., until the end of the transaction.
For physical consistency of the page, such as when multiple transactions are modifying different rows in the same page at the same time, physical locks, also called latches in popular SMP terminology, are used. These locks are held for a relatively short duration, e.g., only for the time it takes to modify the data in the page in memory. With the help of physical locks, the physical operations on a particular page are serialized under typical conditions. Commonly, these locks can be acquired in “shared” mode, “exclusive” mode, or “null” mode, where a shared physical lock is compatible with other shared physical locks but incompatible with an exclusive physical lock, and an exclusive physical lock is incompatible with shared and exclusive physical locks but compatible with “null” physical locks.
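The compatibility rules just described can be written down as a small predicate. The following is a minimal sketch, not taken from any particular lock manager: the names `Mode` and `compatible` are hypothetical, and the treatment of “null” as compatible with every mode (not only with exclusive) is an assumption that goes slightly beyond what is stated above.

```python
from enum import Enum


class Mode(Enum):
    """Physical lock modes named in the text (identifiers are illustrative)."""
    NULL = 0
    SH = 1   # shared
    EX = 2   # exclusive


def compatible(held: Mode, requested: Mode) -> bool:
    """Return True if a lock held in `held` mode can coexist with `requested`."""
    # Assumption: a null lock conflicts with nothing.
    if held is Mode.NULL or requested is Mode.NULL:
        return True
    # Shared is compatible only with shared; exclusive conflicts with SH and EX.
    return held is Mode.SH and requested is Mode.SH
```

Under these assumptions, `compatible(Mode.SH, Mode.SH)` holds, while `compatible(Mode.SH, Mode.EX)` and `compatible(Mode.EX, Mode.EX)` do not.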
In a distributed system, the physical locks are retained at each node until they are claimed by other nodes. The retention of the locks in this manner avoids unnecessary repeated acquisition cycles that might occur if the locks are released immediately. For physical consistency, often, a two-level lock is used. The first level is the inter-node synchronization where the cluster-wide “physical lock” is acquired and the next level is an intra-node synchronization where the “latch” is acquired. The cluster-wide physical lock gives the right of access to a particular node that has acquired the lock, while the “latch” gives the right of access to a particular task within that node that has the physical lock.
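The two-level scheme and the effect of lock retention can be illustrated with a simplified, single-process sketch. All names here (`ClusterLockDirectory`, `Node`, `modify`, `transfers`) are hypothetical, only exclusive access is modeled, and the hand-off between nodes is reduced to a counter; a real cluster would exchange messages and would wait for the current owner to finish its latched work before transferring the lock.

```python
import threading


class ClusterLockDirectory:
    """Hypothetical stand-in for the cluster-wide physical lock manager."""

    def __init__(self):
        self.owner = {}      # page_id -> node currently retaining the lock
        self.transfers = 0   # number of inter-node acquisitions performed
        self._mutex = threading.Lock()

    def acquire(self, page_id, node):
        with self._mutex:
            if self.owner.get(page_id) is not node:
                # The lock is granted fresh or claimed from the retaining node.
                self.owner[page_id] = node
                self.transfers += 1
            # Otherwise the node already retains the lock: nothing to do.


class Node:
    def __init__(self, name, directory):
        self.name = name
        self.directory = directory
        self.latches = {}    # page_id -> intra-node latch

    def modify(self, page_id, page, row, value):
        # Level 1: inter-node synchronization (cluster-wide physical lock).
        self.directory.acquire(page_id, self)
        # Level 2: intra-node synchronization (latch), held only for the
        # duration of the in-memory update.
        latch = self.latches.setdefault(page_id, threading.Lock())
        with latch:
            page[row] = value
        # The physical lock is deliberately not released; the node retains
        # it until another node claims it.
```

In this sketch, two consecutive `modify` calls from the same node on the same page cause a single inter-node acquisition; a later call from a different node causes one more, which is the repeated acquisition cycle that retention is meant to avoid within a node.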
Access to the page, i.e., the latches as well as the physical locks, is granted on a “first come, first served” basis. For instance, if a task requests a shared (SH) latch and is granted the latch, a second task requesting the latch in exclusive (EX) mode will be blocked and placed in a wait queue. If a third task then requests the latch in SH mode, it too will be blocked and placed in the wait queue behind the second task, which is waiting for the EX latch. The behavior for the physical lock is similar at the node level.
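The three-task scenario above can be traced with a small simulation of a strictly first-come, first-served latch. This is an illustrative sketch only; the class `FifoLatch` and its methods are hypothetical names, and tasks are represented by plain strings rather than real threads.

```python
from collections import deque


class FifoLatch:
    """Strict FIFO latch: no request may bypass an earlier waiter."""

    def __init__(self):
        self.owners = []         # list of (task, mode) currently holding the latch
        self.waiters = deque()   # blocked requests, in arrival order

    def _fits(self, mode):
        # A request fits only if every current owner is SH and the request is SH
        # (an empty owner list accepts any mode).
        return all(m == "SH" and mode == "SH" for _, m in self.owners)

    def request(self, task, mode):
        """Return True if granted immediately, False if queued."""
        if self._fits(mode) and not self.waiters:
            self.owners.append((task, mode))
            return True
        self.waiters.append((task, mode))
        return False

    def release(self, task):
        """Release the latch held by `task`; return tasks granted as a result."""
        self.owners = [(t, m) for t, m in self.owners if t != task]
        granted = []
        # Grant from the head of the queue only, preserving arrival order.
        while self.waiters and self._fits(self.waiters[0][1]):
            t, m = self.waiters.popleft()
            self.owners.append((t, m))
            granted.append(t)
        return granted
```

With this sketch, a request sequence of T1 (SH), T2 (EX), T3 (SH) grants T1 immediately and queues T2 and T3 in that order, even though T3 is compatible with the current owner T1.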
This behavior is not always optimal. For example, in certain situations, the SH waiters queued behind the EX waiter could have been granted the SH latch, since their requests are compatible with the SH owner, and granting them would have increased concurrency. However, if such granting were allowed without limit, the EX waiter could wait indefinitely as long as new SH requests keep arriving.
In most database systems, a threshold is used to strike a balance between allowing more concurrency for SH waiters and not starving the EX waiter of access. Generally, while the threshold has not been reached, SH waiters are able to skip the EX waiter in the wait queue; once the threshold has been reached, the SH waiters queue up behind the EX waiter. While such operations allow for some efficiency, a problem exists: the “threshold” is determined by a lock manager in the system and may not be set to an optimal value based on actual request behavior, which can change rapidly and render a particular threshold unsuitable for a particular situation.
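A threshold policy of this kind might be sketched as follows. The text does not say how the threshold is measured, so this example assumes it is a count of SH requests allowed to bypass a waiting EX request; the names `ThresholdLatch`, `skip_threshold`, and `skips` are hypothetical, and release handling is omitted for brevity.

```python
from collections import deque


class ThresholdLatch:
    """Latch that lets SH requests bypass a waiting EX request a limited number of times."""

    def __init__(self, skip_threshold):
        self.skip_threshold = skip_threshold  # assumed: max SH grants past a waiter
        self.sh_owners = []
        self.ex_owner = None
        self.waiters = deque()
        self.skips = 0   # SH grants so far that bypassed the wait queue

    def request(self, task, mode):
        """Return True if granted immediately, False if queued."""
        if mode == "SH" and self.ex_owner is None:
            if not self.waiters:
                # Nobody is waiting: grant without counting a skip.
                self.sh_owners.append(task)
                return True
            if self.skips < self.skip_threshold:
                # Below the threshold: bypass the queued EX waiter.
                self.skips += 1
                self.sh_owners.append(task)
                return True
        elif (mode == "EX" and self.ex_owner is None
              and not self.sh_owners and not self.waiters):
            self.ex_owner = task
            return True
        # Threshold reached or request incompatible: queue in arrival order.
        self.waiters.append((task, mode))
        return False
```

With `skip_threshold=2`, after T1 holds the latch in SH mode and T2 is waiting for EX, the next two SH requests are granted ahead of T2 and the third is queued behind it. A fixed value such as 2 is exactly the kind of setting that may suit one request pattern and not another.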
Accordingly, a need exists for an approach to locking retained resources in a distributed system that avoids the limitations and shortcomings of prior approaches and provides more optimized control for physical access. The present invention addresses this and other needs.