1. Field of the Invention
This invention relates generally to system complexes embracing multiple concurrent multi-user database processing systems coupled to a shared external database store and particularly to a method for refreshing a stale Local Cache Buffer (LCB) data page that serializes stale page detection with re-reading of the fresh page from shared external nonvolatile store (NVS) to maintain multisystem cache coherency under both record and page locking granularities.
2. Description of the Related Art
Modern data processing system architecture provides for multiple Data Base Management System (DBMS) instances, each servicing many concurrent users, for processing a plurality of databases in a system complex (sysplex) that includes a plurality of Central Processing Complexes (CPCs) each having Local Cache Buffers (LCBs) commonly connected to a Shared Electronic Store (SES) that includes nonvolatile storage (NVS) means for storing data shared by the CPCs. The SES serves to "cache" data pages from a shared external store that includes stable Direct Access Storage Devices (DASDs) and the like. Because DASD data transfers are slowed by mechanical actuator latency, the SES caching function can significantly improve external store performance while offering complete data stability by virtue of its nonvolatile character. Such a multisystem environment is described in copending U.S. patent application Ser. No. 07/860,805 filed on Mar. 30, 1992 by D. Elko et al. as "SYSPLEX SHARED DATA COHERENCY METHOD AND MEANS", (Assignee Docket PO9-91-052), assigned to the Assignee hereof and entirely incorporated herein by this reference.
In the database sysplex described by Elko et al., wherein a plurality of independently-operating CPCs share data, global locking imposed by a global locking manager (GLM) is required to maintain data coherency within the different CPCs. The data coherency problem arises because sharing data among a proliferation of processors may create multiple inconsistent data page copies because of multiple paths to the data and because of opportunities to locally modify the data. The above-cited Elko et al. patent application describes a sysplex architecture in which each CPC operates with a storage hierarchy that may include a private high-speed hardware cache in each Central Processing Unit (CPU) of a CPC, a shared hardware cache accessible to all of the private CPU caches within a single CPC, a main store (MS) shared by all CPUs in a single CPC, a hardware storage area (HSA) within each CPC that is associated with the local MS but excluded from MS address space, an expanded store (ES) in each CPC coupled to the local MS and a DASD director coupled to the local MS for controlling DASD resources collocated with a single CPC.
A multiprocessor sysplex includes a plurality of such CPCs, where the various DASD directors operate to control data flow between all CPCs and all DASD resources so that any CPC can access any record on any DASD, including records written by other CPCs in the sysplex. Each CPC has one or more operating systems, where local CPC resources are logically partitioned among the operating system plurality. Within each CPC, the MS/ES storage combination is considered to be a single random access store internal to the CPC, where data pages in MS are backed by stable pages in ES and DASD in the usual manner. Some or all of the CPCs in a sysplex are connected to a Shared Electronic Store (SES), each by a channel connected to the corresponding local MS. In a hardware sense, a SES may be considered to be a large random access memory (RAM) that may be used (but need not be) in common by some or all CPCs connected to the SES. Connected CPCs may use the SES to store shared data records and pages on a temporary or semi-permanent basis. Thus, SES may be considered as a component of the storage hierarchy in the sysplex, having a hierarchy level common to all connected CPCs that roughly corresponds to the local ES level within each CPC.
A fundamental feature of this sysplex architecture is the use of SES as a high-speed cache for data normally stored in the sysplex common DASD resource even though the CPC-SES-DASD physical connection may not be organized as a direct hierarchy. Any CPC in the sysplex can access a record much faster from SES than it can from common DASD storage because the SES hardware speed is much faster than DASD access speed. Because SES includes nonvolatile storage (NVS) and a cache cross-invalidation capability, a DBMS instance can perform fast writes of modified ("dirty") LCB data pages to the SES and thereby satisfy the "force-to-stable-storage" requirement necessary for releasing global transaction locks. The modified data pages in SES may then be asynchronously destaged to DASD without holding up the release of global transaction locks, thereby increasing processing efficiency without compromising the normal database recovery features that rely on "stable storage" of updated data pages.
Practitioners have proposed other strategies for reducing global locking overhead in a multiprocessor sysplex. For instance, in U.S. patent application No. 07/869,267 now U.S. Pat. No. 5,408,653 filed on Apr. 15, 1992 as "EFFICIENT DATABASE ACCESS USING A SHARED ELECTRONIC STORE IN A MULTISYSTEM ENVIRONMENT WITH SHARED DISKS", commonly assigned to the Assignee hereof and entirely incorporated herein by this reference, Josten et al. describe a protocol whereby, subject to a requirement for no intersystem read-write interest in the subject database, a single DBMS instance employs a "no-force-at-commit" protocol permitting it to write database updates to external storage asynchronously (e.g., in "batch" mode) after releasing global locks during transaction-commit processing. This flexibility improves overall transaction response time and reduces the global lock hold time, improving concurrency. However, the requirement for intersystem cache "coherency" requires a local buffer manager (BM) within a CPC to enforce a "force-at-commit" protocol whenever it detects intersystem read-write interest in the database being processed. The force-at-commit policy requires the DBMS instance to force (write) all modified (dirty) LCB data pages to stable external storage before releasing the committing transaction locks on those data pages. Moreover, before the global locks are released, all other "interested" systems must be notified of the forced data page updates through a cross-invalidation protocol controlled by the SES element of the sysplex. The Write-Ahead Logging (WAL) protocol is employed within each DBMS to support database recovery after failures.
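The difference between the two commit protocols described above is one of ordering, which may be illustrated by the following minimal sketch. The function name commit and the step strings are hypothetical labels introduced only for this illustration; they are not drawn from Josten et al. or from any DBMS interface.

```python
def commit(dirty_pages, intersystem_rw_interest):
    """Return the ordered commit steps for one transaction (illustrative only)."""
    steps = []
    if intersystem_rw_interest:
        # force-at-commit: pages reach stable storage and other systems are
        # notified before the global locks are released
        steps += [f"force page {p} to stable storage" for p in dirty_pages]
        steps.append("cross-invalidate other interested systems")
        steps.append("release global locks")
    else:
        # no-force-at-commit: locks are released first, writes follow later
        steps.append("release global locks")
        steps += [f"write page {p} asynchronously" for p in dirty_pages]
    return steps


if __name__ == "__main__":
    for interest in (True, False):
        print(interest, commit(["P1", "P2"], interest))
```

With read-write interest the lock release is the last step; without it the lock release is the first step, which is the source of the shorter lock hold time.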
The above-cited Elko et al. patent application describes a detailed system for ensuring data coherency in the Buffer Pool (BP) made up of the combination of all local cache buffers (LCBs) for a plurality of DBMS instances in all CPCs. Their method relies on data page storage and/or registration within SES of all LCB data page copies, among other important elements. To prevent multiple-version contamination (loss of coherency), any DBMS instance in a CPC wanting to access (read or write) a record in the sysplex common DASD pool must first register the data page containing such record in a SES directory and preferably read the data page from the SES cache if it exists there. Because each DBMS instance operates to maintain local cache coherency independently within the sysplex, the SES imposes global coherency controls that operate additionally to such internal coherency controls. SES causes each DBMS instance in a CPC to maintain Local Validity Vectors (LVVs) within the HSA of each CPC, each LVV corresponding to a LCB of a DBMS. Each CPC sets a vector bit from a valid ("fresh") state to an invalid ("stale") state responsive to a SES message showing a change in the latest global version of the corresponding data page registered with SES, as part of the cross-invalidation procedure.
For instance, a DBMS instance in a CPC making the change to a record in a data page writes the changed page to SES. SES then consults a table to determine which other DBMS instances in other CPCs are holding a copy of the same data page in their LCBs and sends a message to each holding CPC to reset the corresponding bit in the LVV held in the HSA of the holding CPC. Importantly, because the HSA exists outside of the MS address space within the CPC, special hardware instructions are required to set, reset and test the LVV. If SES causes the CPC to set any vector bit to the invalid state in the CPC's HSA, the CPC program in execution is not disturbed or alerted in any way by such setting when it happens. CPC programming continues without interruption or notice. However, this means that each CPC program is independently responsible for testing the LVV bit state when necessary to determine if any data page invalidation has occurred. More importantly, the CPC program is also responsible for "serializing" such validity testing with other page operations, such as the refreshing of an invalid data page in LCB, to ensure that the subsequent operation actually reflects the supposed page "freshness" determined by the LVV bit test.
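The cross-invalidation flow just described may be modeled by the following minimal sketch. The classes Cpc and Ses, the method names, and the four-slot vector are hypothetical and are introduced only for illustration; the real LVV resides in the HSA and is manipulated by special hardware instructions rather than by ordinary stores as shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Cpc:
    """Hypothetical CPC with a Local Validity Vector, one bit per LCB slot."""
    name: str
    lvv: list = field(default_factory=lambda: [False] * 4)  # False = invalid


@dataclass
class Ses:
    """Hypothetical SES: a page cache plus a directory of registered holders."""
    pages: dict = field(default_factory=dict)      # page_id -> contents
    directory: dict = field(default_factory=dict)  # page_id -> {name: (cpc, slot)}

    def read_and_register(self, cpc, slot, page_id):
        # Record that this CPC holds a copy in the given LCB slot.
        self.directory.setdefault(page_id, {})[cpc.name] = (cpc, slot)
        return self.pages.get(page_id)

    def write(self, writer, page_id, data):
        self.pages[page_id] = data
        # Consult the directory and invalidate every other holder's bit.
        for name, (cpc, slot) in self.directory.get(page_id, {}).items():
            if name != writer.name:
                cpc.lvv[slot] = False  # silent: the holder is not interrupted


if __name__ == "__main__":
    ses, a, b = Ses(), Cpc("A"), Cpc("B")
    ses.pages["P1"] = "v1"
    b.lvv[0] = True                           # B marks slot 0 valid, then reads
    local_copy = ses.read_and_register(b, 0, "P1")
    ses.write(a, "P1", "v2")                  # A updates the page in SES
    print(local_copy, b.lvv[0])               # B must test the bit to notice
```

Nothing in the model notifies CPC B of the change; B learns that its copy is stale only by testing its own vector bit, which is why the testing must be serialized with other page operations.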
Each CPC controls the state of each LVV bit, corresponding to a data page in LCB, and normally sets this bit to "invalid" responsive to a message from SES signaling that an update to the same data page by some other CPC has made the LCB copy stale. Before permitting a transaction to read or write to a LCB data page, the CPC application programming tests the LVV validity bit and, responsive to finding an invalid setting, "refreshes" the LCB data page copy by sending a message to SES asking for the latest copy, which also results in registering with SES that the "fresh" LCB copy is cached in a DBMS instance in a CPC. It can be readily appreciated that the related message flow between each CPC and the SES must be serialized to avoid loss of data consistency of a cached data page in different instances of a DBMS.
For instance, because it is possible for a SES-supported CPC to receive and execute a cross-invalidate command against a designated LCB page after interest in the page has been registered at the SES but before the response to the read command under which the registration took place is received, the update to the LVV by the CPC application programming must be serialized with execution of the SES command. The application programming must ensure that the LVV is not set to "valid" (without a new refresh) after it is set to "invalid" by an intervening cross-invalidate command. Otherwise, invalidation of a LCB data page may be undetected and result in the loss of data integrity.
A problem arises because the LVV bit is set to "valid" before the related "refresh" (read-and-register) request is sent by CPC to SES, a procedural requirement imposed to ensure that the local LVV bit entries never cause a data integrity problem within the CPC arising from "missed" cross-invalidate commands. The alternative method of setting the LVV bit to "valid" after reading the page from SES is unacceptable because it would destroy the effect of an intervening cross-invalidate command from SES. By first setting the bit to "valid", the CPC ensures that an intervening cross-invalidate command from SES received during the refresh cycle leaves the bit "invalid" and is therefore detected.
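The effect of the two orderings may be illustrated by the following minimal sketch, in which a cross-invalidate command is assumed to arrive after the page is read at SES but before the refresh completes. The function names refresh and demo and the page values are hypothetical and serve only this illustration.

```python
VALID, INVALID = True, False


def refresh(lvv, slot, read_page, valid_first):
    """Refresh one LCB slot; return (page read, LVV bit state afterwards)."""
    if valid_first:
        lvv[slot] = VALID   # bit set before the read-and-register request
    page = read_page()      # a cross-invalidate may land during this call
    if not valid_first:
        lvv[slot] = VALID   # unsafe: overwrites any intervening invalidation
    return page, lvv[slot]


def demo(valid_first):
    lvv = [INVALID]

    def read_page():
        page = "v1"         # SES registers interest and returns its current copy
        lvv[0] = INVALID    # another CPC then writes "v2"; cross-invalidate arrives
        return page

    return refresh(lvv, 0, read_page, valid_first)


if __name__ == "__main__":
    print(demo(valid_first=True))   # bit ends invalid, staleness is detectable
    print(demo(valid_first=False))  # bit ends valid although the copy is stale
```

In both cases the page returned is already out of date; only the valid-first ordering leaves the bit in the invalid state so that a later test can discover this.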
While this restriction ensures data coherency across the multisystem, it introduces a separate problem for concurrent transactions within a single DBMS instance in the CPC. That is, after the LVV bit is set to "valid" but before the data page refresh request sent to SES has completed, a second concurrent (local) transaction may ask for access to the same data page in LCB. When this second transaction queries the LVV bit, it erroneously finds the stale LCB data page copy to be "valid". When the first local transaction holds an exclusive page lock, this situation is not a problem. However, for record locking granularity or shared page locking, the second local transaction is not blocked from access to the same data page.
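The local window described above may be illustrated by the following minimal sketch, which steps through the interleaving deterministically rather than with real threads. The class LocalBuffer and the function names are hypothetical, and no locking is modeled, which corresponds to the record locking or shared page locking case.

```python
class LocalBuffer:
    """Hypothetical single LCB slot with its LVV bit."""

    def __init__(self):
        self.lvv_valid = False
        self.page = "v1 (stale)"


def begin_refresh(buf):
    buf.lvv_valid = True        # bit is set valid before the request is sent


def complete_refresh(buf, fresh_page):
    buf.page = fresh_page       # fresh copy arrives from SES


def read_if_valid(buf):
    """A second local transaction that trusts the LVV bit."""
    return buf.page if buf.lvv_valid else None


def race():
    buf = LocalBuffer()
    begin_refresh(buf)                  # first transaction starts the refresh
    seen = read_if_valid(buf)           # second transaction runs in the window
    complete_refresh(buf, "v2 (fresh)")
    return seen, buf.page


if __name__ == "__main__":
    print(race())
```

The second transaction obtains the stale contents even though the bit test succeeded, because the bit test and the page read are not serialized with the refresh in progress.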
These cache coherence control problems do not exist in single multi-user systems. They arise as a result of the improved multisystem shared disk environment. The serialization and local coherence control procedures known in the art for preventing multiple-version data page contamination disadvantageously impose efficiency burdens that tend to negate much of the efficiency advantage offered by the sysplex multisystem shared external store architecture. The related unresolved problems and deficiencies are clearly felt in the art and are solved by this invention in the manner described below.