This invention relates in general to the field of data processing and, in particular, to the duplexing of cache structures located within a coupling facility of a computing environment.
This application contains subject matter which is related to the subject matter of the following patents/applications which are assigned to the same assignee as this application. Each of the below listed patents/applications is hereby incorporated herein by reference in its entirety:
“Castout Processing For Duplexed Cache Structures”, Elko et al., Ser. No. 09/255,383, filed herewith;
“Method And System For Reconfiguring A Storage Structure Within A Structure Processing Facility,” Allen et al., U.S. Pat. No. 5,515,499, Issued May 7, 1996;
“Multiple Processor System Having Software For Selecting Shared Cache Entries Of An Associated Castout Class For Transfer To A DASD With One I/O Operation,” Elko et al., U.S. Pat. No. 5,493,668, Issued on Feb. 20, 1996;
“Software Cache Management Of A Shared Electronic Store In a Supplex,” Elko et al., U.S. Pat. No. 5,457,793, Issued Oct. 10, 1995;
“Method, System And Program Products For Managing Changed Data Of Castout Classes,” Elko et al., Ser. No. 09/251,888, Filed: Feb. 19, 1999;
“Sysplex Shared Data Coherency Method,” Elko et al., U.S. Pat. No. 5,537,574, Issued Jul. 16, 1996;
“Method And Apparatus For Coupling Data Processing Systems” Elko, et al. U.S. Pat. No. 5,317,739, Issued May 31, 1994;
“In A Multiprocessing System Having A Coupling Facility, Communicating Messages Between The Processors And The Coupling Facility In Either A Synchronous Operation Or An Asynchronous Operation”, Elko et al., U.S. Pat. No. 5,561,809, Issued on Oct. 1, 1996;
“Mechanism For Receiving Messages At A Coupling Facility”, Elko et al., U.S. Pat. No. 5,706,432, Issued Jan. 6, 1998;
“Coupling Facility For Receiving Commands From Plurality Of Hosts For Activating Selected Connection Paths To I/O Devices And Maintaining Status Thereof”, Elko et al., U.S. Pat. No. 5,463,736, Issued Oct. 31, 1995;
“A Method And System For Managing Data and Users of Data in a Data Processing System,” Allen et al., U.S. Pat. No. 5,465,359, Issued on Nov. 7, 1995;
“Shared Access Serialization Featuring Second Process Lock Steal And Subsequent Write Access Denial To First Process” Insalaco et al, U.S. Pat. No. 5,305,448, Issued on Apr. 19, 1994;
“Method Of Managing Resources In One Or More Coupling Facilities Coupled To One Or More Operating Systems In One Or More Central Programming Complexes Using A Policy,” Allen et al., U.S. Pat. No. 5,634,072, Issued On May 27, 1997;
“Partial Page Write Detection For A Shared Cache Using A Bit Pattern Written At The Beginning And End Of Each Page”; Narang et al., U.S. Pat. No. 5,455,942, Issued Oct. 3, 1995;
“Method For Managing Database Recovery From Failure Of A Shared Store In a System Including A Plurality Of Transaction-Based Systems Of The Write-Ahead Logging Type”, Narang et al., U.S. Pat. No. 5,280,611, Issued Jan. 18, 1994; and
“Method And Apparatus Of Distributed Locking For Shared Data, Employing A Central Coupling Facility”, U.S. Pat. No. 5,339,427, Issued Aug. 16, 1994.
A cache structure is a high-speed cache shared by one or more independently-operating computing units of a computing environment. In particular, cache structures are located within a remote facility, referred to as a coupling facility, that is coupled to the one or more independently-operating computing units. The computing units store and retrieve data from the cache structures.
Coupling facility cache structures can be configured in several different modes of operation, one of which is a store-in mode. Store-in mode caches are used, for example, by the DB2 database management facility of International Business Machines Corporation. A key attribute of the store-in mode is that changed data may be stored into the non-volatile memory of the coupling facility using the high-performance coupling facility links. This avoids the delay in the execution of database transactions that results when the data is written to secondary storage (e.g., direct access storage devices (DASD)) using normal input/output (I/O) operations, and is an advantage of the coupling facility cache.
Subsystems that cache changed data in a coupling facility cache face a unique recovery/availability problem, which is not faced by those that either do not cache data or cache only unchanged data. For example, when a data item is modified and written as changed data only to the coupling facility cache structure, a subsequent failure of the coupling facility cache structure can cause the only existing current level of the data item to be lost. This results in a loss of data integrity. This window of exposure exists from the time the data item is written to the coupling facility cache until it is eventually cast out to permanent storage, which may be a considerable time. At any given instant, a significant percentage of the data stored in the coupling facility cache structure may be in this changed state, and thus vulnerable to loss should the coupling facility cache structure be lost.
To recover from such failures, subsystems have made use of recovery logs, which are hardened on permanent storage. Basically, during normal operation, as a given subsystem instance modifies a data item, it first writes a description of the data item update to its own recovery log along with a unique ordering indication (typically, a timestamp) showing when the update to the data was made relative to the other updates. Then, when the log update is complete, it writes the updated data item to the coupling facility cache structure. Given this, if the cache structure fails, a recovery process can reconstruct the most current version of the data by merging the recovery logs of all subsystem instances so that updates made by all instances can be observed; locating the most current copy of each data item in the log, using the ordering information associated with each of the logged updates; and writing the most current copy of each of the data items to permanent storage.
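The log-based recovery described above can be illustrated with a minimal sketch. The names `LogRecord` and `recover` are hypothetical, and the sketch assumes that each logged update carries the full new content of the data item and an integer timestamp as its ordering indication; actual recovery logs may record updates differently.

```python
from typing import Dict, Iterable, List, NamedTuple


class LogRecord(NamedTuple):
    """One logged update (hypothetical layout, assumed for illustration)."""
    timestamp: int  # ordering indication written with the update
    item: str       # name of the data item that was updated
    value: str      # assumed: full content of the item after the update


def recover(logs: Iterable[List[LogRecord]]) -> Dict[str, str]:
    """Merge per-instance recovery logs and keep the newest copy of each item."""
    latest: Dict[str, LogRecord] = {}
    for log in logs:                      # observe updates made by all instances
        for record in log:
            current = latest.get(record.item)
            if current is None or record.timestamp > current.timestamp:
                latest[record.item] = record
    # The returned mapping stands in for what is written to permanent storage.
    return {item: record.value for item, record in latest.items()}


# Two subsystem instances updated the same items at different times.
log_a = [LogRecord(1, "page1", "a1"), LogRecord(4, "page2", "a4")]
log_b = [LogRecord(3, "page1", "b3"), LogRecord(2, "page2", "b2")]
recovered = recover([log_a, log_b])
```

Every record of every log is scanned in this sketch, which suggests why such recovery can take a long time when the logs are large.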
While the above approach allows the data to be recovered following the failure of a coupling facility cache structure, it is not an adequate solution for providing continuous availability of the shared data and of the coupling facility cache structure across such failures. The log merge and recovery update processing can take a long time, during which time the database is entirely unavailable for use by end users.
Thus, a need exists for a recovery technique that allows recovery from a failure with little or no perceived unavailability of the data to the end users. A further need exists for a mechanism that allows selected data to be duplexed. A yet further need exists for a mechanism that allows duplexing to be turned on and off automatically. A yet further need exists for a technique that enables a switch from duplex mode to simplex mode to be performed quickly.
The shortcomings of the prior art are overcome and additional advantages are provided through the provision of a duplexing method. In one embodiment, the duplexing method includes writing data to a primary instance of a data structure; and selectively writing a portion of the data to a secondary instance of the data structure, wherein the secondary instance is usable as a copy of the primary instance, but contains less data than the primary instance.
The duplexing capability of the present invention advantageously provides for improved availability of data, such as cache structure data. Duplexing can be initiated on a per-structure basis, either manually or automatically. Once duplexing is initiated, the operating system drives the structure users to temporarily quiesce access to the structure; allocate a secondary structure instance in, for example, a different coupling facility from the primary structure instance; copy any necessary structure data from the primary instance to the secondary instance, establishing a duplexed copy of the structure data; and unquiesce access to the structure with duplexing established.
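The sequence for establishing duplexing can be sketched as follows. This is an illustrative model only: the classes `Structure` and `DuplexedCache` and the method `start_duplexing` are hypothetical names, and the sketch assumes that the "necessary" structure data consists of the entries currently marked changed.

```python
class Structure:
    """A cache structure instance residing in a named coupling facility."""

    def __init__(self, facility):
        self.facility = facility
        self.entries = {}  # entry name -> (data, changed flag)


class DuplexedCache:
    def __init__(self, primary):
        self.primary = primary
        self.secondary = None
        self.quiesced = False

    def start_duplexing(self, other_facility):
        self.quiesced = True                      # temporarily quiesce access
        try:
            # Allocate the secondary instance in a different facility.
            secondary = Structure(other_facility)
            # Copy the necessary data (assumed here: changed entries only).
            for name, (data, changed) in self.primary.entries.items():
                if changed:
                    secondary.entries[name] = (data, True)
            self.secondary = secondary            # duplexing is established
        finally:
            self.quiesced = False                 # unquiesce access


cache = DuplexedCache(Structure("CF1"))
cache.primary.entries["a"] = ("v1", True)    # changed, not yet cast out
cache.primary.entries["b"] = ("v2", False)   # unchanged, already on DASD
cache.start_duplexing("CF2")
```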
Once duplexing is established, the user explicitly duplexes any necessary updates to both the primary and secondary structure instances to maintain synchronization.
When a structure failure or loss of connectivity affects one of the structure instances, the operating system drives the structure users to revert to simplex mode on the unaffected structure instance. The switch to simplex mode is very fast, with no data loss and no log recovery needed. Duplexing may then be reinitiated for the structure either automatically or manually.
When changed data in the cache structure is cast out, causing the corresponding data entries to be marked unchanged, the present invention advantageously deletes those entries from the secondary structure.
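The duplexed write, castout, and switch-to-simplex behavior summarized above can be modeled with a short sketch. The class `DuplexPair` and its methods are hypothetical, plain dictionaries stand in for the structure instances and for permanent storage, and the sketch assumes that only changed data is written to the secondary instance.

```python
class DuplexPair:
    """Toy model of a primary and secondary structure instance."""

    def __init__(self):
        self.primary = {}     # entry name -> (data, changed flag)
        self.secondary = {}   # holds changed entries only (assumption)
        self.duplexed = True

    def write(self, name, data, changed):
        self.primary[name] = (data, changed)
        if self.duplexed and changed:
            # Selectively duplex: only changed data goes to the secondary.
            self.secondary[name] = (data, True)

    def castout(self, name, permanent_storage):
        data, _ = self.primary[name]
        permanent_storage[name] = data        # harden the data
        self.primary[name] = (data, False)    # entry is now marked unchanged
        if self.duplexed:
            self.secondary.pop(name, None)    # delete it from the secondary

    def fail_primary(self):
        # Revert to simplex mode on the surviving instance; no log merge.
        self.primary = self.secondary
        self.secondary = {}
        self.duplexed = False


pair = DuplexPair()
dasd = {}
pair.write("p1", "v1", changed=True)
pair.write("p2", "v2", changed=False)
pair.write("p3", "v3", changed=True)
pair.castout("p1", dasd)
pair.fail_primary()
```

After the simulated failure, the surviving instance holds only `p3`, the one entry whose current level was not yet on permanent storage, which is why the secondary can contain less data than the primary and still serve as a usable copy.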
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.