A cache structure is a high-speed cache shared by one or more independently-operating computing units of a computing environment. In particular, cache structures are located within a remote facility, referred to as a coupling facility, that is coupled to the one or more independently-operating computing units. The computing units store data in, and retrieve data from, the cache structures.
Coupling facility cache structures can be configured in several different modes of operation, one of which is a store-in mode. Store-in mode caches are used, for example, by the DB2 database management facility of International Business Machines Corporation. A key attribute of the store-in mode is that changed data may be stored into the non-volatile memory of the coupling facility using the high-performance coupling facility links. This avoids the delay in the execution of database transactions that results when the data is written to secondary storage (e.g., direct access storage devices (DASD)) using normal input/output (I/O) operations, and is an advantage of the coupling facility cache.
The store-in mode requires, however, that the database program provide background tasks, called castout processes, that periodically write the changed data to the database locations on DASD using batched I/O operations. In order to enhance the performance of the I/O operations, the changed data are grouped by physical DASD volume into castout classes.
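The grouping of changed data into castout classes can be illustrated with a minimal sketch. The names `ChangedPage` and `group_into_castout_classes`, and the volume labels, are hypothetical and serve only to illustrate the grouping; they are not part of any coupling facility interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class ChangedPage(NamedTuple):
    page_id: str  # hypothetical page identifier
    volume: str   # hypothetical label of the DASD volume holding the page


def group_into_castout_classes(pages: Iterable[ChangedPage]) -> Dict[str, List[str]]:
    """Group changed pages by DASD volume; each group models one castout class."""
    classes: Dict[str, List[str]] = defaultdict(list)
    for page in pages:
        classes[page.volume].append(page.page_id)
    return dict(classes)


pages = [
    ChangedPage("P1", "VOL001"),
    ChangedPage("P2", "VOL002"),
    ChangedPage("P3", "VOL001"),
]
castout_classes = group_into_castout_classes(pages)
```

Under these assumptions, a castout process could write all pages of one class in a single batched I/O operation directed at one volume.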
The castout processes provide two key functions. First, they ensure that the cache does not become full of changed data, and thus keep the percentage of changed versus unchanged data in the cache at or below certain thresholds. Second, the castout processes ensure that the amount of time a database page can reside in the cache in the changed state is bounded by some predetermined value. This bound is critical in those cases in which the coupling facility fails and database logs are needed to recover the lost data. Bounding the amount of time a page can remain changed in the cache bounds how far back in time the logs must be processed, and thus bounds the amount of time needed to perform the recovery operation. Bounding this recovery time is critical in meeting the availability goals of the database.
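The two functions can be expressed as a simple decision, sketched below. The function `castout_needed` and its parameters (`max_changed_fraction`, `max_changed_age`) are assumptions made for illustration; the text above does not specify particular threshold values or how they are configured.

```python
from typing import Optional


def castout_needed(
    changed_count: int,
    total_count: int,
    oldest_changed_ts: Optional[float],
    now: float,
    max_changed_fraction: float,
    max_changed_age: float,
) -> bool:
    """Return True if either castout condition is violated.

    All parameters are hypothetical; timestamps and ages share one time unit.
    """
    # Function 1: keep the fraction of changed data at or below a threshold.
    too_full = total_count > 0 and changed_count / total_count > max_changed_fraction
    # Function 2: bound how long any page may remain in the changed state.
    too_old = (
        oldest_changed_ts is not None
        and now - oldest_changed_ts > max_changed_age
    )
    return too_full or too_old
```

The second condition depends on knowing the timestamp of the oldest changed page, which is why determining that age efficiently matters.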
In order to satisfy the bounds, it is important for the castout processes to be able to efficiently determine the age of the oldest changed page in the cache. This determination cannot be based solely on the arrival pattern of the requests, since the timestamps associated with the transactions and stored in the database log do not correspond to the arrival of the data at the coupling facility; the two occur at different times. Also, the coupling facility services numerous coupling facility links in parallel, and so the sequence in which requests arrive at the facility does not correspond to the sequence in which they are executed. Thus, more information is required. This additional information is provided in an object associated with each changed page written to the coupling facility, called the user data field (UDF). The UDF object contains the timestamp that the database associates with the changed page. The UDF object is stored in a directory entry associated with the changed page and can be retrieved by issuing a read directory command.
Conventionally, the castout processes determine the oldest timestamp value in the directory by scanning the entire directory each time a castout process executes. This requires issuing numerous read directory commands to retrieve all of the UDF values in the cache, so that the oldest changed page can be identified. Since the directory for a large cache may contain several million directory entries, significant overhead is incurred in the coupling facility when the directory scan is executed. Further, a castout process needs to execute every few minutes, and thus the overhead is exacerbated. It has even been necessary to purchase additional central processing unit (CPU) engines for the coupling facility to satisfy the scan.
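The cost of the conventional approach can be seen in a minimal sketch of a full directory scan. The `DirectoryEntry` type and the function `scan_for_oldest_changed` are hypothetical models; in particular, counting one read per entry is a simplifying assumption, since an actual read directory command may return more than one entry.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class DirectoryEntry:
    name: str
    changed: bool
    udf: Optional[int]  # timestamp the database associates with a changed page


def scan_for_oldest_changed(
    directory: Iterable[DirectoryEntry],
) -> Tuple[Optional[int], int]:
    """Visit every directory entry and return (oldest changed timestamp, entries read)."""
    reads = 0
    oldest: Optional[int] = None
    for entry in directory:
        reads += 1  # every entry is read, whether or not it is changed
        if entry.changed and entry.udf is not None:
            if oldest is None or entry.udf < oldest:
                oldest = entry.udf
    return oldest, reads


directory = [
    DirectoryEntry("A", changed=True, udf=250),
    DirectoryEntry("B", changed=False, udf=None),
    DirectoryEntry("C", changed=True, udf=100),
    DirectoryEntry("D", changed=True, udf=175),
]
oldest, reads = scan_for_oldest_changed(directory)
```

In this model the number of entries read always equals the directory size, so the work grows linearly with the directory and is repeated on every castout cycle.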
Based on the foregoing, a need exists for a capability that enables the efficient determination of the age of the oldest changed page in a cache structure without scanning the entire cache directory. A further need exists for a capability that manages changed data in a cache structure in a manner that does not degrade system performance. A yet further need exists for a capability that manages the changed data in a manner that enhances castout processing.