1. Field
The present description relates to a method, system, and program product for prestaging data into cache from a storage system in preparation for data transfer operations.
2. Description of Related Art
In current storage systems, a storage controller manages data transfer operations between a direct access storage device (DASD), which may be a string of hard disk drives or other non-volatile storage devices, and host systems and their application programs. To execute a read operation presented from a host system, i.e., a data access request (DAR), the storage controller physically accesses the data stored in tracks in the DASD. A DAR specifies a set of contiguous data sets, such as tracks, records, fixed blocks, or any other grouping of data. The process of physically rotating the disk in the DASD to the requested track and then physically moving the reading unit to the disk section to read the data is often time-consuming. For this reason, current systems frequently stage data into a cache memory of the storage controller in advance of the host requests for such data.
In addition, storage systems may also include solid state storage devices such as flash memory. However, flash memory tends to be substantially more expensive than DASD storage.
A storage controller typically includes a large buffer managed as a cache to buffer data accessed from the attached storage. In this way, data access requests (DARs) can be serviced at electronic speeds directly from the cache, thereby avoiding the delays caused by reading from storage such as DASD storage. Prior to receiving the actual DAR, the storage controller receives information identifying a sequence of tracks in the DASD involved in the upcoming read operation, or indicating that data is being accessed sequentially. The storage controller then stages the sequential tracks into the cache and processes subsequent DARs by accessing the staged data in the cache. In this manner, the storage controller can return cached data to a read request at the data transfer speed of the storage controller channels, as opposed to non-cached data, which is transferred at the speed of the DASD device or other storage from which the data is read.
During a sequential read operation, an application program, such as a batch program, will process numerous data records stored at contiguous locations in the storage device. It is desirable during such sequential read operations to prestage the sequential data into cache in anticipation of the requests from the application program. As a result of prestaging, often one of the tracks being staged is already in cache. Present techniques used to prestage sequential blocks of data include sequential caching algorithm systems, such as those described in the commonly assigned patent entitled “CACHE DASD Sequential Staging and Method,” having U.S. Pat. No. 5,426,761. A sequential caching algorithm detects when a device is requesting data as part of a sequential access operation. Upon making such a detection, the storage controller will begin prestaging sequential data records following the last requested data record into cache in anticipation of future sequential accesses.
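The detect-then-prestage behavior described above can be illustrated with a minimal sketch. It is not the algorithm of U.S. Pat. No. 5,426,761; the class name `SequentialDetector`, the `threshold` of consecutive requests, and the `prestage_count` are hypothetical choices made only for illustration.

```python
class SequentialDetector:
    """Flags an access pattern as sequential after a run of consecutive tracks.

    Hypothetical sketch: `threshold` and `prestage_count` are assumed tuning
    values, not parameters taken from any cited patent.
    """

    def __init__(self, threshold=3, prestage_count=4):
        self.threshold = threshold            # consecutive requests needed
        self.prestage_count = prestage_count  # tracks to prestage once detected
        self.last_track = None
        self.run_length = 0

    def on_request(self, track):
        """Record a requested track and return the tracks to prestage, if any."""
        if self.last_track is not None and track == self.last_track + 1:
            self.run_length += 1
        else:
            self.run_length = 1  # a non-consecutive request starts a new run
        self.last_track = track

        if self.run_length >= self.threshold:
            # Prestage the tracks that follow the last requested track.
            first = track + 1
            return list(range(first, first + self.prestage_count))
        return []
```

With the assumed defaults, requests for tracks 10, 11, and 12 would trigger prestaging of tracks 13 through 16, while an isolated request for track 50 would reset the run and prestage nothing.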
U.S. Pat. No. 5,426,761 discloses an algorithm for sequential read operations that prestages a track into cache when the track is within the extent range of the sequential read operation, the track is not already in the cache, and the maximum number of tracks has not yet been prestaged into cache. The cached records may then be returned to the application performing the sequential data operations at speeds substantially faster than retrieving the records from a non-volatile storage device.
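The three prestaging conditions just listed can be expressed as a short sketch. The function name `tracks_to_prestage` and all of its parameters are hypothetical, and the code is only an illustration of the stated conditions, not the implementation disclosed in the cited patent.

```python
def tracks_to_prestage(next_track, extent_end, cache, prestaged_count, max_prestaged):
    """Select tracks that satisfy all three prestaging conditions.

    Hypothetical helper: `cache` is any container of track numbers already in
    cache, `extent_end` is the last track of the extent range (inclusive), and
    `prestaged_count` is the number of tracks already prestaged.
    """
    selected = []
    track = next_track
    # Conditions 1 and 3: stay inside the extent range and under the maximum.
    while track <= extent_end and prestaged_count + len(selected) < max_prestaged:
        # Condition 2: skip any track that is already in the cache.
        if track not in cache:
            selected.append(track)
        track += 1
    return selected
```

For example, with track 6 already cached, an extent ending at track 10, and an assumed limit of three prestaged tracks, a call starting at track 5 would select tracks 5, 7, and 8.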
Another prestaging technique includes specifying a block of contiguous data records to prestage into cache in anticipation of a sequential data request. For instance, the Small Computer System Interface (SCSI) provides a prestage or prefetch command, PRE-FETCH, that specifies a logical block address where the prestaging operation begins and a transfer length of contiguous logical blocks of data to transfer to cache. The SCSI PRE-FETCH command is described in the publication “Information Technology-Small Computer System Interface-2,” published by ANSI on Apr. 19, 1996, reference no. X3.131-199x, Revision 10L, which publication is incorporated herein by reference in its entirety.
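As an illustration of how a starting logical block address and a transfer length might be encoded, the following sketch packs a ten-byte command descriptor block. The byte layout used here (operation code 0x34, an immediate flag in bit 1 of byte 1, a four-byte big-endian address, and a two-byte big-endian transfer length) is an assumption based on a common reading of the SCSI-2 standard and should be verified against the cited publication; the function name `build_prefetch_cdb` is hypothetical.

```python
import struct

PRE_FETCH_OPCODE = 0x34  # assumed operation code for the ten-byte PRE-FETCH


def build_prefetch_cdb(lba, transfer_length, immed=False):
    """Pack a ten-byte PRE-FETCH command descriptor block (assumed layout)."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("logical block address must fit in 32 bits")
    if not 0 <= transfer_length <= 0xFFFF:
        raise ValueError("transfer length must fit in 16 bits")
    flags = 0x02 if immed else 0x00  # assumed: immediate bit is bit 1 of byte 1
    # Big-endian: opcode, flags, 4-byte LBA, reserved byte, 2-byte length, control.
    return struct.pack(">BBIBHB", PRE_FETCH_OPCODE, flags, lba, 0, transfer_length, 0)
```

A call such as `build_prefetch_cdb(0x1000, 16)` would describe a prestage of sixteen contiguous logical blocks beginning at logical block address 0x1000.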
Yet another prestaging technique is referred to as AMP (Adaptive Multi-stream Prefetching) in a shared cache. This technique seeks to provide an algorithm which can adapt both the amount of data being prefetched in a prefetch operation, and the value of a trigger for triggering the prefetch operation, on a per stream basis in response to evolving workloads.
As set forth above, when a host reads tracks sequentially, it is highly beneficial to prestage tracks so that the tracks are already in cache before the host requests access to those tracks. However, a storage controller has a limited cache and limited bandwidth. If the prestage operation stages too few tracks, the host will experience more cache misses because fewer of the requested tracks will be in cache. Conversely, if too many tracks are prestaged, the prestaging can starve other kinds of I/O in the storage controller.
Tracks staged into cache may be demoted (dropped from cache) according to a Least Recently Used (LRU) algorithm to ensure that the staged data in cache does not exceed a predetermined threshold. The LRU algorithm discards the “least recently used” items first, which are generally the items that have been in the cache the longest without being used.
The data when first placed into cache may be designated (“time stamped”) MRU (“most recently used”) data. As the data remains unused in cache, the timestamp for that data is updated. The longer the data stays in cache unused, the closer the data moves toward an LRU timestamp. The LRU algorithm assumes that the oldest unused data is the data least likely to be used and therefore may be discarded from the cache to make room in the cache for additional data which is more likely to be accessed. In some LRU algorithms, LRU data may be redesignated with an MRU timestamp to give that data another chance to be used before being demoted, that is, discarded from cache, should the timestamp for that data again arrive at LRU status.
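The demotion behavior described above, including the variant in which LRU data is redesignated MRU once before being discarded, can be sketched as follows. This is an illustrative sketch only: it uses position in an ordered dictionary as a stand-in for timestamps, and the class name `LRUCache`, the `max_tracks` threshold, and the `second_chance` flag are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Track cache with LRU demotion and an optional one-time second chance.

    Hypothetical sketch: entries are ordered from LRU (front) to MRU (back),
    which stands in for the timestamps described in the text.
    """

    def __init__(self, max_tracks):
        self.max_tracks = max_tracks   # assumed predetermined threshold
        self._tracks = OrderedDict()   # track -> second-chance flag

    def stage(self, track, second_chance=False):
        """Place a track at the MRU end and return the list of demoted tracks."""
        self._tracks[track] = second_chance
        self._tracks.move_to_end(track)
        demoted = []
        while len(self._tracks) > self.max_tracks:
            victim, has_chance = self._tracks.popitem(last=False)  # LRU end
            if has_chance:
                # Redesignate as MRU once; the chance is then used up.
                self._tracks[victim] = False
            else:
                demoted.append(victim)
        return demoted

    def access(self, track):
        """Mark a cached track as most recently used; return True on a hit."""
        if track not in self._tracks:
            return False
        self._tracks.move_to_end(track)
        return True
```

In this sketch, a track staged with `second_chance=True` survives its first arrival at the LRU end and is demoted only if it reaches the LRU end a second time without being accessed.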