The invention relates to a method of managing a read cache for one or more direct access storage devices while using a small amount of control storage, in a manner that is less likely to impede write-intensive workloads or workloads that lack locality of reference.
In a data processing system, instructions and associated data are transferred from storage devices to one or more processors for processing, and then resulting data generated by the processor is returned to storage devices. Thus, typical processing operations involve frequent and repetitive reading and writing from/to storage devices. As a result, storage access delays are often a primary limitation in the performance of a data processing system. Preferably, therefore, storage access speed should be maximized to maximize performance. However, often cost and other constraints require that the storage devices be comprised of relatively long access time circuitry, e.g., hard disk drives or other direct access storage devices (DASD's). To overcome the resulting performance drawbacks, caches are typically used.
A cache typically includes a relatively small, but relatively high speed, bank of memory, that can be more rapidly accessed by the processor(s) than the storage device that the cache services. Caches have been used to increase the performance of DASD's, and also to increase the performance of relatively low-speed solid state memory such as dynamic random access memory (DRAM).
Typically, a cache is associated with a cache directory, which stores an indication of those memory locations currently stored in the cache. Typically, a cache directory contains a number of entries, each entry identifying the address of data that is in the cache, and further identifying where the cache is currently storing that data. Thus, when a processor requests access to a particular address, the cache directory is accessed to determine whether data from that address is in the cache. If so, the requested data may be accessed in the cache, if appropriate. If the requested data is not in the cache, the requested data may be established in the cache, if appropriate.
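By way of illustration only, the directory lookup described above can be sketched as follows. This is a minimal sketch and not part of the disclosed embodiment; the class name `CacheDirectory`, its methods, and the use of integer addresses and slot numbers are assumptions made for the example.

```python
from typing import Dict, Optional


class CacheDirectory:
    """Hypothetical directory: maps a storage address to the cache slot
    that currently holds a copy of the data at that address."""

    def __init__(self) -> None:
        self._entries: Dict[int, int] = {}  # storage address -> cache slot

    def lookup(self, address: int) -> Optional[int]:
        """Return the cache slot for the address, or None on a miss."""
        return self._entries.get(address)

    def establish(self, address: int, slot: int) -> None:
        """Record that the data at the address is now held in the slot."""
        self._entries[address] = slot


directory = CacheDirectory()
directory.establish(address=0x1000, slot=3)
hit = directory.lookup(0x1000)   # address is in the cache
miss = directory.lookup(0x2000)  # address is not in the cache
```

On a miss, the caller would decide whether it is appropriate to establish the requested data in the cache, which is the decision the remainder of this description is concerned with.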
The storage space on a hard disk or other DASD is typically arranged in arbitrarily sized data blocks. Recently, some computing systems, such as the AS/400 system available from the assignee of this application, have begun to utilize DASD's having fixed-size storage blocks. In the typical system, however, the storage space on a mainframe DASD is arranged into tracks. The size of the tracks is a function of the particular DASD being used and is not standard. Data is stored in "records" on the track. The records are of arbitrary size, and a single track may include one or many records. As a consequence of the organization used in DASD's, data in a DASD cache is also typically stored in arbitrary and non-standard size blocks. In some cases, the DASD cache will store all records in a track on the DASD, in which case the size of the data stored by the cache is a function of the track size, and/or the size of the records on the track. In other cases, the DASD cache will store individual records, each replicating the data of a corresponding record on the DASD; in this case, because the size of the records is random, their size when stored in the cache is also random. In either case, there is variation in the size of the data stored by the cache, making it complex to manage the cache efficiently, and making it complex to determine whether and where particular data is stored in the cache.
Caches have also been used to enhance the speed of solid-state memory, e.g., dynamic random access memory (DRAM). DRAM is typically arranged into pages or other fixed-sized blocks, and caches used with DRAM are typically organized into constant-size "lines", which are relatively long sequences of sequential storage locations. When DRAM locations are duplicated into such a cache, typically the needed memory location, as well as a few neighboring memory locations, is brought into a line of the cache.
There are two general types of caches in use today, write caches and read caches. A write cache is primarily intended to temporarily store data being written by the processor to a storage device. The processor writes data into the write cache, and thereafter the data is transferred or destaged from the write cache to the appropriate storage device. By caching data being written to the storage device, the efficiency of the write operations can often be improved. A read cache duplicates memory locations in the storage device, for the primary purpose of increasing memory read speed. Specifically, when a particular storage location being accessed by the processor is duplicated in the read cache, the processor may rapidly access the read cache instead of waiting for access to the storage device. Although a read cache is primarily intended for storing data being read from the storage device, the data in the read cache must be updated when the processor overwrites that data in the storage device. The need to rewrite data in a read cache under these circumstances can substantially diminish the performance of the read cache.
Caches have been managed in accordance with a least-recently-used (LRU) replacement scheme; specifically, when data is to be added to the cache, the old data that was least recently used is replaced with the new data. While LRU is a popular replacement scheme, it is not necessarily the most efficient. Although not necessarily widely recognized by those skilled in the art, the inventors have determined that caches are most effective when managed such that data experiencing a high degree of locality of reference is maintained in the cache while data not experiencing locality of reference is not maintained in the cache. Furthermore, the inventors have determined that a read cache is most effective when data that is frequently overwritten is not stored in the cache. A read cache using an LRU replacement scheme will not necessarily meet these criteria where repeated local references are spaced apart in time. In fact, under some circumstances a read cache will provide little or no performance improvement, and cannot be cost justified.
Compounding these problems is the current lack of any effective approach to emulating the performance of a cache under real-life operating conditions. While there have in the past been software simulations of cache performance, such simulations have been performed by making assumptions as to the nature, frequency and kind of accesses that are made by the computer system, so that a model of the real-time behavior of the computer system and cache can be developed. If the assumptions as to the nature, frequency and kind of accesses are inaccurate, then the conclusions of the simulation are likely to be inaccurate.
As a result, at the present time the only way to make an accurate evaluation of the performance that can be achieved by a cache is to actually install the cache and monitor the resulting performance. This means that new cache hardware must be purchased, at substantial expense, before it is known whether that hardware will actually provide a sufficient performance improvement to justify the associated expense. Furthermore, the expense is not limited to hardware cost. In a typical system, cache hardware can only be changed by downing the entire computer system; thus, there can be a substantial opportunity cost to installing new cache hardware, particularly in mission-critical computer systems such as high-capacity servers that are at the core of a business' daily operations.
The invention addresses these and other difficulties through a low complexity approach to DASD cache management. Low complexity is the result of managing fixed-size bands of data from the DASD, e.g., of 256 kbytes, rather than variable-size records or tracks. An important consequence of the low complexity is that the memory consumed for cache management purposes is relatively low, e.g., only 2.5 Mbytes of control storage are needed to manage 8 Gbytes of cache memory.
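The example figures above can be checked with a short calculation. The following is a sketch only; it assumes binary units (1 kbyte = 1024 bytes, and so on), and the helper `band_of`, which maps a byte offset on the DASD to a band number by integer division, is a hypothetical illustration rather than a disclosed function.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

BAND_SIZE = 256 * KIB             # fixed-size band, per the example above
CACHE_SIZE = 8 * GIB              # cache memory being managed
CONTROL_STORAGE = int(2.5 * MIB)  # control storage quoted above


def band_of(byte_offset: int) -> int:
    """Hypothetical mapping from a DASD byte offset to its band number."""
    return byte_offset // BAND_SIZE


resident_bands = CACHE_SIZE // BAND_SIZE                # bands that fit in the cache
bytes_per_resident_band = CONTROL_STORAGE / resident_bands
overhead_ratio = CONTROL_STORAGE / CACHE_SIZE           # control storage per cache byte

print(resident_bands, bytes_per_resident_band, overhead_ratio)
```

Under these assumptions the cache holds 32,768 bands, the control storage amounts to 80 bytes per resident band, and the overhead is roughly 0.03 percent of the cache memory managed.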
The performance of the cache is further improved by collecting statistics for bands of data, as well as conventional LRU information, in order to improve upon the performance of a simple LRU replacement scheme.
To maintain low complexity, the statistics take the form of a single counter which is credited (increased) for each read to a band and penalized (reduced) for each write to a band. In the specific disclosed embodiment, the counter is limited to integer numbers between 0 and 100, and is credited by 6 for each read and penalized by 4 for each write. To improve efficiency, a band that has a statistics value of 40 or more is retained in the cache even if that band is the least recently used band; when a band is retained despite being the least recently used band, the band's statistics counter is reduced by 8, and the band is made the most recently used band.
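The counter arithmetic and the retention rule just described can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the names are hypothetical, a newly tracked band is assumed to start at 0, and both reads and writes are assumed to make a band the most recently used band.

```python
from collections import OrderedDict

READ_CREDIT = 6         # credit per read
WRITE_PENALTY = 4       # penalty per write
RETAIN_THRESHOLD = 40   # counter value at which an LRU band is retained
RETAIN_PENALTY = 8      # reduction applied when a band is retained
COUNTER_MIN, COUNTER_MAX = 0, 100


def _clamp(value):
    """Keep the counter within the range 0 to 100."""
    return max(COUNTER_MIN, min(COUNTER_MAX, value))


class BandStats:
    """Hypothetical per-band counters kept in LRU order (oldest first)."""

    def __init__(self):
        self.counters = OrderedDict()  # band -> statistics counter

    def _touch(self, band, delta):
        # Assumption: a new band starts at 0; any access refreshes recency.
        self.counters[band] = _clamp(self.counters.get(band, 0) + delta)
        self.counters.move_to_end(band)

    def record_read(self, band):
        self._touch(band, READ_CREDIT)

    def record_write(self, band):
        self._touch(band, -WRITE_PENALTY)

    def choose_victim(self):
        """Remove and return the band to replace, or None if empty."""
        while self.counters:
            band, count = next(iter(self.counters.items()))  # current LRU band
            if count < RETAIN_THRESHOLD:
                del self.counters[band]
                return band
            # Retained despite being LRU: penalize and make most recently used.
            self.counters[band] = _clamp(count - RETAIN_PENALTY)
            self.counters.move_to_end(band)
        return None


stats = BandStats()
for _ in range(7):
    stats.record_read("A")   # counter for A reaches 42
stats.record_read("B")       # counter for B is 6
stats.record_write("B")      # counter for B drops to 2
victim = stats.choose_victim()
```

In this usage, band A is the least recently used band but holds a counter of 42, so it is retained with its counter reduced to 34, and band B is selected for replacement instead. The search terminates because each retention lowers the retained band's counter by 8.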
To further enhance performance, statistics and LRU information are also collected for bands of data that are not currently resident in the cache. By collecting statistics and LRU information for at least half as many nonresident bands as resident bands, there is a substantial improvement in decisions as to whether and when to bring bands of data into the cache. Specifically, a band must achieve a certain threshold of statistics before it will be made resident in the cache. In the particular disclosed embodiment, this threshold is a statistics counter having a value of 20 or more. In this embodiment, statistics and LRU information are collected for an equal number of resident and nonresident bands of data.
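The residency threshold can be illustrated with the following sketch. It is a simplification under stated assumptions: the names are hypothetical, the counter values (credit of 6 per read, penalty of 4 per write, range 0 to 100) are those of the disclosed embodiment, the LRU ordering and the limit on the number of tracked nonresident bands are omitted, and the threshold is assumed to be tested on each access.

```python
READ_CREDIT = 6
WRITE_PENALTY = 4
RESIDENT_THRESHOLD = 20   # counter value required before a band is made resident


class BandTracker:
    """Hypothetical tracker holding counters for resident and nonresident bands."""

    def __init__(self):
        self.counter = {}      # band -> statistics counter
        self.resident = set()  # bands currently held in the cache

    def on_access(self, band, is_read):
        """Update the band's counter; return True if the band is promoted."""
        delta = READ_CREDIT if is_read else -WRITE_PENALTY
        value = max(0, min(100, self.counter.get(band, 0) + delta))
        self.counter[band] = value
        if band not in self.resident and value >= RESIDENT_THRESHOLD:
            self.resident.add(band)
            return True
        return False


tracker = BandTracker()
promotions = [tracker.on_access(7, is_read=True) for _ in range(4)]
```

With these values, three reads bring the counter of band 7 to 18, which is below the threshold, and the fourth read brings it to 24, at which point the band is made resident. Interleaved writes would delay or prevent promotion, which keeps frequently overwritten bands out of the read cache.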
This cache management approach is further configured to, if desired, collect control information (e.g., statistics and LRU information) regarding potentially cacheable DASD data, even where there is no cache memory installed. When in this mode, the control information permits a real time emulation of performance enhancements that would be achieved were cache memory added to the computer system. This emulation has the substantial advantage that it is performed in real time and in response to the actual storage accesses produced by the computer system in practical use, rather than software simulations of the behavior of the computer system, which would usually be less accurate. Due to its low complexity and low control memory usage, the control storage overhead involved in such an emulation is acceptable.
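A sketch of how such an emulation might tally would-be cache hits is given below. It is illustrative only: the class and attribute names are hypothetical, the thresholds are those of the disclosed embodiment, and, as simplifying assumptions, the emulated cache is treated as unbounded and replacement of bands is not modeled.

```python
class EmulatedCache:
    """Hypothetical emulator: keeps control information only, no cache memory."""

    def __init__(self, resident_threshold=20, read_credit=6, write_penalty=4):
        self.resident_threshold = resident_threshold
        self.read_credit = read_credit
        self.write_penalty = write_penalty
        self.counter = {}                # band -> statistics counter
        self.would_be_resident = set()   # bands a real cache would hold
        self.reads = 0
        self.emulated_hits = 0

    def on_read(self, band):
        self.reads += 1
        if band in self.would_be_resident:
            self.emulated_hits += 1      # a real cache would have served this read
        value = min(100, self.counter.get(band, 0) + self.read_credit)
        self.counter[band] = value
        if value >= self.resident_threshold:
            self.would_be_resident.add(band)

    def on_write(self, band):
        self.counter[band] = max(0, self.counter.get(band, 0) - self.write_penalty)

    def hit_ratio(self):
        return self.emulated_hits / self.reads if self.reads else 0.0


emu = EmulatedCache()
for band in [1, 1, 1, 1, 1, 1, 2]:   # actual read stream observed by the system
    emu.on_read(band)
```

In this usage, band 1 would become resident after its fourth read, so the fifth and sixth reads count as emulated hits, giving 2 hits in 7 reads. Because the tallies are driven by the actual access stream, no assumptions about the workload are required.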
Finally, this cache management approach includes features permitting dynamic reconfiguration of the cache size, so that cache memory may be added and removed in real time without requiring computer system downtime. This feature thus avoids the opportunity cost that was previously inherent in upgrading or changing the cache hardware of a computer system.
These and other advantages and features, which characterize the invention, are set forth in the claims annexed hereto and forming a further part hereof. However, for a better understanding of the invention, and the advantages and objectives attained by its use, reference should be made to the Drawing, and to the accompanying descriptive matter, in which there are described embodiments of the invention.