Most high-capacity, high-performance computer systems use a variety of different memory or data storage devices arranged in a hierarchy. For example, each processor of the computer system has dedicated registers to hold relatively small amounts of data which is frequently and rapidly accessed during processing. In addition, random access memory (RAM) is provided to hold greater amounts of information which can be accessed on a somewhat slower but nonetheless relatively rapid basis. Cache memory is used to hold even greater amounts of data which is accessed less frequently but which nonetheless must be rapidly accessed to avoid significant restrictions in the performance of the computer system. Main memory is employed to hold massive amounts of data, any particular part of which is typically accessed infrequently.
Access time for a memory refers to the amount of time for the processor to gain access to the memory in response to an input request to receive or read data from the memory, or to gain access to the memory in response to an output request to record or write data into the memory. In general, access time is that time which occurs after an input/output (I/O) request and before a read/write operation is accomplished. The amount of access time of a computer system is dependent upon the inherent speed characteristics of the memory device itself, and the ability of the system as a whole to accommodate the I/O request. To increase the amount of data processing, it is important to minimize the access time. Increased access times result in greater time periods of inactivity from the computer system, thereby decreasing its performance.
The hierarchy of memory devices is intended to reduce access times and improve computer system performance by minimizing the non-productive times when the processor is waiting to read or write data. Because the registers associated with the processors are written to and read from frequently and continually during processing, the registers are typically solid state devices which have very quick access times comparable to the clock or cycle times of the processor. The RAM, which is also solid state memory, provides greater data holding capacity and still obtains relatively quick access times. Cache memory typically has a much higher capacity than the RAM but has slower access times. The cache memory is typically implemented with larger amounts of slower solid state memory. The main memory may be one or more mass storage disk drives, tape reel devices, a library of tape cartridges, or other types of extremely high capacity mass storage devices.
In general, as the capacity of the memory increases the access time also increases. It is therefore important to attempt to move the data which is more likely to be needed for a particular processing operation up the hierarchy of memory, to make that data more rapidly available in less access time when it is needed for a processing operation. In general, higher performance computer systems use memory management control processors associated with cache and main memory to process I/O requests and transfer data from the main memory to the cache memory, so that the transferred data will be more quickly available for processing.
Because of the reduced access time of the cache memory, as compared to the main memory, the overall performance of the computer system is greatly enhanced if all I/O requests may be satisfied from cache memory. Each I/O request that is successfully satisfied from the cache memory is sometimes referred to as a "hit". When it is not possible to satisfy an I/O request through the cache memory, further processing by the host computer is stopped or "blocked". A blocked I/O request results in a system "disconnect," during which time the cache memory is disconnected from the processor. A system disconnect is required to read the requested information from the main memory and to write it to the cache memory. A system disconnect also occurs when previously recorded data in the cache memory is discarded to free space in the cache memory in order to accommodate an output request from the processor. A disconnect can account for hundreds of milliseconds of delay while the demand for data not presently contained in the cache memory, or the demand for free space not presently available in the cache memory, is resolved.
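The hit and disconnect behavior described above can be illustrated with a minimal sketch. The class name `ReadThroughCache`, its counters, and the use of a dictionary to stand in for main memory are assumptions made only for illustration; they do not come from any particular system.

```python
class ReadThroughCache:
    """Toy model of input requests served from cache or, failing that, main memory.

    All names here are hypothetical and chosen only for illustration.
    """

    def __init__(self, main_memory):
        self.main_memory = main_memory  # dict standing in for slow mass storage
        self.cache = {}                 # dict standing in for the cache memory
        self.hits = 0                   # input requests satisfied from the cache
        self.disconnects = 0            # input requests that had to go to main memory

    def read(self, key):
        if key in self.cache:
            # Hit: the request is satisfied directly from the cache.
            self.hits += 1
            return self.cache[key]
        # Miss: the request is blocked while the data is read from main
        # memory and written into the cache (the "disconnect").
        self.disconnects += 1
        value = self.main_memory[key]
        self.cache[key] = value
        return value
```

In this sketch, the first read of any item counts as a disconnect and later reads of the same item count as hits, which is why keeping likely-needed data in the cache matters.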
The typical approach employed to maximize the amount of cache memory space available to accommodate output requests, and to minimize the number of blocks and disconnects in response to input requests, is to establish fixed low and high thresholds which control the amount of free space available within the cache memory. The free space is that amount of space which is available in the cache memory in which to write data. A lower threshold of free space means less space is available in the cache memory and a higher proportion of the cache memory is used. For example, the low threshold of free space may translate into 98% of the available cache memory space being used, and the high threshold of free space may translate into 90% of the available cache memory space being used. When the amount of free space drops below the low threshold, cache space will be released from the cache memory, usually on a least recently used basis. Cache space will be released until the high threshold is reached, at which point no further cache space will be released. Releasing cache space means freeing tracks of a rotating disk cache memory to be written to or freeing memory locations in a solid state cache memory to be written to.
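The fixed-threshold scheme described above can be sketched as follows. This is an illustrative model only: the class `ThresholdCache`, its method names, and the choice to count space in whole entries are all assumptions, and a real cache controller would run the release step concurrently with I/O processing rather than on demand.

```python
from collections import OrderedDict


class ThresholdCache:
    """Toy cache with fixed low and high free-space thresholds (hypothetical names)."""

    def __init__(self, capacity, low_free, high_free):
        if not 0 <= low_free < high_free <= capacity:
            raise ValueError("need 0 <= low_free < high_free <= capacity")
        self.capacity = capacity    # total number of entries the cache can hold
        self.low_free = low_free    # release starts when free space drops below this
        self.high_free = high_free  # release stops once free space reaches this
        self.entries = OrderedDict()  # ordered from least to most recently used

    def free_space(self):
        return self.capacity - len(self.entries)

    def write(self, key, value):
        # An output request for a new key with no free space would block.
        if key not in self.entries and self.free_space() == 0:
            raise RuntimeError("out of free space: request would block")
        self.entries[key] = value
        self.entries.move_to_end(key)  # mark as most recently used

    def release_if_needed(self):
        """Release least recently used entries once free space is below the low threshold."""
        released = []
        if self.free_space() < self.low_free:
            while self.free_space() < self.high_free:
                key, _ = self.entries.popitem(last=False)  # least recently used
                released.append(key)
        return released
```

With a capacity of 100 entries, `low_free=2` corresponds to the 98% used figure and `high_free=10` to the 90% used figure in the example above. Nothing is released while at least 2 entries are free; once free space falls to 1, entries are released in least recently used order until 10 are free.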
While the use of high and low thresholds to control the release of cache memory space and thereby establish desired amounts of cache free space is effective, the high and low threshold approach is insensitive to workload demands. At best, the high and low thresholds represent a compromise between the size of the working set of data within the cache memory and the frequency of blocks due to an out-of-free space condition.
The out-of-free space condition which results in a blockage occurs when the workload on the cache memory has outpaced the ability of the cache memory controller to increase the amount of free space. The inability to increase the amount of free space may occur because of a relatively large number of I/O requests which must be satisfied on a priority basis compared to free space management processing within the cache memory itself. As a result, free space may not be released sufficiently fast to prevent output requests from totally consuming all of the cache memory space.
In order to anticipate the eventuality that frequent I/O requests will inhibit the release of sufficient cache space to prevent blocks in response to output requests, the low threshold of free space may be raised so that it corresponds to a smaller percentage of the overall cache memory being used, for example from 98% used to 95% used. By raising the low threshold, cache space will be released earlier to reduce the incidence of output request blocks.
Raising the low threshold creates a difficulty when the workload is not as demanding. The larger low threshold will under-utilize the cache memory, since a larger amount of free space is maintained in the cache memory. Freeing excess cache memory space increases the probability that data previously held in the released space will be requested by the host processor in an input request, but that data will be unavailable as a result of the space having been released from the cache memory. The inability of the host processor to access data which has been unnecessarily freed from the cache memory also results in blockages and the attendant delays from the ensuing system disconnect while the desired data is obtained from the main memory.
It is against this background of information that the improvements in managing the use of cache memory in a computer system according to the present invention have evolved.