Modern data processing systems typically comprise a host computer, consisting of an arithmetic and logic unit and a main memory unit for containment of data and instructions presently being processed, and mass storage means for storage of data and processing instructions at other times. The mass storage means is typically connected to the host computer by means of a channel. When the host desires a particular data set or record, it issues a command over the channel to the mass storage means, which then reads the data, from whatever medium it is stored upon, e.g., magnetic disk or tape memory media, over the channel into the main memory of the host. The substantial length of time required to retrieve data from long term storage limits the throughput or usage of the host computer. To minimize this loss of use of the host computer, the host will typically issue a series of requests for data and then perform other tasks while the data is being retrieved from long term disk or tape media. However, even when this “queuing” is performed there is substantial host computer computation time lost due to the time required for accessing data.
Many computer systems use a variety of different memory or data storage devices arranged in a hierarchy. For example, each processor of the computer system has dedicated registers to hold relatively small amounts of data which are frequently and rapidly accessed during processing. In addition, random access memory (RAM) is provided to hold greater amounts of information which can be accessed on a somewhat slower but nonetheless relatively rapid basis. Cache memory is used to hold even greater amounts of data which are accessed less frequently but which nonetheless must be rapidly accessed to avoid significant restrictions in the performance of the computer system. Main memory is employed to hold massive amounts of data, any particular part of which is typically accessed infrequently.
Access time for a memory refers to the amount of time for the processor to gain access to the memory in response to an input request to receive or read data from the memory, or to gain access to the memory in response to an output request to record or write data into the memory. In general, access time is the time that elapses after an input/output (I/O) request is issued and before the read/write operation is accomplished. The access time of a computer system depends upon the inherent speed characteristics of the memory device itself and the ability of the system as a whole to accommodate the I/O request. To increase the amount of data processed, it is important to minimize the access time. Increased access times result in greater periods of inactivity for the computer system, thereby decreasing its performance.
The hierarchy of memory devices is intended to reduce access times and improve computer system performance by minimizing the non-productive times when the processor is waiting to read or write data. Because the registers associated with the processors are written to and read from frequently and continually during processing, the registers are typically solid state devices which have very quick access times comparable to the clock or cycle times of the processor. The RAM, which is also solid state memory, provides greater data holding capacity and still obtains relatively quick access times. Cache memory typically has a much higher capacity than the RAM but has slower access times. The cache memory is typically implemented using larger amounts of slower solid state memory. The main memory may be one or more mass storage disk drives, tape reel devices, a library of tape cartridges and other types of extremely high capacity mass storage devices.
In general, as the capacity of the memory increases the access time also increases. It is therefore important to attempt to move the data which is more likely to be needed for a particular processing operation up the hierarchy of memory, to make that data more rapidly available in less access time when it is needed for a processing operation. In general, higher performance computer systems use memory management control processors associated with cache and main memory to process I/O requests and transfer data from the main memory to the cache memory, so that the transferred data will be more quickly available for processing.
Because of the reduced access time of the cache memory, as compared to the main memory, the overall performance of the computer system is greatly enhanced if all I/O requests may be satisfied from cache memory. Each successful satisfaction of an I/O request from cache memory is sometimes referred to as a “hit”. When it is not possible to satisfy an I/O request through the cache memory, further processing by the host computer is stopped or “blocked”. A blocked I/O request results in a system “disconnect,” during which time the cache memory is disconnected from the processor. A system disconnect is required to read the requested information from the main memory and to write it to the cache memory. A system disconnect also occurs when previously recorded data in the cache memory is eliminated or discarded by freeing space from the cache memory in order to accommodate an output request from the processor. A disconnect can account for hundreds of milliseconds of time delays while the demand for data not presently contained in the cache memory or the demand for free space not presently contained in the cache memory is resolved.
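The effect of hits and disconnects on overall performance can be illustrated with a simple weighted average. The following is a minimal sketch only; the function name, the parameter names, and the assumption that every miss costs a single fixed disconnect time are illustrative and are not taken from any particular system.

```python
def effective_access_time(hit_ratio: float, cache_time_ms: float,
                          miss_time_ms: float) -> float:
    """Return the expected time per I/O request, in milliseconds.

    Assumption (illustrative): each request is either a hit, served in
    cache_time_ms, or a miss, which costs miss_time_ms in total
    (the disconnect plus the transfer from main memory).
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Weighted average of the hit cost and the miss cost.
    return hit_ratio * cache_time_ms + (1.0 - hit_ratio) * miss_time_ms
```

Under these assumptions, a 90 percent hit ratio with a 1 ms hit and a 200 ms disconnect gives an expected time of about 20.9 ms per request, which suggests how heavily even a small fraction of misses can dominate the average.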
Data caching as part of mass storage devices is a well known technique for eliminating delays in memory access due to mechanical limitations of a storage device. For example, in the case of a disk drive, plural disks rotate at a fixed speed past read/write heads which may either be stationary with respect to the disk or move radially back and forth with respect to the disk in order to juxtapose the heads to various portions of the disk surfaces. In either case, there is a finite average time (access time) required for a particular data record to be located and read from the disk. This “access” time includes the time for a head to move to the correct cylinder (seek time) and the time required (or latency) for the disk to rotate with respect to the head until the beginning of the particular record sought is juxtaposed to the head for reading and writing.
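The two components of disk access time described above can be combined in a short calculation. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and the average rotational latency is taken to be half of one revolution, which holds only if the sought record is equally likely to be at any angular position.

```python
def average_access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Estimate average disk access time as seek time plus latency.

    Assumption (illustrative): on average the disk must turn half a
    revolution before the start of the record reaches the head.
    """
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    rotation_ms = 60_000.0 / rpm        # time for one full revolution
    avg_latency_ms = rotation_ms / 2.0  # half a revolution on average
    return avg_seek_ms + avg_latency_ms
```

For example, a hypothetical drive with an 8 ms average seek time rotating at 6000 revolutions per minute would have a 10 ms revolution, a 5 ms average latency, and therefore a 13 ms average access time.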
Cache data storage eliminates these inherent delays by storing records in frequently accessed tracks in a high speed system memory (e.g., solid-state RAM). The idea is simply to allow as many memory accesses as possible to immediately retrieve data from the high speed system memory rather than wait for the data to be transferred (or staged) from the slower disk storage device to the high speed system memory. To accomplish this task, data may be staged into the high speed system memory before data access is required (i.e., pre-staged).
Clearly, the effectiveness of the cache data storage system is limited by the system's ability to anticipate the needs of future memory accesses and transfer those data records from disk storage to the high speed system memory prior to the memory access. If a sequence of memory accesses is random in nature, the cache data storage system cannot anticipate future memory accesses. Accordingly, one method of anticipating future memory accesses is to identify sequential or near sequential memory accesses. Once a sequential or near sequential access is identified, future records/tracks in the sequence can be immediately pre-staged into the high speed system memory in advance of future memory accesses.
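One simple way to identify a sequential access pattern and select tracks for pre-staging is sketched below. The class name, the run threshold, and the pre-stage depth are assumptions chosen for illustration; the sketch detects only strictly sequential track numbers and does not attempt the near sequential case.

```python
class SequentialPrestager:
    """Detect runs of consecutive track accesses (illustrative sketch)."""

    def __init__(self, run_threshold: int = 3, prestage_depth: int = 2):
        self.run_threshold = run_threshold    # accesses needed to call it sequential
        self.prestage_depth = prestage_depth  # how many tracks ahead to stage
        self.last_track = None
        self.run_length = 0

    def on_access(self, track: int) -> list:
        """Record an access and return the tracks to pre-stage, if any."""
        if self.last_track is not None and track == self.last_track + 1:
            self.run_length += 1  # the run of consecutive tracks continues
        else:
            self.run_length = 1   # a random access starts a new run
        self.last_track = track
        if self.run_length >= self.run_threshold:
            # Sequential pattern identified: stage the next tracks in advance.
            return [track + i for i in range(1, self.prestage_depth + 1)]
        return []
```

With the default values, accesses to tracks 10, 11 and 12 would cause tracks 13 and 14 to be selected for pre-staging, whereas a subsequent access to an unrelated track would reset the run and select nothing.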
Since the memory subsystem utilized for cache buffers has a smaller capacity than the total capacity of the mass storage system, the memory subsystem is managed by a local CPU which attempts to keep the most recently accessed data in the cache buffers. When the cache buffers become filled, older data in the cache buffers must be discarded to make room for newer, more recently accessed data to be stored in the cache buffers. To make room for new data in the cache buffers, the local CPU of the memory subsystem of prior designs locates the least recently referenced (typically referred to as least recently used or LRU) cache buffer and discards it. New, more recently referenced data is then placed in the vacated cache buffer.
Prior methods used to locate the LRU cache buffer maintain various linked list data structures, one data structure per cache buffer in the memory subsystem. As each cache buffer is referenced by a request from a host computer system, that data structure is unlinked from the linked list in its current position and relinked to the top of the linked list. Over time, these methods migrate the more recently used cache buffers toward the top of the list and the least recently used cache buffers toward the bottom of the list. Some prior methods have maintained a doubly linked list to reduce the processing time required for moving a data structure from its current position in the list to the top of the linked list. All of these methods for trying to predict which data located on mass storage devices will be requested by a host computer fail to effectively predict the location of this data under different circumstances. While each method is effective under some circumstances, all methods are likely to fail under some other set of data processing environments.
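The doubly linked list technique described above can be sketched as follows. This is an illustrative sketch rather than the implementation of any particular prior system; the class and method names, the dictionary used to find a buffer's list entry, and the sentinel nodes are all assumptions introduced for the example.

```python
class _Node:
    """One list entry per cache buffer (hypothetical structure)."""
    __slots__ = ("key", "data", "prev", "next")

    def __init__(self, key=None, data=None):
        self.key, self.data = key, data
        self.prev = self.next = None


class LRUCacheBuffers:
    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.index = {}      # key -> node, to locate a buffer's list entry
        self.head = _Node()  # sentinel; head.next is the most recently used
        self.tail = _Node()  # sentinel; tail.prev is the least recently used
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        # With both links available, removal needs no list traversal.
        node.prev.next, node.next.prev = node.next, node.prev

    def _link_at_top(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def reference(self, key, data=None):
        """Reference a buffer; return the key discarded to make room, or None."""
        node = self.index.get(key)
        if node is not None:
            if data is not None:
                node.data = data
            self._unlink(node)        # unlink from the current position
            self._link_at_top(node)   # relink at the top of the list
            return None
        discarded = None
        if len(self.index) >= self.capacity:
            lru = self.tail.prev      # bottom of the list: least recently used
            self._unlink(lru)
            del self.index[lru.key]
            discarded = lru.key
        node = _Node(key, data)
        self.index[key] = node
        self._link_at_top(node)
        return discarded

    def order(self):
        """Return keys from most to least recently used."""
        keys, node = [], self.head.next
        while node is not self.tail:
            keys.append(node.key)
            node = node.next
        return keys
```

In this sketch, each reference moves the corresponding entry to the top of the list in constant time, so that the entry at the bottom is always the least recently used buffer and is the one discarded when the cache buffers are full.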
It is against this background of information that the improvements in managing the use of cache memory in a computer system according to the present invention have evolved.