1. The Field of the Invention
The present invention relates to cache management in data storage systems. More specifically, the present invention relates to prefetch scheduling in a preexisting LRU cache of a data storage system.
2. The Relevant Art
Cache memory is used in data storage systems to buffer frequently accessed data so that the data can be accessed at a relatively high rate. The cache memory is a relatively small, high-speed memory operating with a host processor, or disposed between the host processor and relatively slower memory devices. Typical data storage systems that use caching include a cache directory, or index, of the data elements in a main memory of the hosts operating on the data storage system. The cache directory is referenced to provide an indication of whether or not each data element of the main memory resides in the cache memory at any given time and, if so, to indicate the present location of the data element in the cache memory. When a host processor requests an Input/Output (I/O) operation, the cache directory is first consulted to determine whether the requested data element is present in the cache memory and, if so, to determine its location. When the data element is present in the cache memory, the data element can be accessed quickly, rather than having to be requested from a slower storage device.
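The lookup flow just described may be sketched as follows. This is a minimal illustration only, not a description of any particular system; the names `CacheDirectory` and `read_element`, and the use of a simple mapping as the directory, are assumptions for the sketch:

```python
class CacheDirectory:
    """Minimal cache-directory sketch: maps a data-element address in
    main storage to its present location (slot) in cache memory."""

    def __init__(self):
        self.entries = {}  # address -> cache slot

    def lookup(self, address):
        """Return the cache slot holding the element, or None on a miss."""
        return self.entries.get(address)


def read_element(directory, cache, storage, address):
    """Consult the directory first; fall back to slower storage on a miss."""
    slot = directory.lookup(address)
    if slot is not None:
        return cache[slot]          # cache hit: fast path
    return storage.read(address)    # cache miss: slow path
```

In this sketch a hit avoids the storage device entirely, which is the performance benefit the passage describes.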
Generally, in such systems, every time a data element is requested, a determination is made as to whether the accessed data is likely to be accessed again in the near future. If so, the accessed data element is copied, or "staged," into the cache memory. In some data storage systems, requested data elements are always staged into the cache memory if they are absent from the cache memory. Some data storage systems are also responsive to explicit "prefetch" commands from the host computer, causing specified data to be staged into the cache even though the specified data is not immediately accessed by the host computer.
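An explicit prefetch of the kind described above can be sketched as follows. This is a hypothetical illustration; the `prefetch` function and the `get`/`put`/`read` interfaces are assumptions for the sketch, and no particular staging policy is implied:

```python
def prefetch(cache, storage, addresses):
    """Explicit prefetch: stage the named data elements into the cache
    before the host computer requests them. Elements already present
    in the cache are left alone."""
    for address in addresses:
        if cache.get(address) is None:          # absent from cache
            cache.put(address, storage.read(address))  # stage it
```

The point of the command is that the staging happens ahead of any host access, so a later read finds the data already in cache.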
Because the cache memory has a capacity that is smaller than the main memory, it is frequently necessary for data elements in the cache memory to be replaced or removed from the cache memory in order to provide space in the cache memory for more recently requested data elements. In general, for the cache memory to be useful, the data elements removed or replaced from the cache memory must be calculated to be less likely to be accessed in the near future than the new data elements being staged into the cache memory at the time the removal or replacement occurs.
Data storage systems that use disk drives for the main memory typically use random access memory (RAM) for the cache memory. In such a data storage system, the data elements in the cache memory are often logical tracks of data on the disks, although in many systems the data elements are blocks or records of data. The cache directory includes a directory entry for at least each data element stored in the cache memory. Each such directory entry generally includes a pointer to the location of the data element in the cache memory. The cache directory can be a table including an entry for each data element stored in the disk storage. Alternatively, the directory may include a hash table for accessing lists of the directory entries, so that the cache directory need not include any cache directory entries for data elements that are absent from the cache memory. In either case, any one of a plurality of data elements in the cache memory may be replaced or removed from the cache, according to the particular cache management scheme being used, to make room for another data element.
The performance of such a data storage system is highly dependent on the cache management scheme used for selecting the data element to be removed or replaced. The cache management scheme is implemented by a cache management system, or “cache manager,” in the data storage system.
In one common cache management scheme, a cache manager is programmed to remove or replace the “least-recently-used” (LRU) data element in the cache memory. The least-recently-used data element is usually the data element accessed least recently by the host computer. The cache manager maintains an ordered list, or queue, of the data elements in the cache memory so that the cache manager can readily identify the least-recently-used data element. The queue is typically maintained in a doubly-linked list. When a data element is accessed, the data element is moved to the head of the queue, unless the data element is already at the head of the queue. This process is known as making the data element “young” in the cache and ensures that, when the queue is not empty, the least-recently-used data element in the cache memory will be located at the end of the queue and the most-recently-used element in the cache memory will be located at the head of the queue.
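The queue maintenance just described can be sketched as follows. This is an illustrative implementation of the general LRU scheme only, not the mechanism of any particular system: a doubly-linked list holds the queue, a hash table serves as the directory, an access moves the entry to the head (making it "young"), and eviction removes the entry at the end of the queue:

```python
class Node:
    """One queue entry in the doubly-linked list."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    """Most-recently-used element at the head of the queue,
    least-recently-used element at the tail (evicted first)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.index = {}                  # key -> Node (the "directory")
        self.head = self.tail = None

    def _unlink(self, node):
        if node.prev: node.prev.next = node.next
        else: self.head = node.next
        if node.next: node.next.prev = node.prev
        else: self.tail = node.prev

    def _push_head(self, node):
        node.prev, node.next = None, self.head
        if self.head: self.head.prev = node
        self.head = node
        if self.tail is None: self.tail = node

    def get(self, key):
        node = self.index.get(key)
        if node is None:
            return None
        # Accessing the element makes it "young": move it to the head.
        self._unlink(node)
        self._push_head(node)
        return node.value

    def put(self, key, value):
        node = self.index.get(key)
        if node:
            node.value = value
            self._unlink(node)
        else:
            if len(self.index) >= self.capacity:
                # Evict the least-recently-used element at the tail.
                lru = self.tail
                self._unlink(lru)
                del self.index[lru.key]
            node = Node(key, value)
            self.index[key] = node
        self._push_head(node)
```

Because both ends of the list are reachable in constant time, the move-to-head on each access and the eviction at the tail are both O(1), which is why the doubly-linked list is the typical choice for the queue.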
Data that is fetched into the cache memory may be classified into two broad categories. The first is random access data, which denotes data that is needed for specific operations but is not connected with other data in any manner. Many caching systems are configured for optimal performance when fetching random access data. The second type of data is known as sequential data, denoting that several elements of the data are used by a processor in a specific sequence, typically the sequence in which the data elements are stored on a storage device. Many systems that employ a dedicated or "native" LRU cache are designed only to store data accessed in random access operations and make no provision for accessing sequential data.
Attempts have been made to improve the performance of a native LRU cache when fetching sequential data. These solutions, however, require modifying the LRU cache itself in some fashion to achieve satisfactory performance for sequential data prefetches. One example of such a modification is the creation of a "microcache" within the existing LRU cache to hold the sequential data.
Modifying existing LRU caches is not always a practical solution. For instance, in legacy systems it may not be possible or desirable to modify the replacement algorithm of the cache. The cache logic or controller may be inaccessible or hardwired, or the system on which the cache resides may have been provided for a specific purpose that modification of the cache would disrupt.
Accordingly, a need exists in the art for a method of scheduling prefetches of sequential data into a native LRU cache without directly modifying the algorithm or structure of the LRU cache.