Caching is a fundamental technique in hiding delays in writing data to and reading data from storage, such as hard disk drive storage. These delays can be referred to as input/output (I/O) latency. Because caching is effective in hiding I/O latency, it is widely used in storage controllers, databases, file systems, and operating systems.
A cache thus may be defined as a high speed memory or storage device that is used to reduce the effective time required to read data from or write data to a lower speed memory or device. A modern storage controller cache typically contains volatile memory used as a read cache and a non-volatile memory used as a write cache. The effectiveness of a read cache depends upon its “hit” ratio, that is, the fraction of requests that are served from the cache without necessitating a disk trip (which represents a “miss” in finding data in the cache). The present invention is focused on improving the performance of a read cache, i.e., increasing the hit ratio or equivalently minimizing the miss ratio.
Typically, cache is managed in uniformly sized units called pages. So-called demand paging requires a page to be copied into cache from the slower memory (e.g., a disk) only in the event of a cache miss of the page, i.e., only if the page was required by the host and could not be found in cache, necessitating a relatively slower disk access. In demand paging, cache management is relatively simple, and seeks to intelligently select a page from cache for replacement when the cache is full and a new page is to be stored in cache owing to a “miss”. One well-known policy replaces the page whose next access is farthest in the future with the new page; this policy minimizes misses but requires knowledge of future accesses. Another policy (least recently used, or LRU) replaces the least recently used page with the new page.
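By way of illustration only, demand paging with LRU replacement and the resulting hit ratio can be sketched as follows. The function name `lru_hit_ratio` and the representation of pages as integers are assumptions made for this sketch and are not taken from the invention.

```python
from collections import OrderedDict


def lru_hit_ratio(requests, capacity):
    """Simulate a demand-paged LRU cache and return its hit ratio.

    `requests` is a sequence of page numbers; `capacity` is the number
    of pages the cache can hold. Both names are illustrative.
    """
    cache = OrderedDict()  # first key = LRU end, last key = MRU end
    hits = 0
    for page in requests:
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # a hit makes the page most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True             # demand-fetch the missed page
    return hits / len(requests) if requests else 0.0


if __name__ == "__main__":
    print(lru_hit_ratio([1, 2, 3, 1, 2, 4, 1], capacity=3))
```

In this sketch a page enters the cache only on a miss, which is what distinguishes demand paging from the prefetching discussed next.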
As recognized herein, in addition to demand paging, further improvement can be made in hiding I/O latency by speculatively prefetching or prestaging pages. Relatively complex algorithms have been introduced which attempt to predict when a page will be needed, but commercial systems have rarely used very sophisticated prediction schemes, because sophisticated prediction schemes require an extensive history to be kept of page accesses. This is cumbersome and expensive. Furthermore, to be effective a prefetch must complete before the predicted request, requiring sufficient prior notice that may not be feasible to attain. Also, long-term predictive accuracy may be low to begin with and can become worse with interleaving of a large number of different workloads. Finally, for a disk subsystem operating near its peak capacity, average response time increases drastically with the increasing number of disk fetches, and, hence, low accuracy predictive prefetching which results in an increased number of disk fetches can in fact worsen the performance.
Accordingly, the present invention understands that a simpler approach to speculative prefetching can be employed that uses the principle of sequentiality, which is a characteristic of demanded data (data to be read) in which consecutively numbered pages in ascending order without gaps are often required. Sequential file access arises in many contexts, including video-on-demand, database scans, copy, backup, and recovery. In contrast to sophisticated forecasting methods, as understood herein detecting sequentiality is easy, requiring very little history information, and can attain nearly 100% predictive accuracy.
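The small amount of history needed to detect sequentiality can be illustrated with a minimal sketch. The class name `SequentialDetector`, the `threshold` parameter, and the choice to track run lengths in a dictionary are assumptions made for illustration; a practical detector would also bound the size of this history.

```python
class SequentialDetector:
    """Flag an access as sequential when it extends a run of consecutive pages.

    Only one entry per active run is kept: the last page of the run,
    mapped to the run's length so far.
    """

    def __init__(self, threshold=2):
        self.threshold = threshold  # run length at which a stream counts as sequential
        self.run_length = {}        # last page of a run -> length of that run

    def access(self, page):
        # If page - 1 ended a run, this access extends it; otherwise start a new run.
        length = self.run_length.pop(page - 1, 0) + 1
        self.run_length[page] = length
        return length >= self.threshold


if __name__ == "__main__":
    detector = SequentialDetector()
    for p in [10, 100, 11, 101, 57]:
        print(p, detector.access(p))
```

Because each run is keyed by its last page, two interleaved sequential streams (pages 10, 11 and pages 100, 101 above) are tracked independently.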
However, while seemingly simple, a good sequential prefetching algorithm and associated cache replacement policy, as critically recognized herein, is surprisingly difficult to achieve. To understand why, it must first be understood that in sequential prefetching, synchronous prefetching (bringing into cache the pages that sequentially follow a missed page) may be used initially, and after this bootstrapping stage, asynchronous prefetching (bringing into cache pages that are sequential to a demanded “trigger” page that was “hit”, i.e., found in cache) is used. Prefetching and caching thus are intertwined. One policy for cache management when prefetch is used is the above-mentioned LRU, in which two lists, one listing sequential pages and one listing random access pages, are maintained according to recency of access. In the context of sequential prefetching, when tracks are prefetched or accessed, they are placed at the most recently used (MRU) end of the sequential list, while for cache replacement, tracks are evicted from the LRU end of the list.
With the above background in mind, the present invention critically observes that when synchronous and asynchronous prefetching strategies are used along with LRU-based caching, and an asynchronous trigger track is accessed, an asynchronous prefetch of the next group of tracks occurs. In an LRU-based cache, this newly fetched group of tracks, along with the asynchronous trigger track, is placed at the MRU end of the list, while the unaccessed tracks within the current prefetch group remain where they were in the LRU list and hence potentially near the LRU end of the list. These unaccessed tracks within the current prefetch group will be accessed before the tracks in the newly prefetched group, so that, depending upon the amount of cache space available for sequential data, some of these unaccessed tracks may be evicted from the cache before they are accessed, resulting in a sequential miss. Furthermore, the present invention understands that this may happen repeatedly, thus defeating the purpose of employing asynchronous prefetching.
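The effect described above can be reproduced with a small simulation of a single sequential LRU list. This is a simplified sketch under stated assumptions: the function name `async_prefetch_evictions`, its parameters, and the single-stream setup starting at track 0 are hypothetical and chosen only to make the eviction visible.

```python
from collections import OrderedDict


def async_prefetch_evictions(capacity, group_size, trigger_offset):
    """Return the unaccessed tracks of the current prefetch group that are
    evicted when the asynchronous prefetch of the next group is placed at
    the MRU end of an LRU list holding `capacity` tracks.
    """
    lru = OrderedDict()  # first key = LRU end, last key = MRU end
    evicted = []

    def place_at_mru(track):
        if track in lru:
            lru.move_to_end(track)
            return
        if len(lru) >= capacity:
            victim, _ = lru.popitem(last=False)  # evict from the LRU end
            evicted.append(victim)
        lru[track] = True

    # Miss on track 0: synchronous prefetch of the first group.
    for track in range(group_size):
        place_at_mru(track)
    # The host reads the group in order, up to and including the trigger track.
    for track in range(trigger_offset + 1):
        place_at_mru(track)
    # Trigger hit: asynchronous prefetch of the next group.
    for track in range(group_size, 2 * group_size):
        place_at_mru(track)

    unaccessed = set(range(trigger_offset + 1, group_size))
    return [t for t in evicted if t in unaccessed]


if __name__ == "__main__":
    print(async_prefetch_evictions(capacity=6, group_size=4, trigger_offset=1))
```

With a capacity of 6 tracks, a group size of 4, and the trigger on the second track of the group, tracks 2 and 3 are evicted before the host reads them, while the already accessed tracks 0 and 1 remain cached. With a capacity of 8 tracks the same sequence evicts nothing.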
Hence, when LRU-based caching is used along with the above prefetching strategy, the resulting algorithm can violate the so-called stack property (under which the contents of a smaller cache are always a subset of the contents of a larger cache managed by the same policy), and as a result, when the amount of cache space given to sequentially prefetched data increases, sequential misses do not necessarily decrease. As understood herein, the “stack” property can be a crucial ingredient in proper cache management. As further understood herein, at the cost of increasing sequential misses, both of the above problems can be hidden if (i) only synchronous prefetching is used, or (ii) both synchronous and asynchronous prefetching are used, with the asynchronous trigger set to always be the last track in a prefetched group. The first approach, of course, amounts to forgoing all potential benefits of asynchronous prefetching, while the second approach can result in a sequential miss if the track being prefetched is accessed before it is in the cache. In view of the above problems, one purpose of the present invention is to avoid violation of the stack property without incurring additional sequential misses. More generally, the present invention represents a significant improvement in cache management when sequential prefetch is used.
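For reference, the stack property can be checked empirically for plain demand-paged LRU, which is known to satisfy it. The sketch below is an illustration only and does not implement the invention or the prefetching variant discussed above; the function names `lru_contents` and `has_inclusion_property` are assumptions.

```python
from collections import OrderedDict


def lru_contents(requests, capacity):
    """Return the set of pages held by a demand-paged LRU cache after `requests`."""
    cache = OrderedDict()  # first key = LRU end, last key = MRU end
    for page in requests:
        if page in cache:
            cache.move_to_end(page)
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)
            cache[page] = True
    return set(cache)


def has_inclusion_property(requests, max_capacity):
    """Check, after every prefix of `requests`, that a cache of n pages
    holds a subset of what a cache of n + 1 pages holds.
    """
    for end in range(1, len(requests) + 1):
        prefix = requests[:end]
        for n in range(1, max_capacity):
            if not lru_contents(prefix, n) <= lru_contents(prefix, n + 1):
                return False
    return True


if __name__ == "__main__":
    print(has_inclusion_property([1, 2, 3, 1, 4, 2, 5, 1, 2, 3], max_capacity=4))
```

When this inclusion holds, a hit in the smaller cache is also a hit in the larger one, so misses cannot increase as cache space grows. That guarantee is what is lost when prefetched tracks are inserted at the MRU end as described above.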