1. Field of the Invention
The present invention generally relates to cache operations within computer systems, and more particularly to an adaptive cache replacement technique with enhanced temporal filtering in a demand paging environment.
2. Description of the Related Art
Caching is a fundamental problem in computer science. Modern computational infrastructure designs are rich in examples of memory hierarchies where a fast, but expensive main (“cache”) memory is placed in front of an inexpensive, but slow auxiliary memory. Caching algorithms manage the contents of the cache so as to improve the overall performance. In particular, cache algorithms are of tremendous interest in databases, virtual memory management, storage systems, and the like, where the cache is RAM and the auxiliary memory is a disk subsystem.
For simplicity, it is assumed that both the cache and the auxiliary memory are managed in discrete, uniformly-sized units called “pages”. If a requested page is present in the cache, then it can be served quickly resulting in a “cache hit”. On the other hand, if a requested page is not present in the cache, then it must be retrieved from the auxiliary memory resulting in a “cache miss”. Usually, latency on a cache miss is significantly higher than that on a cache hit. Hence, caching algorithms focus on improving the hit ratio. Historically, the assumption of “demand paging” has been used to study cache algorithms. Under demand paging, a page is retrieved from the auxiliary memory to the cache only on a cache miss. In other words, demand paging precludes speculatively pre-fetching pages. Under demand paging, the only question of interest is: When the cache is full, and a new page must be inserted in the cache, which page should be replaced?
Digital microprocessors use cache memory to hold data likely to be needed in the near future. Cache memory is comparatively fast and is a local memory. Typically, when data or instructions are retrieved from the main memory for use by the microprocessor, they are also stored in the cache. The cache is usually constructed from a random access, read/write memory block (RAM), which can access a single stored object, referred to as a line, in a single processor cycle. Preferably, the cache is sized to match the processor cycle time so that it can be read or written during a given cycle. A server can be configured to receive a stream of requests from clients in a network system to read from or write to a disk drive in the server. These requests form the “workload” for the server.
Each line in the cache memory contains the data being saved and the address of the data in the main memory (the tag). An example of a simple cache 210 is illustrated in the block diagram of FIG. 1. When the microprocessor makes a reference to the main memory, a part of the reference address, referred to as the index, accesses a single line stored in the cache RAM 212. A “hit” occurs if the tag of the accessed line in the cache 210 matches the reference address of the referenced data. When this happens the cache RAM 212 immediately supplies the line to the microprocessor. However, a “miss” occurs if the tag of the accessed line in the cache 210 does not match the reference address of the referenced data. When this happens the address is sent to the main memory to retrieve the requested line. When the main memory sends the line to the microprocessor, it is written into the cache RAM 212 using the same index as the original look-up, along with its tag. However, because the main memory is much slower than the microprocessor, a delay occurs during this retrieval process.
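The lookup described above can be sketched as follows. This is a minimal illustration rather than the circuit of FIG. 1: the class name DirectMappedCache, the use of a dictionary as a stand-in for main memory, and the derivation of the index as the address modulo the number of lines are all assumptions made for the example. As in the description, each line stores the data together with its main-memory address as the tag.

```python
class DirectMappedCache:
    """Illustrative model of a simple indexed cache (names are hypothetical)."""

    def __init__(self, num_lines, main_memory):
        self.num_lines = num_lines
        self.main_memory = main_memory    # dict: address -> data (stand-in for main memory)
        self.lines = [None] * num_lines   # each entry is (tag, data) or None

    def read(self, address):
        """Return (data, hit) for a reference to `address`."""
        index = address % self.num_lines  # assumed: low-order part of the address
        line = self.lines[index]
        if line is not None and line[0] == address:
            # Hit: the tag of the accessed line matches the reference address.
            return line[1], True
        # Miss: fetch from main memory, then store at the same index with its tag.
        data = self.main_memory[address]
        self.lines[index] = (address, data)
        return data, False


memory = {addr: f"data{addr}" for addr in range(16)}
cache = DirectMappedCache(4, memory)
print(cache.read(5))   # first reference to address 5: miss
print(cache.read(5))   # repeated reference: hit
print(cache.read(9))   # address 9 shares index 1 with address 5: miss, line replaced
```

In this sketch, two addresses that share an index displace one another, so a subsequent reference to address 5 misses again.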
Additionally, cache memory is used when data is written from a host computer to a long-term data storage device such as a disk drive. Here, data may be written to cache memory in which it is temporarily held with an indication that the data must be written to longer term data storage when the data storage system is able to perform this write operation. When cache memory is used to temporarily delay write pending data, memory storage locations are removed from the main memory locations generally available to the data storage system in which data may be held pending use by the host.
Traditionally, under the assumption of demand paging, a cache technique termed least recently used (LRU) has been used. When the cache is full, and a page must be demoted to make space for a new page, LRU removes the least recently used page from the cache. LRU is simple to implement, has low space and time overhead, and captures the “clustered locality of reference” or “recency” property of workloads. However, LRU has two main disadvantages: (i) it does not capture pages with “high frequency” or “long-term utility” and (ii) it is not resistant to scans, which are sequences of one-time-use-only read/write requests.
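By way of illustration only, LRU replacement under demand paging and its sensitivity to scans can be sketched as follows. The class name LRUCache, the method name request, and the page labels are hypothetical and chosen for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Illustrative LRU cache of page identifiers (names are hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # ordered from least to most recently used

    def request(self, page):
        """Return True on a cache hit, False on a cache miss."""
        if page in self.pages:
            self.pages.move_to_end(page)     # mark as most recently used
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        self.pages[page] = None              # demand paging: insert only on a miss
        return False


cache = LRUCache(3)
warmup = [cache.request(p) for p in ["a", "b", "a", "b"]]  # "a" and "b" are reused
scan = [cache.request(p) for p in ["x", "y", "z"]]         # one-time-use scan
print(warmup, scan, cache.request("a"))
```

In this example, the three-page scan evicts both reused pages, so the later request for page "a" is a miss even though "a" had been referenced repeatedly.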
Recently, under the assumption of demand paging, a cache technique termed the Adaptive Replacement Cache (ARC) has been used (Nimrod Megiddo and D. S. Modha, ARC: A Self-tuning, Low Overhead Replacement Cache, Proc. 2nd USENIX Conference on File and Storage Technologies (FAST 03), San Francisco, Calif., 115–130, 2003), the complete disclosure of which is herein incorporated by reference. Comparatively, this caching technique has low computational overhead similar to LRU updating schemes, its space overhead over LRU is negligible, it outperforms LRU for a wide range of workloads and cache sizes, it is self-tuning in that for every workload it dynamically adapts between recency and frequency to increase the hit ratio, and it is scan-resistant, and, hence, avoids cache pollution due to sequential workloads.
The basic idea behind ARC is that the cache is managed in uniform-sized chunks called “pages”. Assuming that the cache can hold c pages, the technique ARC maintains a cache directory that contains 2c pages: c pages in the cache and c history pages. The cache directory of ARC, which is referred to as DBL (database load), maintains two lists: L1 and L2. The first list contains pages that have been seen only once recently, while the latter contains pages that have been seen at least twice recently. The replacement technique for managing DBL is: Replace the LRU page in L1, if |L1|=c; otherwise, replace the LRU page in L2. The ARC technique builds on DBL by carefully selecting c pages from the 2c pages in DBL. The basic idea is to divide L1 into top T1 and bottom B1 and to divide L2 into top T2 and bottom B2. The pages in T1 (resp. T2) are more recent than those in B1 (resp. B2). The algorithm sets a target size p for the list T1. The replacement technique is as follows: Replace the LRU page in T1, if |T1|≧p; otherwise, replace the LRU page in T2. The adaptation comes from the fact that the target size p is continuously varied in response to an observed workload. The adaptation rule is as follows: Increase p, if a hit in the history B1 is observed; similarly, decrease p, if a hit in the history B2 is observed.
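The replacement and adaptation rules summarized above can be sketched as follows. This is a simplified illustration under stated assumptions and is not the published ARC procedure: the class name SimplifiedARC is hypothetical, the target size p is adjusted by a fixed step of one page (the cited paper describes a step that depends on the relative sizes of B1 and B2), the replacement test is the simplified |T1|≧p rule given above, and the directory trimming for a previously unseen page is one possible way to keep L1 at no more than c pages and the directory at no more than 2c pages.

```python
from collections import OrderedDict


class SimplifiedARC:
    """Simplified sketch of ARC-style replacement (assumptions noted in comments)."""

    def __init__(self, c):
        self.c = c      # number of pages the cache can hold
        self.p = 0      # target size for T1
        # Each list is ordered from LRU (first) to MRU (last).
        self.t1, self.b1 = OrderedDict(), OrderedDict()
        self.t2, self.b2 = OrderedDict(), OrderedDict()

    def _replace(self):
        # Evict the LRU page of T1 if |T1| >= p, otherwise the LRU page of T2.
        # The evicted page is remembered in the matching history list.
        if self.t1 and (len(self.t1) >= self.p or not self.t2):
            page, _ = self.t1.popitem(last=False)
            self.b1[page] = None
        else:
            page, _ = self.t2.popitem(last=False)
            self.b2[page] = None

    def request(self, x):
        """Return True on a cache hit, False on a cache miss."""
        if x in self.t1 or x in self.t2:          # hit: page now seen at least twice
            self.t1.pop(x, None)
            self.t2.pop(x, None)
            self.t2[x] = None
            return True
        if x in self.b1:                          # history hit in B1: increase p
            self.p = min(self.p + 1, self.c)      # assumed unit step
            self._replace()
            del self.b1[x]
            self.t2[x] = None
            return False
        if x in self.b2:                          # history hit in B2: decrease p
            self.p = max(self.p - 1, 0)           # assumed unit step
            self._replace()
            del self.b2[x]
            self.t2[x] = None
            return False
        # Page not in the directory: trim the directory, then insert into T1.
        l1 = len(self.t1) + len(self.b1)
        total = l1 + len(self.t2) + len(self.b2)
        if l1 == self.c:
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace()
            else:
                self.t1.popitem(last=False)       # T1 fills the cache; drop its LRU page
        elif total >= self.c:
            if total == 2 * self.c:
                self.b2.popitem(last=False)
            self._replace()
        self.t1[x] = None
        return False


arc = SimplifiedARC(2)
print([arc.request(x) for x in ["a", "b", "a", "c", "d", "a", "c"]], arc.p)
```

In this trace, page "a" is promoted to T2 on its second reference and survives the one-time requests for "c" and "d", whereas an LRU cache of the same size would have evicted it. The final request for "c" is a history hit in B1, which raises the target size p.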
However, a limitation of ARC is that whenever it observes a hit on a page in L1=T1∪B1, it immediately promotes the page to L2=T2∪B2 because the page has now been recently seen twice. At upper levels of the memory hierarchy, ARC often observes two or more successive references to the same page within a short interval. Such quick successive hits are known as “correlated references” and are not a guarantee of long-term utility of a page, and, hence, such pages pollute L2, thus reducing system performance. Therefore, there is a need to create a temporal filter that imposes a more stringent test for promotion from L1 to L2. Such a temporal filter is of extreme importance in upper levels of the memory hierarchy such as file systems, virtual memory, databases, etc.
The below-referenced U.S. patents disclose embodiments that were satisfactory for the purposes for which they were intended. The disclosures of the below-referenced prior U.S. patents, in their entireties, are hereby expressly incorporated by reference into the present invention for purposes including, but not limited to, indicating the background of the present invention and illustrating the state of the art.
U.S. Pat. No. 5,649,156 issued to Vishlitzky et al. discloses a caching method, which determines whether data should be maintained in short term cache depending on how often it has been accessed. U.S. Pat. No. 6,078,995 issued to Bewick et al. discloses a cache memory system, which uses bits to store whether data has been recently accessed. U.S. Patent Publication No. 2003/0105926 discloses a cache memory system which handles/adapts to a variable workload. However, as mentioned, a novel adaptive cache replacement technique having enhanced temporal filtering capabilities is needed.