1. Field of Invention
The present invention relates generally to cache management, and more specifically to cache management using historical access information to determine which items to store in the cache.
2. Background of Invention
Many systems repeatedly access items from a set of stored data. The set of data items is commonly stored in persistent memory, such as on a magnetic drive. To speed up the access process, the accessing system often stores a subset of the data items in a cache in faster memory, such as random access memory, based upon the amount of space available therein.
Determining which items from the data set to store in the cache is a complicated problem. In the prior art, where the cache is full and the system accesses an uncached item, the system typically determines an existing item in the cache to overwrite with the accessed item. One prior art technique is to overwrite the least recently accessed cached item (this technique is known as LRU).
LRU appears to make sense on its face, but does not produce desirable results under all circumstances. Imagine a scenario in which a cache holds n items, and a system is repeatedly accessing a series of n+1 items in the order 1, 2 . . . n+1. This scenario could be, for example, a video player repeating a loop of n+1 frames. The player would access and cache items 1 through n, thereby filling the cache. The next item accessed would be n+1, which would be stored in the cache by overwriting item 1, the least recently accessed item in the cache. However, the next item the system would access after n+1 would be item 1, which would no longer be in the cache, and thus would have to be accessed from slow memory and added to the cache by overwriting item 2. Because item 2 would then be needed, the system would have to retrieve it from slow memory, and so on ad infinitum, with the system never actually accessing an item from the cache. Of course this is the worst case scenario, but other, less severe scenarios exist in which LRU still results in inefficient cache utilization.
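The looping scenario above can be reproduced with a short simulation. The following is a minimal illustrative sketch, not part of any described system: the function name simulate_lru, the choice of n = 4, and the number of loop repetitions are all assumptions made for the example. It assumes every missed item is added to the cache and that eviction happens only when the cache is full.

```python
from collections import OrderedDict


def simulate_lru(capacity, accesses):
    """Count cache hits for an access sequence under LRU replacement.

    Hypothetical helper for illustration: every miss inserts the item,
    evicting the least recently accessed entry when the cache is full.
    """
    cache = OrderedDict()  # keys ordered from least to most recently accessed
    hits = 0
    for item in accesses:
        if item in cache:
            hits += 1
            cache.move_to_end(item)  # mark as most recently accessed
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # overwrite the least recently accessed item
            cache[item] = True
    return hits


n = 4                                  # assumed cache size for the example
loop = list(range(1, n + 2)) * 10      # items 1 .. n+1, repeated 10 times
lru_hits_on_loop = simulate_lru(n, loop)
```

Under these assumptions, lru_hits_on_loop is 0: each item is evicted just before it is needed again, so no access is ever served from the cache.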
In another prior art technique, the system overwrites the most recently accessed item in the cache (this technique is known as MRU). As one can see, this would avoid the worst case scenario for LRU described above, but can produce inefficient cache utilization under other circumstances. For example, suppose that the cache is full and the video player described above is replaying a frame x and its previous frame x−1 multiple times (e.g., during an editing session). The system would overwrite the most recently accessed frame x−1 with the currently accessed frame x, then overwrite the most recently accessed frame x with the currently accessed frame x−1, and so on cyclically. The system would thus never utilize the benefit of cache access, but would instead always access x and x−1 from slow memory.
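The MRU behavior described above can likewise be checked with a small simulation. This is an illustrative sketch only: the function name simulate_mru, the cache size of 4, and the frame numbers 10 and 11 (standing in for frames x−1 and x) are assumptions chosen for the example. As before, it assumes every missed item is added to the cache and that eviction happens only when the cache is full.

```python
def simulate_mru(capacity, accesses):
    """Count cache hits for an access sequence under MRU replacement.

    Hypothetical helper for illustration: every miss inserts the item,
    evicting the most recently accessed entry when the cache is full.
    """
    cache = []  # ordered from least to most recently accessed
    hits = 0
    for item in accesses:
        if item in cache:
            hits += 1
            cache.remove(item)
            cache.append(item)  # mark as most recently accessed
        else:
            if len(cache) >= capacity:
                cache.pop()  # overwrite the most recently accessed item
            cache.append(item)
    return hits


n = 4  # assumed cache size for the example

# Looping over n+1 items: MRU keeps most of the loop resident and gets hits.
mru_hits_on_loop = simulate_mru(n, list(range(1, n + 2)) * 10)

# Fill the cache with frames 1..n, then alternate between two uncached
# frames (10 and 11 stand in for x-1 and x): each evicts the other.
mru_hits_on_pair = simulate_mru(n, list(range(1, n + 1)) + [10, 11] * 10)
```

Under these assumptions, mru_hits_on_loop is greater than zero, while mru_hits_on_pair is 0, since the two alternating frames continually overwrite one another.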
A more advanced prior art method determines whether all of the working data will fit in the cache, and utilizes LRU if so and MRU if not. This method, while better than either LRU or MRU on its own, still exhibits some of the inefficiencies and shortcomings of both methods under certain circumstances. Furthermore, prior art techniques such as MRU and LRU, whether alone or in combination, do not take into account historical patterns of access requests over time when deciding which cache item to overwrite. Additionally, these methods only consider data in the cache, as opposed to all accessed data whether currently residing in the cache or not. The methods therefore necessarily omit relevant information when managing a cache, and typically suffer in efficiency as a result.
What is needed are methods, systems and computer program products that utilize historical access information concerning historically accessed data items in order to robustly manage a cache.