1. Field of the Invention
This invention relates to computer systems, and particularly to systems and methods for multi-level exclusive caching using hints.
2. Description of Background
Caching can be used to hide the long latency of access to slower but relatively cheaper non-volatile storage, such as hard drives. Today, access to stored data is organized in a hierarchical fashion, with each layer provided with its own cache. For example, a particular software application may request data from a hard drive. However, the data may have to pass through multiple levels of cache before reaching the application. Most of the time, the caches at the multiple levels operate unaware of the other caches in the hierarchy and thereby forgo the performance benefits that can be achieved by mutual collaboration between the multiple levels of caches.
As an illustration, a client-server configuration can include a storage server cache, a database server cache and a client cache. Each server can include a cache management scheme that is independent of, and different from, that of the other server. In most cases, such independent cache management schemes result in lower levels of cache housing data elements that the higher levels (closer to the clients) already house. Such common elements are typically never accessed from the lower level as long as the higher level keeps a copy of the elements, and thus occupy cache space uselessly. In general, the more the contents of multiple caches overlap, the more cache space is wasted. The problem of overlapping caches is generally referred to as the problem of inclusion. In addition, a miss in two caches, for example, costs at least an order of magnitude more than a hit in either of the levels individually, because access to the disks is extremely slow compared to network delays and memory accesses.
One example of a common read cache management algorithm is the Least Recently Used (LRU) replacement algorithm. This cache management algorithm works well in the scenario of a single level of read cache. However, in cases where there is more than one level of cache of non-trivial size, the LRU algorithm can be wasteful. For example, in a two-level read cache scenario in which both levels have a cache of size ‘s’, the level closer to the client leverages temporal locality and provides cache hits as is expected from the LRU algorithm. The lower level of the two levels, however, typically receives very few hits, as most of its contents are duplicates of data already present in the higher level. Thus, the lower level of cache performs sub-optimally. Caches are a very expensive component of today's servers, and it is critical to avoid such waste whenever possible. It is desirable to increase the aggregate hit ratio in the caches, as a hit in any cache is considerably faster than a disk access, thereby reducing the average response time. In addition, it is desirable to increase the hit ratio in a higher-level cache, as a hit on a higher level is faster than a hit on a lower level, thereby decreasing the average response time.
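The two-level scenario described above can be illustrated with a minimal simulation sketch. It assumes two independent LRU caches of equal size, with the lower level seeing only the requests that miss in the higher level and neither level informing the other of its contents. The names `LRUCache`, `simulate` and `duplicated_blocks` are hypothetical and chosen for illustration only; the sketch is not taken from any particular system.

```python
from collections import OrderedDict


class LRUCache:
    """Illustrative single-level LRU read cache (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # least recently used block comes first
        self.hits = 0
        self.lookups = 0

    def access(self, block):
        """Return True on a hit; on a miss, insert the block and evict the LRU block if full."""
        self.lookups += 1
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return True
        self.blocks[block] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


def simulate(trace, size):
    """Run a read trace through two independent LRU levels of equal size."""
    upper = LRUCache(size)  # level closer to the client
    lower = LRUCache(size)  # level closer to the disks
    for block in trace:
        if not upper.access(block):
            # Only misses in the higher level reach the lower level.
            lower.access(block)
    return upper, lower


def duplicated_blocks(upper, lower):
    """Count blocks held in both levels at once (the inclusion overlap)."""
    return len(set(upper.blocks) & set(lower.blocks))
```

For a given read trace, `upper.hits / upper.lookups` and `lower.hits / lower.lookups` give the per-level hit ratios, and `duplicated_blocks(upper, lower)` reports how much of the lower level duplicates the higher level at the end of the trace. The sketch deliberately omits any hint or demotion mechanism between the levels, so it models only the uncoordinated case described in this background.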