Field
Embodiments of the present invention generally relate to cache management. In particular, embodiments of the present invention relate to a cache replacement policy that takes into consideration one or more factors relating to replacement cost of the data currently residing in the cache, including, but not limited to, the latency between the cache and the origin server, the processing time for the origin server to produce the data and the amount of time for the cache to create a cache entry and store the data.
The factors of the replacement cost may involve the data being currently returned (and under consideration for storage in the cache), data which already resides in the cache and is a candidate for replacement, or other data which exists in the cache and is not a candidate for replacement, but may influence the decision nonetheless.
Description of the Related Art
Cache memory plays an important role in the performance of any computing/client device and/or network system. In a typical computing device, a cache memory is implemented to interface between processing units and the main memory (e.g., Random Access Memory (RAM)) to achieve faster access to data stored in the main memory. Similarly, in a network system/architecture, a computer network device, such as an Application Delivery Controller (ADC), that implements a Hypertext Transfer Protocol (HTTP) cache, can act as an interface between client devices that request static content and the one or more web servers on which that content resides. The HTTP cache stores previously requested static content so that a subsequent request for such content by the same or a different client can be served faster. In a network system environment, responsive to receiving a request for data from a client device, the network device first checks the cache system to determine whether the requested data is locally available within the cache memory. If the requested data is found within the cache memory (commonly referred to as a cache hit), the client request can be serviced by the network device without contacting the origin server. Alternatively, if the requested data is not available in the cache (commonly referred to as a cache miss), the network device retrieves the data from the server and potentially also caches it locally. As there may be several clients requesting different data, an HTTP cache generally becomes full over time, and hence needs to evict data associated with one or more cache entries to make room for the newly requested data.
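The hit/miss flow described above can be illustrated with a minimal sketch. It is not drawn from any particular ADC implementation: the class name `SimpleHttpCache`, the `fetch_from_origin` callback, and the entry-count capacity are all hypothetical, and the eviction step simply discards the oldest inserted entry as a placeholder for a real replacement policy.

```python
from typing import Callable, Dict


class SimpleHttpCache:
    """Toy URL-keyed cache; every name here is an illustrative assumption."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch = fetch_from_origin   # stands in for a request to the origin server
        self._capacity = capacity         # maximum number of cache entries (assumed unit)
        self._entries: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        # Cache hit: serve locally without contacting the origin server.
        if url in self._entries:
            self.hits += 1
            return self._entries[url]

        # Cache miss: retrieve the data from the origin server.
        self.misses += 1
        body = self._fetch(url)

        # If the cache is full, evict one entry to make room.
        # Placeholder policy: drop the oldest inserted entry
        # (dicts preserve insertion order in Python 3.7+).
        if len(self._entries) >= self._capacity:
            victim = next(iter(self._entries))
            del self._entries[victim]

        self._entries[url] = body
        return body

    def hit_rate(self) -> float:
        # Fraction of requests that were served from the cache.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In this sketch the choice of which entry to evict is isolated in a single step, which is the point at which a replacement policy would be applied.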
Various caching algorithms/policies, also referred to as replacement algorithms/policies, have evolved over time to determine which of the existing cached data should be retained and/or replaced with new data when the cache is full. Performance of a cache system is typically measured based on the hit rate of the cache, wherein the hit rate describes how often requested data is found in the cache. Another performance parameter for a cache system is the latency of the cache, which describes how long the cache takes to return desired data after the data is requested. Existing cache replacement policies reflect various tradeoffs between hit rate and latency.
Examples of existing caching algorithms include Belady's algorithm, Least Recently Used (LRU), Most Recently Used (MRU), Pseudo-LRU, Random Replacement, Segmented LRU, Least Frequently Used (LFU), and the largest data item algorithm. When a request for data that is not available in a full cache is received, the data associated with a cache entry selected by one of these algorithms is replaced. For example, when a request for data that is not in the cache is received, the largest existing item can be selected for replacement. Alternatively, cached data that is least frequently used, oldest, or most recently used can be replaced.
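As one concrete instance of the conventional policies listed above, LRU replacement can be sketched as follows. This is a minimal illustration using Python's standard `collections.OrderedDict`; the class name `LRUCache`, its entry-count capacity, and the convention that `put` returns the evicted key are assumptions made for the example.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Illustrative least-recently-used cache; names are hypothetical."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        # Ordered from least recently used (front) to most recently used (back).
        self._entries: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._entries:
            return None                      # cache miss
        self._entries.move_to_end(key)       # mark as most recently used
        return self._entries[key]

    def put(self, key: Hashable, value: Any) -> Optional[Hashable]:
        """Store a value; return the evicted key, or None if nothing was evicted."""
        evicted = None
        if key in self._entries:
            self._entries.move_to_end(key)   # refresh recency of an existing entry
        elif len(self._entries) >= self._capacity:
            # Full cache: evict the least recently used entry (front of the order).
            evicted, _ = self._entries.popitem(last=False)
        self._entries[key] = value
        return evicted
```

Under the same ordering, evicting from the back of the order instead (`popitem(last=True)`) would approximate MRU replacement. Note that the sketch selects a victim purely by recency of use and takes no account of how costly the evicted data would be to retrieve again.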
All of these conventional caching algorithms maintain coherence at the granularity of a cache entry. However, as cache sizes have become larger, the efficacy of these caching algorithms has decreased. Inefficiencies arise from storing large amounts of data and from replacing cached data that may take a long time to retrieve again from a server.
Therefore, there is a need for systems and methods that provide cache performance improvements.