Content caching is used by many networks and service providers to optimize the delivery of digital content. Copies of the same content are cached by different servers distributed throughout a network. When a request for the cached content comes in, the request can be routed to the optimal server that serves the requested content in the most expedient manner. The optimal server can be determined based on various criteria including geographic proximity to the user, load, available resources, etc.
Caching servers have a finite amount of storage that can be used as cache. When the cache of a particular caching server is full, already cached content is evicted from cache in order to make room for caching new content.
The eviction operation significantly affects the cache-hit ratio and performance of the caching server. Evicting an already cached item and caching a new item in its place both involve write operations. These write operations significantly slow the ability of the caching server to respond to a request relative to the reads associated with fulfilling requests for already cached items. Moreover, each time the caching server evicts an item that is subsequently requested, the caching server loses performance by having to retrieve the item from an origin server before the item can be served in response to the request, whereas if the item had been retained in cache, the retrieval operation could have been avoided. Accordingly, caching servers are configured with cache replacement methodologies to manage the selection and timing of cached content eviction.
Least recently used (LRU) and least frequently used (LFU) are two common cache replacement methodologies. Under LRU, content is evicted from cache according to recency of requests. Under LFU, content is evicted from cache according to frequency of requests.
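The difference between the two methodologies can be illustrated with a minimal sketch. The class names `LRUCache` and `LFUCache`, their methods, and the item-count capacity are assumptions made for illustration only and are not drawn from any particular caching server; a capacity of at least one item is assumed.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Illustrative LRU cache: evicts the item requested least recently."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of items (assumed >= 1)
        self.items = OrderedDict()    # key -> value, least recent first

    def get(self, key):
        if key not in self.items:
            return None               # cache miss
        self.items.move_to_end(key)   # mark as most recently requested
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # evict least recently requested
        self.items[key] = value


class LFUCache:
    """Illustrative LFU cache: evicts the item requested least often."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of items (assumed >= 1)
        self.items = {}               # key -> value
        self.counts = Counter()       # key -> number of requests seen

    def get(self, key):
        if key not in self.items:
            return None               # cache miss
        self.counts[key] += 1
        return self.items[key]

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            # Evict the key with the lowest request count.
            victim = min(self.items, key=lambda k: self.counts[k])
            del self.items[victim]
            del self.counts[victim]
        self.items[key] = value
        self.counts[key] += 1


def run(cache):
    """Apply the same request sequence to a cache and return surviving keys."""
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")
    cache.get("a")
    cache.get("b")
    cache.put("c", 3)                 # cache is full, so one item is evicted
    return sorted(cache.items)


lru_keys = run(LRUCache(2))
lfu_keys = run(LFUCache(2))
print(lru_keys, lfu_keys)
```

For the same request sequence, the LRU sketch evicts "a" because "b" was requested more recently, while the LFU sketch evicts "b" because "a" was requested more often.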
From a holistic perspective, these cache replacement methodologies appear to adequately manage the cache storage while being lightweight so as to not create a bottleneck during times of high demand. However, closer analysis reveals various inefficiencies with these prior art methodologies.
It is possible under these methodologies for a few content providers to disproportionately consume the cache at the expense of other content providers. Consequently, only a few content providers benefit from the efficiencies afforded by the cache, while other content providers receive little or no such benefit. These inefficiencies primarily arise because the prior art cache replacement methodologies treat all content and all content providers the same. As a result, a first content provider with ten times more content than a second content provider could have a tenfold greater cache footprint than the second content provider even when the content of the second content provider is served from the cache more often than the content of the first content provider. Similarly, a first content provider with large sized content could have a larger cache footprint than a second content provider with small sized content even when the small sized content is served from the cache more often than the large sized content.
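The disproportionate footprint described above can be reproduced with a small simulation. The following is a sketch under stated assumptions: the provider labels "A" and "B", the catalog sizes (100 and 10 items), the 44-item cache capacity, and the alternating request pattern are all hypothetical values chosen for illustration, and cache size is counted in items rather than bytes.

```python
from collections import Counter, OrderedDict

CAPACITY = 44            # hypothetical cache size, in items

cache = OrderedDict()    # (provider, item) -> None, least recent first
hits = Counter()         # provider -> requests served from cache


def request(provider, item):
    """Serve one request through a plain LRU cache shared by all providers."""
    key = (provider, item)
    if key in cache:
        hits[provider] += 1
        cache.move_to_end(key)           # mark as most recently requested
        return
    if len(cache) >= CAPACITY:
        cache.popitem(last=False)        # evict least recently requested
    cache[key] = None                    # cache the newly fetched item


# Provider A cycles through 100 items; provider B cycles through 10 items.
for i in range(1000):
    request("A", i % 100)
    request("B", i % 10)

# Count how many cached items belong to each provider.
footprint = Counter(provider for provider, _ in cache)
print("hits:", dict(hits), "footprint:", dict(footprint))
```

In this run, provider A ends up occupying 34 of the 44 cache slots without a single cache hit, while provider B occupies 10 slots and accounts for all 990 hits. Provider A's items are always evicted before they are requested again, yet they still displace most of the cache because the replacement policy does not distinguish between providers.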
Accordingly, there is a need to better manage how and what content is evicted from cache. To this end, there is a need for cache replacement methodologies that are not based on and guided by a single factor or criterion. There is a further need to provide differentiated and configurable access to the cache while preserving a lightweight and scalable implementation.