A caching server or proxy server is a machine operating within a network, such as the Internet, to retain copies of relevant content closer to the users requesting that content. Typically, the caching server dynamically determines what content to cache by intercepting or otherwise receiving requests from users. The caching server retrieves the content to be cached from one or more source servers that are located farther away from the users serviced by the caching server. Because of its closer proximity to the users, the caching server is able to accelerate delivery of the cached content to them.
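The request flow described above can be sketched as a read-through cache. This is a minimal illustration only, not an implementation taken from this document: the `CachingServer` class, the `fake_origin` function, and the example path are hypothetical names, and the sketch ignores capacity limits, expiry, and networking.

```python
from typing import Callable, Dict, List


class CachingServer:
    """Toy read-through cache: serve locally if possible, else fetch from origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # Hypothetical callable standing in for a request to a distant source server.
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def handle_request(self, url: str) -> bytes:
        # Cache hit: respond directly from the local copy.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Cache miss: retrieve from the source server, keep a copy, then respond.
        self.misses += 1
        content = self._fetch_from_origin(url)
        self._store[url] = content
        return content


# Record every call that actually reaches the (simulated) source server.
origin_calls: List[str] = []


def fake_origin(url: str) -> bytes:
    origin_calls.append(url)
    return f"content of {url}".encode()


server = CachingServer(fake_origin)
first = server.handle_request("/video/intro.mp4")   # miss, fetched from origin
second = server.handle_request("/video/intro.mp4")  # hit, served from cache
```

Only the first request reaches the simulated source server; the repeat request is answered from the local copy, which is where the acceleration comes from.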
The amount of content that any caching server can cache locally is limited by the server's memory or disk capacity. As the cache fills, the caching server optimizes the cache by determining what content should remain in cache and what content should be replaced. The better the caching server is able to optimize its cache, the greater the percentage of requests it will be able to respond to directly from cache.
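As one illustration of a replacement decision, the sketch below uses a least-recently-used (LRU) policy with a byte budget. LRU is an assumption chosen for familiarity; the text above does not name a particular policy. The `LRUCache` class and its method names are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Byte-bounded cache that evicts the least recently used entry first."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.requests = 0

    def get(self, key: str) -> Optional[bytes]:
        self.requests += 1
        if key not in self._items:
            return None
        # Mark the entry as most recently used.
        self._items.move_to_end(key)
        self.hits += 1
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        if len(value) > self.capacity_bytes:
            return  # Object can never fit; do not cache it.
        if key in self._items:
            # Replacing an entry: release its old size first.
            self.used_bytes -= len(self._items.pop(key))
        # Evict from the least recently used end until the new object fits.
        while self.used_bytes + len(value) > self.capacity_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
        self._items[key] = value
        self.used_bytes += len(value)

    def hit_ratio(self) -> float:
        return self.hits / self.requests if self.requests else 0.0


cache = LRUCache(capacity_bytes=10)
cache.put("a", b"12345")
cache.put("b", b"12345")
result_a = cache.get("a")   # hit; "a" becomes most recently used
cache.put("c", b"12345")    # cache is full, so "b" (least recently used) is evicted
result_b = cache.get("b")   # miss
result_c = cache.get("c")   # hit
```

The `hit_ratio` method corresponds to the percentage of requests answered directly from cache; a better replacement policy for a given workload raises that figure without adding capacity.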
Content delivery networks (CDNs) deploy several such caching servers to different points-of-presence (PoPs). Each PoP accelerates content delivery to users in the one or more geographic regions located closest to that PoP. The scale of a CDN allows it to cache terabytes of content in each of the regions, thereby providing accelerated content delivery for hundreds or thousands of content providers and their content objects. However, this scale also amplifies the need for the CDN to optimize every aspect of caching, because any caching inefficiency is multiplied across the entire CDN. For instance, a caching optimization that provides a one-millisecond performance improvement can add up to hours of improved performance across the CDN over time. Accordingly, there is a need to optimize caching and improve caching performance.
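The aggregation argument can be checked with simple arithmetic. The sketch below assumes the one-millisecond saving applies to each request, and the request count is a hypothetical round number chosen only to make the conversion easy to follow; it is not a figure from this document.

```python
def aggregate_savings_hours(saving_ms_per_request: float, requests: int) -> float:
    """Convert a per-request saving in milliseconds into total hours saved."""
    total_ms = saving_ms_per_request * requests
    total_seconds = total_ms / 1000
    return total_seconds / 3600


# Hypothetical volume: 3,600,000 requests, each 1 ms faster.
hours = aggregate_savings_hours(1.0, 3_600_000)
```

Under these assumptions, 3.6 million requests that are each one millisecond faster amount to one hour of cumulative saved time, and the total grows linearly with request volume.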