Content delivery networks (CDNs) have greatly improved the way content is transferred across data networks such as the Internet. One way a CDN accelerates the delivery of content is to reduce the distance that content travels in order to reach a destination. To do so, the CDN strategically locates surrogate origin servers, also referred to as caching servers or edge servers, at various points-of-presence (PoPs) that are geographically proximate to large numbers of end users. The CDN then utilizes a traffic management system to route requests for content hosted by the CDN to the caching server that can optimally deliver the requested content to the requesting end user. As used hereafter, optimal delivery of content refers to the most efficient available means with which content can be delivered from a server to an end user machine over a data network. Optimal delivery of content can be quantified in terms of latency, jitter, packet loss, distance, and overall end user experience.
Determination of the optimal caching server may be based on geographic proximity to the requesting end user as well as other factors such as load, capacity, and responsiveness of the caching servers. The optimal caching server delivers the requested content to the requesting end user in a manner that is more efficient than when origin servers of the content provider deliver the requested content. For example, a CDN may locate caching servers in Los Angeles, Dallas, and New York. These caching servers may cache content that is published by a particular content provider with an origin server in Miami. When a requesting end user in San Francisco submits a request for the published content, the CDN will deliver the content from the Los Angeles caching server on behalf of the content provider rather than from the origin server in Miami, which would require the content to travel a much greater distance. In this manner, the CDN reduces the latency, jitter, and amount of buffering that is experienced by the requesting end user. The CDN also allows the content provider to offload infrastructure costs, configuration management, and maintenance to the CDN while being able to rapidly scale resources as needed. Content providers can therefore devote more time to the creation of content and less time to the creation of an infrastructure that delivers the created content to the end users. As a result of these and other benefits, many different CDNs are in operation today. Edgecast, Akamai, Limelight, and CDNetworks are some examples of operating CDNs.
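The selection described above can be illustrated with a minimal sketch. It is not the traffic management system of any particular CDN. The coordinates are approximate, and the function names, the `load` mapping, and the `load_weight_km` parameter are hypothetical, introduced only to show how geographic proximity might be combined with one other factor such as load.

```python
import math

# Approximate (latitude, longitude) in degrees; illustrative values only.
CACHING_SERVERS = {
    "Los Angeles": (34.05, -118.24),
    "Dallas": (32.78, -96.80),
    "New York": (40.71, -74.01),
}
ORIGIN_MIAMI = (25.76, -80.19)
END_USER_SAN_FRANCISCO = (37.77, -122.42)

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def select_caching_server(user, servers, load=None, load_weight_km=0.0):
    """Return the name of the server with the lowest score.

    Score = distance to the user + load_weight_km * utilization.
    `load` maps server name -> utilization in [0, 1]. This weighting is an
    assumption made for illustration, not a method taken from the text.
    """
    load = load or {}

    def score(name):
        distance = great_circle_km(user, servers[name])
        return distance + load_weight_km * load.get(name, 0.0)

    return min(servers, key=score)


if __name__ == "__main__":
    # Proximity only: the nearest caching server is chosen.
    print(select_caching_server(END_USER_SAN_FRANCISCO, CACHING_SERVERS))
    # Proximity plus load: a fully loaded nearby server can lose to a farther one.
    print(select_caching_server(
        END_USER_SAN_FRANCISCO, CACHING_SERVERS,
        load={"Los Angeles": 1.0}, load_weight_km=5000.0))
```

With proximity alone, the San Francisco end user is routed to the Los Angeles caching server, which is far closer than the Miami origin. When Los Angeles is assumed to be fully loaded and load is weighted heavily, the sketch selects Dallas instead, reflecting the role of the other factors named above.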
A continual goal of any CDN is to improve the speed with which the CDN delivers content on behalf of its content provider customers. An obvious way to improve CDN performance is to scale the CDN's resources. For example, the CDN can deploy additional PoPs to locate more edge servers closer to different groups of end users. Similarly, the CDN can deploy more expensive solid state disks (SSDs) in its caching servers instead of less expensive magnetic disks in order to improve the responsiveness of its caching servers. However, the tradeoff for improving CDN performance by increasing resources is the increased cost associated with deploying said resources. Also, resource scaling provides diminishing returns on infrastructural investments once the CDN reaches a certain size.
To avoid these costs, CDN operators look for ways to derive improved performance out of already deployed resources. Content caching is a particular area of focus, because improvements to how and what content is cached can directly translate to improved CDN performance. Such improvements can yield a more efficient usage of the CDN's finite cache, resulting in more content being served from cache rather than being retrieved and served, with greater delay, from a more distant origin server. Accordingly, there is a need for more efficient techniques with which to cache content at the CDN edge servers as well as to maintain, validate, and serve such cached content.