Content delivery networks (CDNs) have greatly improved the way content is transferred across data networks such as the Internet. One way a CDN accelerates the delivery of content is to reduce the distance that content travels in order to reach a destination. To do so, the CDN strategically locates surrogate origin servers, also referred to as caching servers or edge servers, at various points-of-presence (PoPs) that are geographically proximate to large numbers of end users. The CDN then utilizes a traffic management system to route requests for content hosted by the CDN to the caching server that can optimally deliver the requested content to the requesting end user. As used hereafter, optimal delivery of content refers to the most efficient available means by which content can be delivered from a server to an end user machine over a data network. Optimal delivery of content can be quantified in terms of latency, jitter, packet loss, distance, and overall end user experience.
Determination of the optimal caching server may be based on geographic proximity to the requesting end user as well as other factors such as load, capacity, and responsiveness of the caching servers. The optimal caching server delivers the requested content to the requesting end user in a manner that is more efficient than when origin servers of the content provider deliver the requested content. For example, a CDN may locate caching servers in Los Angeles, Dallas, and New York. These caching servers may cache content that is published by a particular content provider with an origin server in Miami. When a requesting end user in San Francisco submits a request for the published content, the CDN will deliver the content from the Los Angeles caching server on behalf of the content provider rather than from the origin server in Miami, which would require the content to travel a much greater distance. In this manner, the CDN reduces the latency, jitter, and amount of buffering that is experienced by the requesting end user. The CDN also allows the content provider to offload infrastructure costs, configuration management, and maintenance to the CDN while being able to rapidly scale resources as needed. Content providers can therefore devote more time to the creation of content and less time to the creation of an infrastructure that delivers the created content to the end users. As a result of these and other benefits, many different CDNs are in operation today. Edgecast, Akamai, Limelight, and CDNetworks are some examples of operating CDNs.
CDNs differentiate themselves on the basis of cost and performance. One area in which CDNs strive to improve in terms of cost and performance is caching. However, it is often the case that improved caching performance comes with increased costs. For example, a CDN can deploy additional storage to each of its caching servers at added cost in order to increase the amount of available cache at each of its caching servers. Similarly, the CDN can deploy more expensive solid state disks (SSDs) in its caching servers instead of cheaper magnetic disks in order to improve responsiveness of its caching servers.
To avoid these tradeoffs in cost and performance, CDNs and other cache operators are continually in search of new caching techniques, devices, etc. that improve caching performance without added cost. One such area of focus is the efficiency with which existing caching servers cache content.
CDNs typically utilize first hit caching to determine when to cache content. First hit caching has been preferred because of its simplicity and relatively good performance. When performing first hit caching, a caching server will retrieve requested content from an origin, pass the retrieved content to the requesting end user, and store the content to local cache when the content is requested for the first time. The next time that content is requested, the caching server will retrieve and serve the content from its local cache rather than from the origin.
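By way of illustration only, the first hit caching behavior described above may be sketched as follows. The class name FirstHitCache, the fetch_from_origin callback, and the in-memory dictionary standing in for the local cache are assumptions made for this sketch and are not drawn from any particular caching server implementation.

```python
from typing import Callable, Dict


class FirstHitCache:
    """Illustrative first hit cache: every miss results in a cache write."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin       # hypothetical origin retrieval callback
        self._store: Dict[str, bytes] = {}    # stands in for the local cache storage
        self.origin_fetches = 0
        self.cache_writes = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Content was requested before: serve it from local cache.
            return self._store[url]
        # First request for this content: retrieve it from the origin,
        # store it to local cache, and pass it to the requester.
        body = self._fetch(url)
        self.origin_fetches += 1
        self._store[url] = body
        self.cache_writes += 1
        return body
```

In this sketch, a second call to get() with the same URL is served from the dictionary without contacting the origin, while every distinct URL, however rarely requested, incurs one write.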
However, first hit caching performance is greatly affected by caching of “long-tail” content. As a result, first hit caching yields suboptimal resource utilization. FIG. 1 illustrates the long-tail distribution of content for purposes of explaining its impact on first hit caching.
In FIG. 1, the x-axis represents content that is requested at a caching server over an interval of time. The y-axis represents the number of requests for each item of content during that interval of time. As shown, some percentage of “hot” content 110 is requested frequently and some percentage of content, also referred to as the “long-tail” content 120, is infrequently requested (i.e., once or a small number of times). A caching server performing first hit caching caches all such content the first time it is requested. In so doing, caching servers with scarce cache availability may replace hot content with long-tail content in cache. This in turn increases cache miss rates. This issue can be resolved with added cost to the caching server operator by increasing the available storage at each caching server. Doing so, however, introduces other inefficiencies and performance degradations that result from caching of long-tail content. Specifically, long-tail content is rarely, if ever, served from cache. Consequently, a caching server wastes resource-intensive write operations to cache long-tail content and to purge long-tail content from cache when the content expires. Such extraneous write operations could potentially degrade the responsiveness of the caching server by introducing delay when having to respond to other operations. Such extraneous write operations also reduce the ability of the caching server to handle greater loads and reduce the useful life of the storage hardware at the caching server. Specifically, magnetic disk drives are more likely to suffer mechanical failure sooner and SSDs are more likely to suffer from failing memory cells sooner when performing the extraneous writes associated with caching the long-tail content. Further still, increased disk fragmentation results at the caching server because of the additional writing and purging of the long-tail content. Such disk fragmentation has been shown to slow access to content and thereby degrade caching performance.
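The displacement of hot content by long-tail content may be illustrated with a small simulation. The sketch below assumes a least recently used (LRU) eviction policy, a cache capacity of two items, and a synthetic request trace; none of these specifics are taken from FIG. 1, and the function name simulate_first_hit_lru is hypothetical.

```python
from collections import OrderedDict
from typing import Iterable, Tuple


def simulate_first_hit_lru(requests: Iterable[str], capacity: int) -> Tuple[int, int, int]:
    """Return (hits, writes, evictions) for first hit caching with LRU eviction."""
    cache: OrderedDict = OrderedDict()
    hits = writes = evictions = 0
    for item in requests:
        if item in cache:
            cache.move_to_end(item)        # refresh recency on a cache hit
            hits += 1
            continue
        if len(cache) >= capacity:
            cache.popitem(last=False)      # evict the least recently used item
            evictions += 1
        cache[item] = None                 # first hit caching: write on every miss
        writes += 1
    return hits, writes, evictions


# Synthetic trace (assumption): two hot items interleaved with one-off tail items.
mixed_trace = []
for i in range(4):
    mixed_trace += ["hot-a", "hot-b", f"tail-{2 * i}", f"tail-{2 * i + 1}"]

# The same hot requests without any long-tail content.
hot_only_trace = ["hot-a", "hot-b"] * 4

print(simulate_first_hit_lru(mixed_trace, capacity=2))     # (0, 16, 14)
print(simulate_first_hit_lru(hot_only_trace, capacity=2))  # (6, 2, 0)
```

Under these assumptions, the one-off items evict both hot items before they are requested again, so every request misses and triggers a write, whereas the same hot requests alone incur two writes and no evictions.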
To avoid these and other shortcomings associated with first hit caching and, more specifically, the shortcomings associated with caching long-tail content, some CDNs have utilized second hit caching or multi-hit caching, which cache content only after it has been requested two or more times. This avoids caching some of the long-tail content that is requested only once or a few times. However, these multi-hit caching techniques suffer from other shortcomings that reintroduce the tradeoff between performance and cost. Some such shortcomings include increased processor and memory overhead needed to track content request counts, to track when to cache content, and to track what content has been cached. For example, some existing multi-hit caching techniques store the uniform resource locators (URLs) or textual names of the content being requested in conjunction with the number of times that content is requested, thereby imposing onerous memory overhead. As another example, some existing multi-hit caching techniques identify whether content is cached or has been previously requested one or more times with a sorted list or similar structure, where searching such a structure imposes O(log n) complexity and onerous processing overhead as a result. These inefficiencies and overhead increase latency and access times and reduce the overall responsiveness of the caching server, thus offsetting the performance gains that are realized from avoiding caching long-tail content.
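For purposes of illustration, a multi-hit caching technique of the kind criticized above, in which request counts are stored against full URL strings, may be sketched as follows. The class name NaiveMultiHitCache, the hits_to_cache parameter, and the tracked_url_bytes helper are hypothetical and are introduced only to make the memory overhead visible.

```python
from typing import Callable, Dict


class NaiveMultiHitCache:
    """Illustrative multi-hit cache that tracks request counts by full URL."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 hits_to_cache: int = 2) -> None:
        self._fetch = fetch_from_origin       # hypothetical origin retrieval callback
        self._threshold = hits_to_cache       # requests needed before caching
        self._store: Dict[str, bytes] = {}    # stands in for the local cache storage
        self._counts: Dict[str, int] = {}     # request counts keyed by URL text
        self.cache_writes = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            return self._store[url]           # already cached: serve locally
        body = self._fetch(url)
        count = self._counts.get(url, 0) + 1
        if count >= self._threshold:
            self._store[url] = body           # threshold reached: cache now
            self._counts.pop(url, None)       # the count is no longer needed
            self.cache_writes += 1
        else:
            self._counts[url] = count         # remember the URL and its count
        return body

    def tracked_url_bytes(self) -> int:
        """Bytes of URL text held only to count requests for uncached content."""
        return sum(len(u.encode("utf-8")) for u in self._counts)
```

In this sketch, content requested only once is never written to cache, but its full URL remains in the count table, so the tracking memory grows with the number and length of distinct long-tail URLs.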
Moreover, some second hit caching or multi-hit caching techniques impose added cost in the form of infrastructure modifications and additions that are needed to maintain content request counts and to track where content is cached. For example, United States Patent Publication 2010/0332595 entitled “Handling Long-Tail Content in a Content Delivery Network (CDN)” introduces a new server, referred to as a popularity server, into existing infrastructure to track the number of times content is requested. In addition to the added costs for deploying and maintaining the popularity server, the centralized framework also introduces performance-reducing delay as a result of the communication that occurs between the caching servers and the popularity server.
Accordingly, there is a need to improve CDN performance without increased cost. One specific area of need is to improve cache performance without increasing cost and without offsetting other areas of performance. Specifically, there is a need for an optimized multi-hit caching technique that avoids the impact that long-tail content has on cache performance while still achieving performance similar to that of first hit caching in terms of identifying what content to cache and identifying whether content is cached.