1. Technical Field
The present invention relates generally to content delivery in distributed networks.
2. Brief Description of the Related Art
Distributed computer systems are well known in the prior art. One such distributed computer system is a “content delivery network” or “CDN” that is operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third parties. A “distributed system” of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure. Typically, “content delivery” means the storage, caching, or transmission of content, streaming media and applications on behalf of content providers, together with ancillary technologies used therewith, including, without limitation, DNS query handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence. The term “outsourced site infrastructure” means the distributed systems and associated technologies that enable an entity to operate and/or manage a third party's web site infrastructure, in whole or in part, on the third party's behalf.
Cold content is any web site content that is requested infrequently by end users and, as a result, is unlikely to stay in a CDN cache long enough to improve origin off-load. Typically, a CDN content server removes content from its cache according to some type of caching rule, such as least-recently-used (LRU) eviction. This means that infrequently requested objects are generally the first ones removed from a CDN server cache to make room for new content. Unfortunately, in the case of CDN customers with large quantities of cold content, it is likely that one cold object will simply be replaced with another equally cold object. Social networking and auction sites are particularly likely to have this problem because they have vast amounts of content, but only a very small subset of that content is of interest to a broad range of users. This is sometimes called the “long-tail” problem.
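By way of illustration only, the eviction behavior described above can be sketched with a small simulation. This is a minimal sketch under stated assumptions, not a description of any actual CDN server: the class `LRUCache`, the function `simulate_cold_catalog`, and the parameter values (1,000 objects, a cache holding 100 of them, three passes) are hypothetical and chosen only to show how a catalog of equally cold objects can defeat an LRU cache.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache that counts hits and misses (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def request(self, key):
        """Serve one request; return True on a cache hit, False on a miss."""
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1  # a miss implies a fetch from the origin server
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used object
        return False


def simulate_cold_catalog(num_objects=1000, capacity=100, passes=3):
    """Request every object once per pass, so each object is equally cold.

    All parameter values are hypothetical and for illustration only.
    """
    cache = LRUCache(capacity)
    for _ in range(passes):
        for obj in range(num_objects):
            cache.request(obj)
    return cache.hits, cache.misses


if __name__ == "__main__":
    hits, misses = simulate_cold_catalog()
    print(f"hits={hits} misses={misses}")
```

Under these assumptions every object is evicted before it is requested again, so each cold object merely displaces another equally cold object and the cache provides no origin off-load.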
For example, consider a site with one terabyte of content. Of this content, assume that 500 megabytes (0.5 gigabytes) is requested more than once per week on any given CDN edge server. The remaining 999.5 gigabytes, however, is requested at most once per week. This large amount of content (in this example, 999.5 gigabytes) is the so-called “long tail.” It is content that is “cold.” Of course, the numbers given above are merely illustrative. A long-tail situation may be deemed to exist with respect to a particular site for which any given percentage (e.g., 90% or more) of the content will rarely be requested.
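The arithmetic in this example can be checked with a short calculation. The sketch below assumes decimal units (one terabyte taken as 1,000 gigabytes, 500 megabytes as 0.5 gigabytes); the function names `cold_fraction` and `is_long_tail` and the default 90% threshold are hypothetical, the threshold being taken from the illustrative percentage given above.

```python
def cold_fraction(total_gb, hot_gb):
    """Fraction of a site's content that is cold, given total and hot sizes in GB."""
    return (total_gb - hot_gb) / total_gb


def is_long_tail(total_gb, hot_gb, threshold=0.90):
    """True if at least `threshold` of the content is rarely requested.

    The 0.90 default mirrors the illustrative "90% or more" figure; it is an
    assumption for this sketch, not a fixed definition.
    """
    return cold_fraction(total_gb, hot_gb) >= threshold


if __name__ == "__main__":
    total_gb = 1000.0  # one terabyte, assuming decimal units
    hot_gb = 0.5       # 500 megabytes
    print(f"cold content: {total_gb - hot_gb} GB")
    print(f"cold fraction: {cold_fraction(total_gb, hot_gb):.2%}")
    print(f"long tail: {is_long_tail(total_gb, hot_gb)}")
```

With these figures, 999.5 of 1,000 gigabytes (about 99.95% of the site) is cold, which is well above the illustrative 90% threshold.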
Cold content in general, and long tails in particular, present some special challenges for a CDN service provider, including low origin server off-load (low cache hit rates) due to the content being evicted from cache before it can be requested again, cache contention and the potential to monopolize the cache to the detriment of other CDN customers, and sensitivity to load spikes that can occur with purges or CDN server region outages.