1. Technical Field
The present invention relates generally to techniques for reducing traffic to origin servers for objects that are large and very popular, or that may become flash-popular.
2. Description of the Related Art
It is well known to deliver digital content (e.g., HTTP content, streaming media and applications) using an Internet content delivery network (CDN). A CDN is a network of geographically distributed content delivery nodes that are arranged for efficient delivery of content on behalf of third-party content providers. Typically, a CDN is implemented as a combination of a content delivery infrastructure, a request-routing mechanism, and a distribution infrastructure. The content delivery infrastructure usually comprises a set of “surrogate” origin servers that are located at strategic locations (e.g., Internet network access points, Internet Points of Presence, and the like) for delivering content to requesting end users. The request-routing mechanism allocates servers in the content delivery infrastructure to requesting clients in a way that, for web content delivery, minimizes a given client's response time and, for streaming media delivery, provides for the highest quality. The distribution infrastructure consists of on-demand or push-based mechanisms that move content from the origin server to the surrogates. An effective CDN serves frequently accessed content from a surrogate that is optimal for a given requesting client. In a typical CDN, a single service provider operates the request-routers, the surrogates, and the content distributors. In addition, that service provider establishes business relationships with content publishers and acts on behalf of their origin server sites to provide a distributed delivery system. A well-known commercial CDN service that delivers web content and media streaming is offered by Akamai Technologies, Inc. of Cambridge, Mass.
It is desirable to reduce wide-area network bandwidth and the load on a content provider's origin server as much as possible. To this end, the prior art has proposed a hierarchical proxy cache architecture wherein caches resolve misses through other caches higher in a hierarchy. This architecture is described, for example, in a paper titled “A Hierarchical Internet Object Cache,” by Danzig et al., 1996 USENIX Technical Conference, Jan. 22-26, 1996, San Diego, Calif. In this approach, each cache in the hierarchy uses a resolution protocol to decide independently whether to fetch a requested reference from the object's home site or from its parent or sibling caches. According to the protocol, if the URL identifying the reference contains any of a configurable list of substrings, then the object is fetched directly from the object's home, rather than through the cache hierarchy. This feature is used to force the cache to resolve non-cacheable URLs and local URLs directly from the object's home. If the URL's domain name matches a configurable list of substrings, then the object is resolved through the particular parent bound to that domain. Otherwise, when a cache receives a request for a URL that misses, the cache performs a remote call to all of its siblings and parents, checking whether the URL hits any sibling or parent. The cache then retrieves the object from the site with the lowest measured latency.
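By way of illustration only, the three-step resolution protocol described above can be sketched as follows. This is a simplified model and not the implementation from the cited paper: the names Neighbor, ResolverConfig and resolve are hypothetical, latencies are supplied as fixed numbers rather than measured over the network, and treating the object's home site as one of the latency candidates in the final step is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set
from urllib.parse import urlsplit


@dataclass
class Neighbor:
    """A sibling or parent cache (hypothetical model, not a real API)."""
    name: str
    latency_ms: float                      # assumed already-measured latency
    cached_urls: Set[str] = field(default_factory=set)

    def has(self, url: str) -> bool:
        # Stands in for the remote "does this URL hit?" call.
        return url in self.cached_urls


@dataclass
class ResolverConfig:
    direct_substrings: List[str]           # URLs containing these bypass the hierarchy
    domain_parents: Dict[str, str]         # domain substring -> name of bound parent
    neighbors: List[Neighbor]              # all siblings and parents of this cache


def resolve(url: str, cfg: ResolverConfig, home_latency_ms: float) -> str:
    """Return the name of the site a missing URL should be fetched from."""
    # Step 1: a configured substring in the URL forces a direct fetch from home.
    if any(s in url for s in cfg.direct_substrings):
        return "home"

    # Step 2: a matching domain is resolved through the parent bound to it.
    host = urlsplit(url).hostname or ""
    for substring, parent in cfg.domain_parents.items():
        if substring in host:
            return parent

    # Step 3: query every sibling and parent, then pick the lowest latency
    # among the neighbors that hit. Including the home site as a candidate
    # is an assumption of this sketch.
    candidates = [("home", home_latency_ms)]
    candidates += [(n.name, n.latency_ms) for n in cfg.neighbors if n.has(url)]
    return min(candidates, key=lambda c: c[1])[0]
```

Note that the third step contacts every sibling and parent on each miss, so the number of queries per miss grows with the number of neighbors configured for the cache.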
While the cache hierarchy described in the above-identified publication provides benefits in the form of reduced access latency to the home site, the scheme is overly complex and costly (in terms of network bandwidth) because, upon each cache miss, it requires measuring latency between the cache and all of its siblings and parents.