1. Statement of the Technical Field
The present invention relates to request routing in a content delivery network and, more particularly, to the autonomic selection of a routing policy based upon the predicted cache effectiveness of the selected routing policy.
2. Description of the Related Art
In the prototypical content delivery system, content can be delivered from an origin server to a community of content consuming clients. Content typically can be delivered according to a request-response paradigm in which the content consuming clients initiate a request for content to which one or more origin servers can respond with the requested content. Generally, one or more content caches can be disposed in the intermediate communications path between the content consuming clients and content servers in order to enhance the responsiveness of the servers to any single client request and to reduce the processing burden placed upon the origin server.
A variety of mechanisms route content request streams through intermediate caches. For instance, content clients may be configured to use a particular cache. Similarly, the content delivery system itself may redirect requests by interposing on DNS translations or by intercepting requests at the IP level. In addition, each cache may control the routing of its own miss stream to other components. The last two years have seen an explosion of growth in content caching and content delivery infrastructure. Key developments include the increased role of surrogate caching among hosting providers and the aggregation of content consumers into large Internet service providers employing transparent interception proxies based upon Layer 7 switches. These developments have fed the growth in demand for Web caching systems.
Today, server farms host many content sites, where a group of servers can be clustered together to act as a unified server to external clients. Any given request could be handled by any of several servers, thereby improving scalability and fault-tolerance. The switching infrastructure connecting the servers to the hosting network generally includes one or more redirecting server switches to route incoming request traffic to the servers. Referred to in the art as request distributors, these switches select individual servers to handle each incoming content request. Thus, the server selection policy can play an important role in managing cluster resources in order to maximize throughput and meet quality-of-service goals.
Conventional server switches often incorporate a variety of request routing methodologies when distributing requests to backend server processes. In particular, the server selection methodologies can be selected in order to maximize throughput and minimize response latency. For instance, server load balancing oriented methodologies monitor server status and direct requests to lightly loaded servers. Notably, server load balancing switches often are referred to as Layer 4 switches because server load balancing switches make server selection decisions at connection setup time, and examine only the Layer 4 transport headers of the incoming packet stream.
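By way of illustration only, the load balancing oriented selection described above can be sketched as follows. The function names and the use of an active connection count as the measure of server load are assumptions made for this sketch and are not drawn from any particular switch. The selection consults only per-server load and never inspects the content of the request, which corresponds to a decision made at connection setup time.

```python
def pick_least_loaded(active_connections: dict[str, int]) -> str:
    """Return the server that currently has the fewest active connections."""
    # Ties resolve to the first such server in dictionary order.
    return min(active_connections, key=active_connections.get)


def open_connection(active_connections: dict[str, int]) -> str:
    """Assign a new connection to the least loaded server and record it."""
    server = pick_least_loaded(active_connections)
    active_connections[server] += 1
    return server


if __name__ == "__main__":
    # Hypothetical load table: server name -> active connection count.
    load = {"server-a": 3, "server-b": 1, "server-c": 2}
    print(open_connection(load), load)
```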
Content-aware server selection policies, by comparison, prefer servers that can handle a given request most efficiently. Importantly, the most efficient request handling servers incorporate caching technology and, accordingly, the server most likely to be able to process a request most effectively is the server likely to have the requested data in cache. Uniform Resource Locator (URL) hashing is a content-based policy that applies a simple deterministic hash function upon the request URL to select a server. URL hashing has often been referred to as a Layer 7 policy because the URL hashing switch typically parses protocol headers at Layer 7 in order to extract the respective URL.
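A minimal sketch of URL hashing follows. The choice of SHA-256 as the hash function, the server names, and the function name select_server are assumptions made for illustration; any simple deterministic hash would serve the same purpose. Because the mapping depends only on the URL, repeat requests for the same object always reach the same server.

```python
import hashlib


def select_server(url: str, servers: list[str]) -> str:
    """Map a request URL to one server using a deterministic hash."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an integer bucket index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    # Hypothetical cluster of three backend servers.
    cluster = ["server-a", "server-b", "server-c"]
    for url in ("/index.html", "/images/logo.png", "/index.html"):
        print(url, "->", select_server(url, cluster))
```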
Observations of content request patterns drive the design choices and policies for all of these components of a content delivery architecture. In particular, a number of studies indicate that requests to retrieve static Web objects follow a Zipf-like popularity distribution. Specifically, in accordance with Zipf, the probability p_i of a request for the i-th most popular document is proportional to 1/i^α for some parameter α. In this Zipf-like distribution, a large number of object requests typically target the most popular object sources and the most popular objects within those sources. The Zipf-like distribution, however, also includes a long, heavy tail of less popular objects with poor reference locality. Notably, higher α values increase the concentration of requests on the most popular objects. One implication of the Zipf-like behavior of the Web is that caching is highly effective for the most popular static, and thus cacheable, objects, assuming that popularity dominates rate of change. Unfortunately, caching is less effective with respect to the heavy tail of the distribution, which comprises a significant fraction of requests. Hence, Web cache effectiveness typically improves only logarithmically with the size of the cache, measured either by capacity or by user population.
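The relationship between a Zipf-like distribution and cache effectiveness can be sketched numerically. The example below assumes an idealized cache that holds exactly the most popular objects, a finite catalog of 10,000 objects, and α equal to 1; these values and the function names are chosen only for illustration. Under these assumptions, each tenfold increase in cache size adds a roughly constant increment to the hit ratio, which is the logarithmic improvement noted above.

```python
def zipf_probabilities(num_objects: int, alpha: float) -> list[float]:
    """Return p_i proportional to 1 / i**alpha for ranks i = 1..num_objects."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ideal_hit_ratio(num_objects: int, alpha: float, cache_size: int) -> float:
    """Hit ratio of an idealized cache holding the cache_size most popular objects."""
    probabilities = zipf_probabilities(num_objects, alpha)
    return sum(probabilities[:cache_size])


if __name__ == "__main__":
    # Illustrative parameters only: 10,000 objects and alpha = 1.
    for size in (10, 100, 1000):
        print(size, round(ideal_hit_ratio(10_000, 1.0, size), 3))
```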
Zipf-like behavior also has implications for selecting a request routing policy in a server cluster. For example, the Zipf-like behavior of the Web creates a tension between the competing goals of load balancing and locality. On the one hand, content-aware policies such as Layer 7 URL hashing effectively take advantage of the locality present in the request stream by preferring the same server for repeat requests, maximizing server memory hits for popular objects. However, Layer 7 URL hashing remains vulnerable to load imbalances because the most popular objects receive the largest number of requests, and a single server handles all requests for any given object. On the other hand, Layer 4 type server load balancing policies balance load, but they tend to scatter requests for each object across the servers, reducing server memory hits in the cache for moderately popular objects.
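The tension between locality and load balance can be observed in a small simulation. The sketch below makes several simplifying assumptions that are not part of the foregoing description: the object identifier modulo the number of servers stands in for a URL hash, round-robin assignment stands in for a Layer 4 load balancing policy, each server has an unbounded cache, and requests are drawn from a Zipf-like distribution with a fixed random seed. All parameter values are illustrative.

```python
import random
from collections import Counter


def simulate(num_servers=4, num_objects=1000, num_requests=20000,
             alpha=1.0, seed=0):
    """Compare hit ratio and peak load share for two routing policies."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)]
    requests = rng.choices(range(num_objects), weights=weights, k=num_requests)

    results = {}
    for policy in ("url_hash", "round_robin"):
        load = Counter()   # requests handled per server
        cached = set()     # (server, object) pairs already held in memory
        hits = 0
        for n, obj in enumerate(requests):
            if policy == "url_hash":
                server = obj % num_servers   # stand-in for hashing the URL
            else:
                server = n % num_servers     # content-blind rotation
            load[server] += 1
            if (server, obj) in cached:
                hits += 1
            else:
                cached.add((server, obj))
        results[policy] = {
            "hit_ratio": hits / num_requests,
            "max_load_share": max(load.values()) / num_requests,
        }
    return results


if __name__ == "__main__":
    for policy, stats in simulate().items():
        print(policy, stats)
```

In this sketch the hashing policy achieves the higher hit ratio, while the round-robin policy spreads requests evenly and the hashing policy concentrates a disproportionate share of requests on the server that owns the most popular object.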
Recent research has studied this tradeoff in depth, and has resulted in the development of the Locality Aware Request Distribution policy and related policies to balance these competing goals, combining the benefits of each approach. Other commercial request distributors use less sophisticated strategies such as assigning multiple servers to each URL hash bucket, and selecting from the target set using load information. In either case, however, the skilled artisan will recognize the importance of selecting a suitable routing policy at design time. Accordingly, the conventional selection of a particular routing policy often can depend upon the goals of the systems architect when the system is configured. Predicting the actual requirements of the system at design time, however, can be difficult for most. Moreover, whereas optimally selecting a suitable request routing policy can be problematic generally, in an autonomic system, the problem can be particularly acute.
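The simpler strategy of assigning multiple servers to each URL hash bucket and selecting among them by load can be sketched as follows. The replica count of two, the use of consecutive servers as the target set, the SHA-256 hash, and the request counter used as load information are all assumptions made for this illustration; the sketch is not the Locality Aware Request Distribution policy itself.

```python
import hashlib


def candidate_servers(url: str, servers: list[str], replicas: int = 2) -> list[str]:
    """Return the target set of servers assigned to the URL's hash bucket."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(servers)
    # The bucket maps to `replicas` consecutive servers, wrapping around.
    return [servers[(start + k) % len(servers)] for k in range(replicas)]


def route(url: str, servers: list[str], load: dict[str, int],
          replicas: int = 2) -> str:
    """Route to the least loaded server within the URL's target set."""
    targets = candidate_servers(url, servers, replicas)
    chosen = min(targets, key=lambda server: load[server])
    load[chosen] += 1
    return chosen


if __name__ == "__main__":
    # Hypothetical four-server cluster with no load at the start.
    cluster = ["server-a", "server-b", "server-c", "server-d"]
    load = {server: 0 for server in cluster}
    for _ in range(6):
        route("/popular.html", cluster, load)
    print(load)
```

Requests for a single popular object are thereby confined to a small target set, which preserves most of the locality, while the load information keeps any one server in the set from absorbing all of the requests.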
For the uninitiated, autonomic computing systems self-regulate, self-repair and respond to changing conditions, without requiring any conscious effort on the part of the computing system operator. To that end, the computing system itself can bear the responsibility of coping with its own complexity. The crux of autonomic computing relates to eight principal characteristics:
    I. The system must “know itself” and include those system components which also possess a system identity.
    II. The system must be able to configure and reconfigure itself under varying and unpredictable conditions.
    III. The system must never settle for the status quo and the system must always look for ways to optimize its workings.
    IV. The system must be self-healing and capable of recovering from routine and extraordinary events that might cause some of its parts to malfunction.
    V. The system must be an expert in self-protection.
    VI. The system must know its environment and the context surrounding its activity, and act accordingly.
    VII. The system must adhere to open standards.
    VIII. The system must anticipate the optimized resources needed while keeping its complexity hidden from the user.
Thus, in keeping with the principles of autonomic computing, request routing methodologies ought to change as the impact of selecting any one methodology over the other becomes more advantageous for the operation of the system.