The present invention relates to selecting a server based on multiple metrics. More particularly, the present invention provides a method and system for using metrics and significance windows for locating an optimal content server. The frame of reference for the present invention is a network node where multiple servers for the same content can be identified. This network node may be a system collocated with a domain name server identifying multiple IP addresses for a given domain name.
A network user can often retrieve identical content from a number of mirror sites. Content is often distributed onto mirroring sites across a network such as the Internet in order to give users optimal access to the information. A number of techniques have been used to select a server that can best provide the content to the user.
One method gives much of the discretion to the user. Users wishing to download a piece of software may be directed to select a server location closest to their own location. Several problems arise out of this technique. Geographic proximity may not be equivalent to network proximity. Two cities may neighbor each other geographically, but the cities may not have a direct network connection. Network traffic may flow through a geographically distant third city. Alternatively, the network lines connecting the two neighboring cities may have limited bandwidth, while the lines connected to the distant third city may have bandwidth to spare.
This method does not take into account server loads. A nearby, but overloaded, server may not be able to provide content as quickly as a distant server with minimal load. This method also does not consider dynamic network traffic loads. One route may be optimal at a particular time of day, while suboptimal at other times.
Some techniques for identifying a best server employ systems at a primary domain name server. When a user requests particular content, these systems interact with the primary domain name server to identify a content server. The system identifies the best content server and then returns the IP address of that server as part of its DNS reply. These systems may select a server from a list of identical servers randomly. This works to prevent any particular content server from carrying a disproportionate load of network traffic. Such a system may balance server load across a list of content servers, but it does not ensure that the user receives service from a content server that would provide the optimal response. In fact, it does not even prevent the user from receiving service from a content server that would provide the slowest response. Such a system may in fact worsen loads at already congested content servers. The random number generator for the system may exhibit non-random characteristics, and still direct a disproportionate amount of traffic towards particular content servers.
Other systems have incorporated metrics into their determination of what constitutes an optimal server for providing content to a particular user. These metrics include round trip time, server load, drop rate, available bandwidth, administrative distance, number of hops, and whether or not a server is in a particular subnetwork. The round trip time refers to the total time a packet takes to reach its destination and return to its original source location. Server load corresponds with resources such as processing and memory capacity available on a content server. Drop rate gives the percentage of packets dropped during a single round trip. Administrative distance incorporates the idea that it may be less expensive to travel across network nodes that are owned by the user's service provider than to travel across third party owned ones.
Some of these systems rely on one particular metric to determine an optimal server. However, no single metric is the key to determining what server can best provide content to a particular user. For example, the round trip time may seem to be key to providing the optimal server, but when the load for the server with the fastest round trip time is near capacity, the server's response time may be equal to or greater than the network transmission time. Once a server reaches capacity, the performance drop off can be significant. Alternatively, TCP/IP does not function well if the drop rate is above 10%. The round trip time for a single packet may be fast when the drop rate is 15%, but the need to wait for the retransmission of lost packets results in data transmission at a much slower effective rate. From a different vantage point, a particular server may not have the best value for any particular metric, but it still may be the best server when all metrics are considered.
Combinations of metrics have been pursued as a possible solution to finding an optimal server. Several metrics can be combined into a single metric by assigning weights to each measure. The sum of these metrics multiplied by a weighting factor can yield a composite metric. The composite metrics of the servers can then be sorted to identify an optimal server. However, assigning weighting factors to these metrics is at best an inexact science. It can be extremely difficult to assign weights to particular metrics, particularly when the weighting of the metric should not necessarily vary linearly.
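By way of illustration only, the weighted composite metric described above can be sketched as follows. The server names, metric names, values, and weights are hypothetical and are not taken from any particular system, and lower values are assumed to be better for every metric. The sketch shows how fixed linear weights can rank a nearly saturated server first.

```python
from typing import Dict

# Hypothetical measurements: round trip time in ms, load as percent of capacity.
servers: Dict[str, Dict[str, float]] = {
    "near": {"rtt_ms": 20.0, "load_pct": 95.0},
    "far": {"rtt_ms": 80.0, "load_pct": 20.0},
}

# Hypothetical weighting factors chosen by an administrator.
weights: Dict[str, float] = {"rtt_ms": 1.0, "load_pct": 0.5}


def composite_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of each metric multiplied by its weighting factor (lower is better)."""
    return sum(weights[name] * metrics[name] for name in weights)


# Sort the servers by composite metric, best (lowest) first.
ranked = sorted(servers, key=lambda name: composite_score(servers[name], weights))
```

With these assumed numbers, the server "near" scores 67.5 and the server "far" scores 90.0, so "near" is ranked first even though it is at 95% of capacity. A single fixed weight on server load cannot express that load matters little when low and a great deal when high.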
For example, response time as a function of server load can vary exponentially. When the server has processing and memory resources to spare, server load should only be a small part in the determination of what the optimal server should be. The round trip time should be a more important consideration when server load is low. However, when a server is running low on resources, server load should be a much more important consideration. Unfortunately, this phenomenon cannot easily be expressed as a linear equation. A higher order equation is required. Determining what order equation this composite metric should be based on and identifying the appropriate multipliers is an extremely difficult determination, especially when considering the variability of metrics. One set of multipliers may be appropriate for certain types of data, while others may be more appropriate for a particular set of content servers. Determining multipliers for higher order composite metric equations is not a simple proposition.
Another technique for increasing the efficiency with which data requests are serviced is described in commonly assigned, copending U.S. patent application Ser. No. 09/606,418 for WIDE AREA LOAD BALANCING OF WEB TRAFFIC filed Jun. 28, 2000, the entirety of which is incorporated herein by reference for all purposes. The copending application describes the Boomerang process in which, in a specific embodiment, each site with an IP address corresponding to a domain name is requested to respond to a Domain Name System (DNS) query. The first server to complete a response through the network lines is deemed to be the optimal server. The Boomerang process identifies the server with the lowest network delay between itself and the DNS server at the moment of transmission.
This presents an accurate depiction of the round trip time at the moment of transmission. However, the same disadvantages for using only one metric to determine an optimal content server may apply here as well. The fact that a server may be close to its maximum load is ignored. Packet drop rate and administrative costs are also not directly considered. The content server that responds to the DNS query first is selected, regardless of how likely a series of packets would be able to arrive at the DNS server through the service provider's own network nodes.
Each of the currently available techniques for selecting a server to provide content to a user at the best possible rate has its own disadvantages with regard to at least some of the desirable characteristics of server selection systems. It is therefore desirable to provide a system for selecting a content server that exhibits desirable characteristics as well as or better than the technologies discussed above.
According to the present invention, a method and apparatus are provided to select from a group of servers a particular server that can provide content to a client in an optimal manner.
Certain network nodes contain the addresses of a group of servers that can provide the same content. In one embodiment, such a network node is a DNS server. When a client submits a domain name to the DNS server, the present invention selects the server corresponding to the domain name that can best provide content to the client. The server selection system of the present invention may be integrated into a DNS server, or may be collocated with a DNS server. The server selection system identifies a set of available content servers distinct from the name server and obtains performance metrics for each of these servers. These performance metrics may be obtained in a variety of ways, and can be determined dynamically or referenced from memory or external storage. These performance metrics include round trip time, server load, drop rate, administrative cost, as well as any other system administrator defined measurements.
In one embodiment of the present invention, as soon as the server selection system identifies values for these metrics, it can optionally remove a particular server when there exists another server that is better with respect to every metric. Note that the terms best, better, optimal, and worst are subjective and can be defined on a per-metric basis.
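By way of illustration only, the optional removal of a server that is worse than some other server with respect to every metric can be sketched as follows. The function name, server names, and metric values are hypothetical, and the sketch assumes that a lower value is better for every metric, although "better" can be defined differently on a per-metric basis.

```python
from typing import Dict

Metrics = Dict[str, float]


def remove_dominated(servers: Dict[str, Metrics]) -> Dict[str, Metrics]:
    """Drop each server for which another server is strictly better on every metric."""

    def is_dominated(name: str) -> bool:
        mine = servers[name]
        return any(
            all(servers[other][metric] < mine[metric] for metric in mine)
            for other in servers
            if other != name
        )

    return {name: m for name, m in servers.items() if not is_dominated(name)}


# Hypothetical measurements (lower is better for both metrics).
candidates = {
    "a": {"rtt_ms": 40.0, "load": 0.90},
    "b": {"rtt_ms": 45.0, "load": 0.30},
    "e": {"rtt_ms": 60.0, "load": 0.50},
}
survivors = remove_dominated(candidates)
```

Here the server "e" is removed because "b" has both a lower round trip time and a lower load. Servers "a" and "b" both survive, since each is better than the other on one metric.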
The server selection system also allows the priorities of given metrics to be defined. A system or network administrator can rank the metrics in order of importance. According to specific embodiments, significance windows of varying sizes can be assigned to each metric. The servers can be first analyzed based on the most important metric and its significance window.
The servers that fall outside of this significance window can be eliminated from consideration. The remaining servers are then analyzed based on a second metric. The optimal server based on this metric as well as servers falling within the significance window remain available servers. The others can be dropped. The process can be repeated for each of the remaining metrics until either no metrics remain, or only one server remains. The address of the selected server is returned to the user as the optimal content server.
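By way of illustration only, the ranked-metric filtering described above can be sketched as follows. The sketch assumes that each significance window is an absolute tolerance measured from the best value currently observed for that metric, and that lower values are better. Other definitions of the window are possible. The function name, server names, metric names, and values are hypothetical.

```python
from typing import Dict, List, Tuple

Metrics = Dict[str, float]


def select_server(servers: Dict[str, Metrics],
                  priorities: List[Tuple[str, float]]) -> str:
    """Filter servers metric by metric, most important metric first.

    priorities holds (metric_name, window) pairs in order of importance.
    """
    if not servers or not priorities:
        raise ValueError("need at least one server and one ranked metric")
    candidates = list(servers)
    last_metric = priorities[0][0]
    for metric, window in priorities:
        if len(candidates) == 1:
            break  # only one server remains, so no further metrics are needed
        best = min(servers[name][metric] for name in candidates)
        # Keep the best server and every server inside the significance window.
        candidates = [n for n in candidates if servers[n][metric] <= best + window]
        last_metric = metric
    # If several servers remain, pick the best on the last metric applied.
    return min(candidates, key=lambda n: servers[n][last_metric])


# Hypothetical measurements (lower is better for every metric).
servers = {
    "a": {"rtt_ms": 40, "load": 0.90, "drop": 0.01},
    "b": {"rtt_ms": 45, "load": 0.30, "drop": 0.02},
    "c": {"rtt_ms": 48, "load": 0.35, "drop": 0.00},
    "d": {"rtt_ms": 120, "load": 0.10, "drop": 0.00},
}
priorities = [("rtt_ms", 10), ("load", 0.1), ("drop", 0.0)]
chosen = select_server(servers, priorities)
```

With these assumed values, the round trip time window removes "d", the load window removes "a", and the drop rate window removes "b", leaving "c" as the selected server. Server "c" is not the best on round trip time or load, which is the case the windows are meant to handle.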
One aspect of the invention provides a method for selecting a server from a group of servers to provide content to a client in an optimal manner. The method may be characterized by the following sequence: (1) for each server in the group of servers, obtaining values of at least a first performance metric and a second performance metric, wherein the performance metrics correspond with the characteristics of transmission between the servers and network nodes associated with the client; (2) applying a first significance window defining a range of acceptable values of the first performance metric and removing from consideration those servers in the group having poor values of the first performance metric, which fall outside the significance window; (3) applying a second significance window to the servers remaining after applying the first significance window, the second significance window defining a range of acceptable values of the second performance metric; and (4) identifying the best server from among any remaining servers lying within the second significance window.
Multiple significance windows can be applied. Each metric can have a significance window associated with it. If multiple servers remain after the significance windows have filtered out servers with low performance metric values, an optimal server can be selected from the group of remaining servers. One method of selecting this remaining server may be to simply select the server with the optimal final metric value.
Another aspect of the invention provides an apparatus for selecting a server from a group of servers to provide content to a client in an optimal manner. The apparatus may be characterized by the following features: (1) memory configured to store, at least temporarily, values of at least a first performance metric and a second performance metric, wherein the metrics are associated with the characteristics of transmission between the servers and network nodes associated with the client; and (2) one or more processors coupled with the memory, wherein the processors are configured or designed to apply a second significance window to a list of servers falling inside a first significance window and to identify the best server from among any remaining servers lying within the second significance window, wherein the first significance window is defined by a range of acceptable values of the first performance metric for removing from consideration those servers which fall outside the first significance window and the second significance window is defined by a range of acceptable values of a second performance metric for removing from consideration those servers which fall outside the second significance window.
Another aspect of the invention pertains to computer program products including a machine readable medium on which is stored program instructions, tables or lists, and/or data structures for implementing a method as described above. Any of the methods, tables, or data structures of this invention may be represented as program instructions that can be provided on such computer readable media.
A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and the drawings.