The present invention is directed to communications networking. It particularly concerns domain-name servers' network-address choices.
For internetwork communications, network nodes typically transmit information in packets. The packet format is specified in part by an applicable network-level protocol, and that format includes a network-address field that identifies a node interface to which the packet is directed. A protocol typically used for this purpose is the Internet Protocol ("IP"), which is defined in the Internet community's Request for Comments ("RFC") 791. The network address is a four-byte (or, in Internet parlance, four-"octet") address. It is typically rendered in the literature as the four bytes' respective decimal representations separated by periods, e.g. "142.75.229.5."
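By way of illustration only, the dotted rendering described above may be sketched as follows. The function name to_dotted_decimal is a hypothetical helper introduced for this example; the cross-check relies on the standard ipaddress module.

```python
import ipaddress


def to_dotted_decimal(octets: bytes) -> str:
    """Render a four-octet network address as period-separated decimals.

    Hypothetical helper for illustration; not part of any protocol library.
    """
    if len(octets) != 4:
        raise ValueError("a four-octet address is expected")
    # Each byte is written as its own decimal representation.
    return ".".join(str(b) for b in octets)


raw = bytes([142, 75, 229, 5])
dotted = to_dotted_decimal(raw)

# Cross-check against the standard library's rendering of the same bytes.
assert dotted == str(ipaddress.IPv4Address(raw))
```

The example address is the one used in the text; the length check simply rejects anything that is not exactly four octets.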
But human beings do not ordinarily employ such addresses directly. Instead, they employ host names, which are more often composed of letters, e.g. "hostdomain.com," and are therefore more easily remembered. Once the user gives the host name to his node system, however, that system must translate the name into a network address to perform the actual communication. In the usual case, the user's node does not store the necessary translation information locally, so it communicates, typically by means of the DNS (Domain Name System) protocol described in RFCs 1034 and 1035, with a remote domain-name server that has the necessary translation information for the domain involved. When the remote domain-name server receives the DNS request, it looks the submitted domain name up in a table that associates network addresses with host names, and it sends the requested address back to the requesting node.
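The table lookup just described may be sketched, under simplifying assumptions, as an in-memory mapping from host names to network addresses. The table contents, the name HOST_TABLE, and the function lookup_addresses are hypothetical and serve only to illustrate the association; an actual name server also parses and answers DNS messages, which this sketch omits.

```python
from typing import Dict, List

# Hypothetical translation table. The pairing of this name with this
# address is illustrative only and is not taken from any real zone.
HOST_TABLE: Dict[str, List[str]] = {
    "hostdomain.com": ["142.75.229.5"],
}


def lookup_addresses(host_name: str) -> List[str]:
    """Return the network addresses the table associates with host_name.

    Domain names are compared without regard to letter case, and a
    trailing root dot is ignored, so the name is normalized first.
    An empty list means the table holds no entry for the name.
    """
    normalized = host_name.rstrip(".").lower()
    return list(HOST_TABLE.get(normalized, []))


addresses = lookup_addresses("HostDomain.com.")
```

The function returns a list rather than a single address so that the same sketch accommodates a name associated with more than one address.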
In most cases, such as in most e-mail operations, there is a one-to-one relationship between a host name and an internet address. But this is not always so. Sometimes the host actually has interfaces on more than one network, and those interfaces would necessarily have different network addresses. Conversely, heavily used "web" sites may actually be implemented more or less identically in several web servers, which should all be identified by the same name (even if the network nodes that embody the servers additionally have different names associated with other services). The present discussion is directed to the latter situation.
In some cases, such commonly named servers are dispersed geographically with the intention of reducing communications costs by having the web clients directed to the nearest application server. In those cases, the name server may select the application server that appears closest to the source of the DNS request, and it gives the selected site's network address as the response to the DNS request. But multiple servers may be required even if there is no such geographical imperative. If a "site" is heavily used, for example, a single host may not be able to handle the load adequately, so multiple hosts would be desirable even if they are not located at any great distance from each other. In such cases, the name server divides the traffic up among the web servers in accordance with a policy intended to be "fair" in some sense.
To this end, a good if not completely accurate assumption is that client-server traffic occurs in transactions. That is, a client that has made a request of the server is assumed to make no further requests until it receives the server's response. The response time, or "latency," between the request and the response is not in general the same for different servers, and one approach to fairness is for the name server to direct the client to the web server whose latency is lowest. To determine that latency, the name server may send each of the commonly named application servers an IP datagram containing an Internet Control Message Protocol ("ICMP") message, described in RFC 792, of the so-called "ping" type. This type of message merely requests that the recipient return a corresponding response to show that it is functioning. The time that elapses between the ping transmission and the resultant response is a rough latency measurement, and the address-requesting client is directed to the web server whose thus-measured latency is lowest.
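The lowest-latency selection described above may be sketched as follows. Because sending an ICMP echo request typically requires a raw socket and elevated privileges, the sketch takes the probe as a caller-supplied function; the names measure_latency, pick_lowest_latency, and probe, as well as the example addresses, are hypothetical and introduced only for illustration.

```python
import time
from typing import Callable, Dict, Optional


def measure_latency(probe: Callable[[str], bool], address: str) -> Optional[float]:
    """Time one probe of address; return seconds elapsed, or None on failure.

    probe is an assumed callable that sends a "ping"-type request and
    returns True if the corresponding response arrived.
    """
    start = time.monotonic()
    answered = probe(address)
    elapsed = time.monotonic() - start
    return elapsed if answered else None


def pick_lowest_latency(latencies: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the address whose measured latency is lowest.

    Servers that did not answer (latency None) are ignored; None is
    returned if no server answered at all.
    """
    reachable = {addr: t for addr, t in latencies.items() if t is not None}
    if not reachable:
        return None
    return min(reachable, key=lambda addr: reachable[addr])


# Hypothetical measurements, in seconds, for three commonly named servers.
measured = {"192.0.2.10": 0.030, "192.0.2.11": 0.012, "192.0.2.12": None}
chosen = pick_lowest_latency(measured)
```

With the hypothetical measurements shown, the second server has the lowest latency among those that answered and would be the one whose address is returned to the requesting client.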
But there are many situations in which the ping approach does not balance server load particularly well. An apparent reason is that the latency measured by the ping approach does not necessarily correlate well with the latency that the service's clients experience. The client's network distance from a server may differ from the name server's distance. Also, a web server that responds rapidly to a ping request, which is serviced low in its protocol stack, may appear more lethargic in response to, say, a client of its document-retrieval process.
For these or other reasons, superior load balancing more often results from simply allocating load on a weighted random basis. (The weighting is typically based on the sites' respective capabilities.) But that approach's results, too, leave room for improvement.
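A weighted random allocation of the kind just mentioned may be sketched as follows. The weights, the example addresses, and the function name pick_weighted are hypothetical; the text says only that the weighting is typically based on the sites' respective capabilities, so the particular values below are assumptions made for illustration.

```python
import random
from typing import Dict, Optional


def pick_weighted(weights: Dict[str, float],
                  rng: Optional[random.Random] = None) -> str:
    """Choose one server address at random, in proportion to its weight.

    weights maps each address to a non-negative capability weight; at
    least one weight must be positive. rng may be supplied to make the
    choice reproducible; otherwise the module-level generator is used.
    """
    addresses = list(weights)
    chooser = rng if rng is not None else random
    return chooser.choices(
        addresses, weights=[weights[a] for a in addresses], k=1
    )[0]


# Hypothetical capabilities: the first server is assumed able to carry
# three times the load of the second.
site_weights = {"192.0.2.10": 3.0, "192.0.2.11": 1.0}
selected = pick_weighted(site_weights, random.Random(0))
```

Over many requests, the first server in this example would be expected to receive roughly three quarters of the selections, although any individual choice is random.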