1. Field of the Invention
This invention relates to a method and apparatus for determining approximate network distances using one or more network reference locations or points.
2. Background Art
Computer systems sometimes rely on a server computer system to provide information to requesting computers on a network. When there are a large number of requesting computers, it may be necessary to have more than one server computer system to handle the requests. In prior art systems, there is a problem in efficiently directing requests to the correct server in a multiple server system.
One area where this has been a problem is on the Internet. The problem can be better understood by reviewing the structure and operation of the Internet.
The Internet is a worldwide network of interconnected computers. An Internet client accesses a computer on the network via an Internet provider. An Internet provider is an organization that provides a client (e.g., an individual or another organization) with access to the Internet (via analog telephone line or Integrated Services Digital Network line, for example). A client can, for example, read information from, download a file from or send an electronic mail message to another computer/client using the Internet.
To retrieve a file or service on the Internet, a client must search for the file or service, make a connection to the computer on which the file or service is stored, and download the file or service. Each of these steps may involve a separate application and access multiple, dissimilar computer systems. The World Wide Web (WWW) was developed to provide a simpler, more uniform means for accessing information on the Internet.
The components of the WWW include browser software, network links, servers, and WWW protocols. The browser software, or browser, is a user-friendly interface (i.e., front-end) that simplifies access to the Internet. A browser allows a client to communicate a request without having to learn a complicated command syntax, for example. A browser typically provides a graphical user interface (GUI) for displaying information and receiving input. Examples of browsers currently available include Mosaic, Netscape Navigator and Communicator, Microsoft Internet Explorer, and Cello.
Information servers maintain the information on the WWW and are capable of processing a client request. The Hypertext Transfer Protocol (HTTP) is the standard protocol for communication with an information server on the WWW. HTTP has communication methods that allow clients to request data from a server and send information to the server.
To submit a request, the client contacts the HTTP server and transmits the request to the HTTP server. The request contains the communication method requested for the transaction (e.g., GET an object from the server or POST data to an object on the server). The HTTP server responds to the client by sending a status of the request and the requested information. The connection is then terminated between the client and the HTTP server.
A client request thus consists of establishing a connection between the client and the HTTP server, performing the request, and terminating the connection. The HTTP server does not retain any information about the request after the connection has been terminated. HTTP is, therefore, a stateless protocol. That is, a client can make several requests of an HTTP server, but each individual request is treated independently of any other request. The server has no recollection of any previous request.
An addressing scheme is employed to identify Internet resources (e.g., HTTP server, file or program). Identifiers used in this addressing scheme are called Uniform Resource Locators (URL). A URL may contain one or more of the protocol to use when accessing the server (e.g., HTTP), the Internet domain name of the site on which the server is running, the port number of the server, and the location for the resource in the file system of the server.
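As an illustration of the URL components just listed, the following minimal sketch splits a URL into its protocol, domain name, port number and resource location using the Python standard library. The URL itself is a hypothetical example and is not taken from this description.

```python
from urllib.parse import urlsplit

# Hypothetical URL chosen only for illustration.
url = "http://www.example.com:8080/docs/index.html"

parts = urlsplit(url)
protocol = parts.scheme    # protocol to use when accessing the server
domain = parts.hostname    # Internet domain name of the site
port = parts.port          # port number of the server (None if omitted)
location = parts.path      # location of the resource on the server

print(protocol, domain, port, location)
```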
The WWW uses a concept known as hypertext. Hypertext provides the ability to create links within a document to move directly to other information. To activate the link, it is only necessary to click on the hypertext link (e.g., a word, phrase or an image). The hypertext link can point to information stored on a different site than the one containing the link. When the link is activated, the client's browser uses the link to access the data at the site specified in the URL.
An HTTP server also has the ability to delegate work to gateway programs. The Common Gateway Interface (CGI) specification defines a mechanism by which HTTP servers communicate with gateway programs. A gateway program is referenced using a URL. The HTTP server activates the program specified in the URL and uses CGI mechanisms to pass program data sent by the client to the gateway program. Data is passed from the server to the gateway program via command-line arguments, standard input, or environment variables. The gateway program processes the data and returns its response to the server using CGI (via standard output, for example). The server forwards the data to the client using HTTP.
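The data flow of a gateway program might be sketched as follows. This is only an illustrative sketch: the function name `run_gateway`, the form field `name` and the simulated request are assumptions, while `REQUEST_METHOD`, `QUERY_STRING` and `CONTENT_LENGTH` are environment variables defined by the CGI specification. The program reads client data from an environment variable or from its input stream and writes a header and body to its output stream.

```python
import io
from urllib.parse import parse_qs


def run_gateway(environ, stdin, stdout):
    """Hypothetical gateway program: read client data, write a response."""
    if environ.get("REQUEST_METHOD") == "POST":
        # POST data arrives on the input stream; its size is given by
        # CONTENT_LENGTH (character count equals byte count for ASCII data).
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET data arrives in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)
    name = fields.get("name", ["world"])[0]
    # The response is a header block, a blank line, then the body.
    stdout.write("Content-Type: text/plain\r\n\r\n")
    stdout.write(f"Hello, {name}\n")


# Simulated invocation; a real HTTP server would supply the process
# environment and the standard streams instead.
out = io.StringIO()
run_gateway({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=client"},
            io.StringIO(""), out)
print(out.getvalue())
```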
A computer user navigates the Internet or web from a browser on a computer system. To access a web site, the user enters a URL containing the host name of the web site into the browser. This can be accomplished by clicking on a link, by activating a tool bar button, or by manually entering a name or address into a location field. The URL that is entered is not the actual Internet Protocol (IP) address of the intended web server. The actual IP address is a string of numbers that uniquely locates the web server that provides the web site data. A worldwide distributed database system, called the "Domain Name System" (DNS), provides the mapping between server names and the associated IP addresses.
Client application software, such as a web browser, typically uses a local library, called the "DNS resolver", to obtain the translation from server name to IP address. The resolver in turn contacts a predetermined local DNS server to obtain the translation. DNS servers can maintain caches of previously resolved names. More specifically, name resolution processes typically require two hosts on the client side. Consider a user working on "asha.eng.sun.com" who wants to get the address of "www.uspto.gov". The client browser will communicate with a local resolver (a library attached to the browser process itself, in the current example running on asha.eng.sun.com). The local resolver will go to one of a relatively small number of local name servers, e.g. "ns.xyz.com". Here ns.xyz.com is called the client side name server. The client side name server will communicate with the outside world to determine the IP address of www.uspto.gov, and forward this information to the resolver that is part of the browser process.
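The resolution path described above (resolver, client side name server with a cache, outside world) can be simulated in a few lines. The class and function names below are hypothetical, and the address 192.0.2.10 is a placeholder from the range reserved for documentation, not the real address of any host. No network traffic is generated.

```python
class ClientSideNameServer:
    """Hypothetical stand-in for a client side name server such as ns.xyz.com."""

    def __init__(self, upstream):
        self._upstream = upstream    # callable standing in for the outside world
        self._cache = {}             # previously resolved names
        self.upstream_queries = 0

    def resolve(self, name):
        if name not in self._cache:
            # Cache miss: ask the outside world and remember the answer.
            self.upstream_queries += 1
            self._cache[name] = self._upstream(name)
        return self._cache[name]


def make_resolver(name_server):
    """Hypothetical resolver library attached to the browser process."""
    def resolve(name):
        return name_server.resolve(name)
    return resolve


# Placeholder mapping; the address is illustrative only.
OUTSIDE_WORLD = {"www.uspto.gov": "192.0.2.10"}

ns = ClientSideNameServer(OUTSIDE_WORLD.__getitem__)
resolver = make_resolver(ns)

first = resolver("www.uspto.gov")     # forwarded to the outside world
second = resolver("www.uspto.gov")    # answered from the cache
print(first, second, ns.upstream_queries)
```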
DNS comprises a global network of servers that translate host names into IP addresses and also provide the reverse mapping from IP addresses to host names.
Once the IP address is known, the browser communicates with the web server at that address to retrieve the requested web page or other information.
An Internet server is typically limited in the number of clients it can efficiently service at any one time. However, the owner of an Internet site does not want users to be denied access to the site. Site owners, especially owners of popular sites, want every attempted access to succeed.
To provide such service, some companies have implemented systems that allow multiple Internet servers to service requests for a single Internet site. If there are two servers, it would be expected that each server would service approximately half of the client requests to the supported Internet site. This concept is known as "distributed servers", and specifically in this case "distributed Internet servers."
There are a number of schemes for implementing the distributed Internet server. The schemes differ in the manner in which requests to the Internet site address are routed to the multiple servers. One such scheme is called DNS shuffle address or "round-robin." In this scheme, as each request comes to the Internet site address, the servers that respond are rotated in some order. If there are three servers in the distributed system, then any one of the servers handles approximately every third request. This scheme has the disadvantage of ignoring load balancing considerations and traffic localization considerations.
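A minimal sketch of the round-robin rotation, assuming three servers identified by placeholder addresses from the documentation range, shows why each server handles every third request regardless of its load or its distance from the client. The names `SERVERS` and `next_server` are hypothetical.

```python
from itertools import cycle

# Placeholder server addresses, illustrative only.
SERVERS = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]

rotation = cycle(SERVERS)


def next_server():
    """Return the server that answers the next request, in fixed rotation."""
    return next(rotation)


# Six consecutive requests: each server is chosen exactly twice,
# with no regard to server load or client location.
assignments = [next_server() for _ in range(6)]
print(assignments)
```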
Another scheme uses a freely available script called "lbnamed" that provides a DNS server with the ability to return a different IP address for every client request received for an Internet site host address. The returned IP address can be made to depend on server load as well as availability of local Internet servers, but the scheme ignores the relative distance between clients and the available servers.
Another known approach is called "smart clients." The smart client approach is an architecture for web traffic client-server communication that allows for a dynamic server choice based on load and availability. The approach allows for the use of multiple server machines to achieve scalable performance for load balancing, for fault transparency, and backwards compatibility with the existing addressing scheme (URLs). The architecture requires client web browsers to execute downloadable, server specific code. This code is divided into a GUI thread and a director thread. Server choice, load balancing, and fault transparency are encapsulated by the director thread. A disadvantage of this scheme is that it requires cooperation of requesting clients. It imposes extensive overhead on single web page retrieval and traversal to new sites.
Another scheme which has been proposed is called SONAR. SONAR is a service arranged to provide information to an application which the application can use to make a good choice about which of a variety of resources to access or utilize. The SONAR service does not specify use of any particular means of estimating network proximity. Instead, the SONAR service assumes that the best means of estimating network proximity will change over time and from network to network. In addition, information provided by one SONAR server should not be compared to the information provided by another server, since different SONAR servers may employ different proximity information.
The invention is a method and apparatus for determining the approximate distance between a first point and a second point of a network using one or more reference points.
In accordance with one embodiment of the invention, the method comprises the steps of selecting at least one reference point from nodes of the network, obtaining first distance metric information associated with at least one path associating the first point and the at least one reference point, obtaining second distance metric information associated with at least one path associating the second point and the at least one reference point, and determining a total approximate distance for at least one path associating the first point and the second point based on the first and second distance metric information.
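The steps of this embodiment might be sketched as follows. The text states only that the total approximate distance is based on the first and second distance metric information; adding the two metrics for each shared reference point and keeping the smallest sum is an assumption made here for illustration. The function name, the reference point labels and the metric values are likewise hypothetical.

```python
def approximate_distance(first_to_ref, second_to_ref):
    """Estimate the distance between two points via shared reference points.

    Each argument maps a reference point to a distance metric.
    Assumption: the metrics are combined by addition, and the smallest
    sum over the shared reference points is taken as the estimate.
    """
    shared = first_to_ref.keys() & second_to_ref.keys()
    if not shared:
        raise ValueError("no common reference point")
    return min(first_to_ref[r] + second_to_ref[r] for r in shared)


# Hypothetical metric values (e.g., hop counts).
first_to_ref = {"ref_a": 4, "ref_b": 9}     # first distance metric information
second_to_ref = {"ref_a": 7, "ref_b": 1}    # second distance metric information

estimate = approximate_distance(first_to_ref, second_to_ref)
print(estimate)  # 10, via ref_b (9 + 1), which beats ref_a (4 + 7)
```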
In accordance with one embodiment of the method the first point comprises a client, each second point comprises a server, and the one or more reference points comprise other nodes of the network.
In one or more embodiments, the second distance metric information is published to one or more servers providing domain name mapping information and the second distance metric information is provided to the client (or first point) along with requested domain name mapping information.
In accordance with one embodiment of the invention, sets of total approximate distance information are generated which are associated with network distances between a client and at least two servers, and an optimal server is identified by ranking the sets of generated total distance information.
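Ranking the candidate servers could look like the sketch below. It assumes, for illustration only, that each total is the smallest sum of client-to-reference and server-to-reference metrics and that a smaller total means a better server. All names and values are hypothetical.

```python
def approximate_distance(client_to_ref, server_to_ref):
    """Assumed estimate: smallest sum over the shared reference points."""
    shared = client_to_ref.keys() & server_to_ref.keys()
    return min(client_to_ref[r] + server_to_ref[r] for r in shared)


def rank_servers(client_to_ref, published):
    """Return (server, total distance) pairs sorted from nearest to farthest."""
    totals = {
        server: approximate_distance(client_to_ref, server_to_ref)
        for server, server_to_ref in published.items()
    }
    return sorted(totals.items(), key=lambda item: item[1])


# Hypothetical metrics: the client's own measurements and the
# per-server metrics it received.
client_to_ref = {"ref_a": 4, "ref_b": 9}
published = {
    "server_1": {"ref_a": 7, "ref_b": 1},
    "server_2": {"ref_a": 2, "ref_b": 8},
    "server_3": {"ref_a": 9, "ref_b": 9},
}

ranking = rank_servers(client_to_ref, published)
optimal_server = ranking[0][0]
print(ranking, optimal_server)
```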
In one or more embodiments, computer hardware and/or software is arranged to perform the method of the invention.
Further objects, features and advantages of the invention will become apparent from the detailed description of the drawings which follows, when considered with the attached figures.