The present invention is related to computer networks, and particularly to routing of data units in computer networks.
The flow of traffic in computer networks such as the World Wide Web is limited by the performance of host computers. Each host computer interacts with multiple client computers. For example, a host computer that hosts an Internet Web Site responds to requests from multiple client computers by transmitting the data that comprises the pages of the web site to the client computers. However, the number of client computers that can be contemporaneously serviced by the host computer is limited. Hence, access to the web site is limited by the performance of the host computer.
It is known to employ a distributed system to increase the load handling capability of a web site. The distributed system includes a “cluster” of host computers, each of which is capable of servicing requests from clients and may be assigned a distinct Internet Protocol (“IP”) address. To access a web site that is associated with a cluster of hosts on the World Wide Web, a string of text known as a “domain name” that identifies the web site is initially entered at the client computer. The domain name is employed to obtain an IP address that is associated with one of the host computers in the cluster. The IP address may be obtained from a mapping that is stored by the client computer or an intermediate network device which maintains mappings of domain names to IP addresses based on information that is obtained from a “root host computer.” The root host computer encourages distribution of the load within the cluster by advertising the different IP addresses of the hosts in the cluster to intermediate network devices in a round-robin manner. The intermediate network devices cache the mappings that are advertised by the root host to facilitate timely response. However, caching of IP addresses by clients and intermediate network devices can cause an imbalance of the load within the cluster. For example, if a gateway device caches the IP address of a particular host in the cluster, every client that obtains an IP address for the web site from that gateway device transmits requests to a single host within the cluster. This can create a serious imbalance in the case of a gateway device that handles all traffic for a particular country or region.
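The caching imbalance described above can be illustrated with a minimal sketch. The class names, the domain name, and the IP addresses (taken from a documentation address range) are assumptions made for illustration only and do not describe any particular name resolution implementation.

```python
from collections import Counter
from itertools import cycle

# Hypothetical addresses for a three-host cluster.
HOST_IPS = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]


class RootHost:
    """Advertises the cluster's IP addresses in round-robin order."""

    def __init__(self, ips):
        self._next_ip = cycle(ips)

    def resolve(self, domain):
        return next(self._next_ip)


class CachingGateway:
    """Intermediate device that caches the first mapping it obtains."""

    def __init__(self, root):
        self._root = root
        self._cache = {}

    def resolve(self, domain):
        if domain not in self._cache:
            self._cache[domain] = self._root.resolve(domain)
        return self._cache[domain]


def requests_per_host(resolver, domain, n_clients):
    """Count how many clients are directed to each host IP."""
    return Counter(resolver.resolve(domain) for _ in range(n_clients))


# Nine clients resolve the name directly, then nine resolve it via a gateway.
uncached = requests_per_host(RootHost(HOST_IPS), "www.example.com", 9)
cached = requests_per_host(CachingGateway(RootHost(HOST_IPS)), "www.example.com", 9)
print(dict(uncached))
print(dict(cached))
```

In this sketch, direct resolution spreads the nine clients evenly over the three hosts, whereas every client behind the caching gateway is directed to the single host whose address the gateway cached first.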
It is known to employ a connection router to balance the load in the cluster. The connection router is a device that is coupled between the clients and the cluster. The root host computer advertises the IP address of the connection router so that all clients employ the IP address of the connection router when transmitting requests to the cluster. The connection router monitors the activity of each host in the cluster and selectively distributes the requests to the hosts in a manner which tends to balance the load. However, connection routers inhibit efficiency and scaling because at least one connection router is required even if only a small number of hosts are needed, and no more than a predetermined number of hosts can be supported by a single connection router. Further, the entire cluster becomes crippled when the connection router fails.
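The connection router arrangement can be sketched as follows. The least-connections selection policy, the `max_hosts` limit, and all names are hypothetical choices made for illustration; the description above states only that the router tends to balance the load and supports no more than a predetermined number of hosts.

```python
class ConnectionRouter:
    """Central dispatcher: every client request passes through this one device."""

    def __init__(self, hosts, max_hosts=4):
        # max_hosts is a hypothetical stand-in for the predetermined limit.
        if len(hosts) > max_hosts:
            raise ValueError("too many hosts for a single connection router")
        self.active = {host: 0 for host in hosts}  # host -> open connections
        self.failed = False

    def dispatch(self, request):
        """Return the name of the host chosen to service the request."""
        if self.failed:
            raise ConnectionError("connection router is down; cluster unreachable")
        # Assumed policy: choose the host with the fewest active connections.
        host = min(self.active, key=self.active.get)
        self.active[host] += 1
        return host


router = ConnectionRouter(["host-a", "host-b", "host-c"])
for i in range(6):
    router.dispatch(f"request-{i}")
print(router.active)
```

Setting `router.failed = True` makes every subsequent `dispatch` call raise, which mirrors the single point of failure noted above, and constructing a router with more hosts than `max_hosts` is rejected, which mirrors the scaling limit.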
In accordance with the present invention, routing functions for a group of computers are distributed among the computers in the group. In the case of a cluster of host devices that host a Web Site, each host device is capable of both servicing requests from client devices and rerouting requests to other host devices in the cluster to promote load sharing. Layer 4 routing functions can be distributed to obviate the need for a connection router to achieve load balancing. Layer 3 routing functions can be distributed to reduce reliance on routers in general.
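A minimal sketch of distributed rerouting is given below, under stated assumptions. The capacity threshold, the least-loaded-peer selection, the single-hop reroute limit, and the direct visibility of peer load are all hypothetical; the description above does not specify a particular rerouting policy.

```python
class Host:
    """A cluster member that can both service and reroute requests."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # hypothetical limit on concurrent requests
        self.load = 0
        self.peers = []           # other hosts in the cluster

    def handle(self, request, rerouted=False):
        """Service the request locally or forward it once to a peer.

        Returns the name of the host that serviced the request.
        """
        if not rerouted and self.load >= self.capacity and self.peers:
            # Assumed policy: forward to the least-loaded peer, if it is less busy.
            target = min(self.peers, key=lambda h: h.load)
            if target.load < self.load:
                return target.handle(request, rerouted=True)
        self.load += 1
        return self.name


def make_cluster(names, capacity):
    """Build hosts that each know every other host as a peer."""
    hosts = [Host(n, capacity) for n in names]
    for h in hosts:
        h.peers = [p for p in hosts if p is not h]
    return hosts


cluster = make_cluster(["host-a", "host-b", "host-c"], capacity=2)
# Worst case: every client targets host-a, as when a gateway caches one address.
served_by = [cluster[0].handle(f"request-{i}") for i in range(6)]
print(served_by)
print({h.name: h.load for h in cluster})
```

In this sketch all six requests arrive at a single host, yet each host ends up servicing two of them, and no separate routing device is involved.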
Distributing routing functions among the host devices in the cluster facilitates scaling. Increasing the number of host devices in the cluster results in a proportional improvement in performance because each new host device adds both servicing and routing capability. Further, the size of the cluster is not constrained to a fixed upper limit.
Distributing routing functions among the host devices in the cluster makes the cluster less susceptible to catastrophic failure. In particular, the failure of a host device results in only a proportional degradation in the service capacity of the cluster. Further, operation of the cluster as a whole is not completely dependent upon any single device.
Distributing routing functions facilitates transparency. Client devices are not exposed to design internals, and cannot distinguish and target individual devices in the cluster. Efficiency is also facilitated because the capacity of the cluster to service and route requests is approximately equal to the total capacity of the constituent host devices, regardless of the number of hosts in the cluster.