As the use of networks, such as the Internet or the Web, expands, on-line companies are finding it necessary to provide multiple large-scale data centers to deliver efficient and reliable service to their customers.
FIG. 1 illustrates an example of an architecture 10 involving an Internet Protocol (IP) network 30 to which a client device 20 is linked via a communication link 22. A content owner server 60 is linked to the IP network 30 via a communication link 62. Generally, a network address, such as a Uniform Resource Locator (URL), is resolved through Domain Name System address resolution to a network IP address for the content owner server 60. For example, www.3com.com will be resolved to a particular network IP address for a content server. Also accessible via the IP network 30 are a data center A 40 and a data center B 50, which are coupled to the IP network 30 via communication links 42 and 52, respectively.
Typically, a client request is addressed to a network address corresponding to an access service provider or a content owner's central site, e.g. www.3com.com. The client's request is then forwarded to a data center for further processing. In FIG. 1, a request 24 from the client device 20 is addressed to the content owner's URL, which is then resolved to the network IP address of the content owner server 60. The request from the client device 20 is then forwarded to the data center A 40 selected by the content owner server 60 based on a global server load sharing approach.
A global server load sharing approach must first address the global issue of which data center the client's request should be forwarded to. The data centers 40 and 50 may be geographically distant from one another, from the content owner server 60, and from the clients being serviced. The centers may be located hundreds or thousands of miles apart and may even be located on different continents. The data centers may be topologically distant as well, meaning that they may be many hops away from one another over the same Internet Protocol (IP), RFC 791, backbone network, or they may be served by entirely different backbone networks coupled together via intermediate gateways or interconnecting networks. In other words, the IP network 30 may actually be composed of multiple networks that are interconnected.
The load sharing approach must then address the local load sharing issue of which server device to allocate to the request. Each of the data centers 40 and 50 typically includes a server farm, i.e. a network site composed of a large number of server devices that process client transactions.
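The two decisions described above, a global choice of data center followed by a local choice of server device, can be illustrated with a minimal sketch. The class names, the hop-count proximity metric, and the `load_weight` parameter are hypothetical and chosen only for illustration; they are not taken from any particular prior art system. Each data center is assumed to contain at least one server.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    load: float  # fraction of capacity in use, 0.0 to 1.0 (assumed metric)


@dataclass
class DataCenter:
    name: str
    hops_to_client: int  # hypothetical proximity metric
    servers: list = field(default_factory=list)

    def average_load(self):
        # Assumes the server farm is non-empty.
        return sum(s.load for s in self.servers) / len(self.servers)


def select_data_center(centers, load_weight=10.0):
    """Global decision: lowest combined proximity and load score wins."""
    return min(
        centers,
        key=lambda dc: dc.hops_to_client + load_weight * dc.average_load(),
    )


def select_server(center):
    """Local decision: least-loaded server device in the chosen farm."""
    return min(center.servers, key=lambda s: s.load)


def route_request(centers):
    """Apply the global decision first, then the local one."""
    dc = select_data_center(centers)
    return dc.name, select_server(dc).name
```

With `load_weight` set to zero, the sketch reduces to pure proximity-based selection; larger values let a lightly loaded but more distant data center win.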
There are many solutions for global server load balancing. A Domain Name System (DNS) based solution involves a DNS database that includes proximity and load information for potential servers. See RFCs 1034 and 1035. The client device's local DNS server holds this information and either proxies or forwards the client's DNS request to the appropriate data center's DNS server based on the proximity or load information. The data center's DNS server then replies with an individual server's IP address that is selected based on the proximity and load information. Prior art systems of this kind must propagate DNS updates across wide area networks, which requires cached entries to time out before new information takes effect. However, the complex matter of how DNS caching and timeout of the load and proximity information should be handled has not yet been resolved by the Internet Engineering Task Force (www.ietf.org).
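The caching difficulty noted above can be illustrated with a minimal sketch, under the assumption that a resolver caches each answer for a fixed time-to-live (TTL). The `CachedResolver` class, the `pick_least_loaded` helper, and the explicit `now` argument are hypothetical and serve only to show why a cached answer can lag behind current load information.

```python
class CachedResolver:
    """Toy resolver that caches answers until a TTL expires (assumed model)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._cache = {}  # hostname -> (ip_address, expiry_time)

    def resolve(self, hostname, now, authoritative_lookup):
        entry = self._cache.get(hostname)
        if entry is not None and now < entry[1]:
            # Cached answer is still valid, even if server loads have changed.
            return entry[0]
        ip_address = authoritative_lookup(hostname)
        self._cache[hostname] = (ip_address, now + self.ttl)
        return ip_address


def pick_least_loaded(loads):
    """Return the IP address with the lowest reported load."""
    return min(loads, key=loads.get)
```

If a server's load rises sharply just after its address has been cached, clients of that resolver continue to be directed to it until the entry times out. A shorter TTL reduces this staleness at the cost of more DNS traffic across the wide area network.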
Another approach is Host Route Injection (“HRI”), wherein load balancing routers or other entities inject weighted routes into Border Gateway Protocol (BGP), RFC 1771, or Interior Gateway Protocol (IGP), RFC 1371, routing databases, e.g. Open Shortest Path First (OSPF), RFC 2328. This approach also works reasonably well, but suffers from some drawbacks. For example, the routing table size grows with the number of clients, since a route is created for each client and must be updated continually. If the routes are not updated frequently, the routing table holds less accurate load and proximity information. Also, fine-grained load balancing may be difficult, because routing protocols require time, on the order of minutes, to converge. Further, frequent route updates may cause route flapping.
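The table growth drawback can be illustrated with a minimal sketch that follows the description above, in which one host route is maintained per client. The `inject_host_routes` function and its cost dictionaries are hypothetical; an actual system would exchange routes through BGP or an IGP rather than update a Python dictionary.

```python
def inject_host_routes(routing_table, client_costs):
    """Install or refresh one /32 route per client.

    client_costs maps each client IP address to a dictionary of
    {data_center_name: cost}, where lower cost is better (assumed metric).
    """
    for client_ip, costs in client_costs.items():
        best_center = min(costs, key=costs.get)
        routing_table[client_ip + "/32"] = best_center
    return routing_table
```

The table holds one entry per client, so its size grows linearly with the client population, and every change in cost requires the affected entries to be injected again.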
Yet another global load balancing approach involves triangle data forwarding. In this approach, all client requests are forwarded to a single data center, and that data center decides whether to serve the requests or forward them to another data center that has closer proximity or a lower load level. This approach can place an unnecessary burden on the forwarding data center. Also, triangle routing adds latency and inefficiency to the communication stream. Furthermore, if the primary data center fails, then another mechanism, such as DNS, is needed to prevent system failure and provide high availability.
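The forwarding decision made at the single receiving data center can be illustrated with a minimal sketch. The `handle_at_primary` function, the load threshold, and the dictionary layout are hypothetical and are used only to show where the extra forwarding leg arises.

```python
def handle_at_primary(primary, others, threshold=0.8):
    """Decide whether the primary data center serves or forwards a request.

    primary and each element of others are dictionaries with "name" and
    "load" keys (assumed layout). Returns (serving_center, request_legs),
    where request_legs counts the hops the request takes: 1 for
    client-to-primary, 2 when the primary forwards to another center.
    """
    if primary["load"] < threshold or not others:
        return primary["name"], 1
    alternative = min(others, key=lambda dc: dc["load"])
    if alternative["load"] < primary["load"]:
        # The request traverses client -> primary -> alternative.
        return alternative["name"], 2
    return primary["name"], 1
```

Every request passes through the primary data center before any decision is made, so the primary carries the decision load for all clients, and forwarded requests incur an additional leg.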
Thus, the need still remains for an improved method for global server load balancing.