Many Internet users can testify to the utter frustration when a "SERVER NOT RESPONDING" error message is displayed on their browser while trying to connect to a web site. Users often blame the company that administers the unavailable web site, despite large investments in replicated servers. Internet technologies are increasingly used for mission-critical Intranet applications that require high levels of reliability. An intelligently-designed web-site architecture with better fault-tolerance is needed.
The parent application described a server farm that uses a single, virtual IP address for all server machines. The client first connects to a load-balancer at the server farm that receives all incoming packets to the virtual IP address. The load-balancer makes the initial connection with the client, saving the packets used to make the connection. Once the connection is made and acknowledged, the client sends the data request in a packet that contains a Uniform Resource Locator, or URL.
The load-balancer extracts the URL from the packet. The URL specifies a file or service at the web-server farm. The load-balancer then selects only those servers that have the resource requested in the URL. Load-balancing is then performed among these servers.
Since load-balancing waits until the URL is received, load-balancing depends on the resource requested, not merely on the load of each server. The server farm can be heterogeneous--the entire content of the web site does not have to be mirrored to each server's hard disk. Some files and resources may be located on a single node or a few nodes. Other web-site resources may include dedicated servers with specific resources such as databases, built-in application-programming interfaces (APIs) to interface with user-defined programs, or software licenses to run particular programs on particular servers. Other servers may support the SMTP or POP3 protocols for e-mail, the NNTP protocol for newsgroups, etc. These specialized resources can reside on just a few of the servers.
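The resource-based selection described above can be sketched as follows. This is only an illustrative sketch: the `Server` class, the per-server resource sets, the `assign_server` function, and the use of active-connection count as the load measure are assumptions made for the example, not details taken from the parent application.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Server:
    name: str
    resources: set[str]            # URL paths this server can serve (assumed model)
    active_connections: int = 0    # assumed load measure


def assign_server(url: str, servers: list[Server]) -> Server:
    """Keep only servers holding the requested resource, then pick the least loaded."""
    path = urlsplit(url).path
    candidates = [s for s in servers if path in s.resources]
    if not candidates:
        raise LookupError(f"no server holds {path}")
    chosen = min(candidates, key=lambda s: s.active_connections)
    chosen.active_connections += 1
    return chosen


# Hypothetical heterogeneous farm: content is not mirrored to every server.
farm = [
    Server("server51", {"/index.html", "/catalog.html"}, active_connections=4),
    Server("server52", {"/index.html", "/db/query"}, active_connections=9),
    Server("server55", {"/db/query"}, active_connections=2),
]
```

In this sketch a request for `/db/query` is balanced only between the two servers that hold the database resource, even though a third server carries a lighter or heavier load.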
TCP State Migration
The connection is transferred from the load-balancer to the assigned server once the URL is extracted. In this process, called TCP state migration, the stored packets from the client are replayed by the load-balancer to the assigned server. The acknowledgement packets from the assigned server are captured by the load-balancer and deleted so that the client is unaware that the connection has been transferred from the load-balancer to the assigned server. Future packets from the client are first routed to the load-balancer, and then passed through to the assigned server. The assigned server uses the virtual IP address instead of its own IP address as the source address for packets returned to the client.
TCP state migration is described in much more detail in the parent, U.S. Pat. No. 5,774,660, hereby incorporated by reference.
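The replay step of TCP state migration can be illustrated with a small in-memory simulation. This is a hedged sketch, not the method of the parent patent: the `Packet`, `FakeServer`, and `LoadBalancer` classes are hypothetical, no real sockets are used, and sequence-number handling is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Packet:
    src: str
    dst: str
    flags: str              # simplified: "SYN", "ACK", or "SYN+ACK"
    payload: bytes = b""


@dataclass
class FakeServer:
    """Stand-in for the assigned server: answers SYN with SYN+ACK and data with ACK."""
    address: str
    received: list = field(default_factory=list)

    def deliver(self, pkt: Packet) -> Optional[Packet]:
        self.received.append(pkt)
        if pkt.flags == "SYN":
            return Packet(self.address, pkt.src, "SYN+ACK")
        if pkt.payload:
            return Packet(self.address, pkt.src, "ACK")
        return None


@dataclass
class LoadBalancer:
    saved: list = field(default_factory=list)      # client packets stored during setup
    to_client: list = field(default_factory=list)  # packets forwarded on to the client

    def save(self, pkt: Packet) -> None:
        self.saved.append(pkt)

    def migrate(self, server: FakeServer) -> int:
        """Replay the saved client packets; capture and drop the server's replies."""
        dropped = 0
        for pkt in self.saved:
            reply = server.deliver(pkt)
            if reply is not None:
                dropped += 1    # deleted here, so nothing is added to to_client
        return dropped
```

After `migrate` runs, the server holds the same connection packets the client originally sent, while the client has seen none of the server's acknowledgements.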
Fault-Tolerant Web Site--FIG. 1
The parent application described a fault-tolerant web farm using a back-up load balancer. FIG. 1 is a diagram of a fault-tolerant web site with a back-up load balancer and dual Internet connections. Browser client 10 sends requests through Internet 66 with a virtual IP address for the whole web site. Incoming packets with the virtual IP address are routed to load balancer 70 over local LAN 144. Local LAN 144 may contain routers, switches, and hubs (not shown) when servers are located on separate network nodes. Most large server farms have very complex interconnections within the local LAN network. Local LAN 144 connects to Internet 66 through Internet connection 142 which directly connects to Internet connection router 140, and through Internet connection 148, which is connected to Internet connection router 146.
Connections 142, 148 provide connection to Internet 66. A primary load balancer 70 is used to direct and load balance connections across servers 51, 52, 55, 56. A backup load balancer 70' is also provided to take over operation should primary load balancer 70 fail. These load balancers are located on separate servers to lessen the chance that both fail at the same time. Backup load balancer 70' closely monitors primary load balancer 70 to detect a failure.
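One way a backup could monitor the primary is a heartbeat with a timeout. The text does not say how the monitoring is done, so the heartbeat mechanism, the `BackupMonitor` class, and the timeout value below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BackupMonitor:
    timeout: float                # seconds of silence tolerated (assumed parameter)
    last_heartbeat: float = 0.0   # time the primary was last heard from
    active: bool = False          # True once the backup has taken over

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the primary load balancer."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over if the primary has been silent for longer than the timeout."""
        if not self.active and now - self.last_heartbeat > self.timeout:
            self.active = True
        return self.active
```

In this sketch the takeover is one-way: once the backup becomes active it stays active even if heartbeats resume, which avoids both load balancers answering for the virtual IP address at once.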
It is somewhat undesirable to have a back-up load balancer. If both the primary and back-up load-balancers crash or otherwise become unavailable, client packets are no longer forwarded to their servers, causing the clients to hang. A technique to avoid this client hang caused by a load-balancer crash is desired.
Incoming Packets Through Load-Balancer
FIG. 1 shows that incoming packets from client 10 are all routed through load-balancer 70, since a virtual IP address is used as the destination address. Load-balancer 70 changes the address of these incoming packets to direct them to the assigned server, such as server 52. Server 52 then sends the data back to client 10, using the virtual IP address as the source.
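The address handling described above can be sketched as follows. The addresses are taken from the reserved documentation ranges, and the `Packet` class, the `forward_incoming` and `build_reply` functions, and the per-client assignment table are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, replace

VIRTUAL_IP = "192.0.2.1"   # assumed virtual IP address for the whole web site


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes = b""


def forward_incoming(pkt: Packet, assignments: dict[str, str]) -> Packet:
    """Load-balancer side: rewrite the destination to the client's assigned server."""
    if pkt.dst != VIRTUAL_IP:
        raise ValueError("packet is not addressed to the virtual IP")
    return replace(pkt, dst=assignments[pkt.src])


def build_reply(client_addr: str, payload: bytes) -> Packet:
    """Server side: reply directly to the client, using the virtual IP as the source."""
    return Packet(src=VIRTUAL_IP, dst=client_addr, payload=payload)
```

Only `forward_incoming` runs on the load balancer; `build_reply` runs on the assigned server, which is why outgoing data does not pass back through the load balancer.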
Although server 52 sends outgoing packets directly to client 10, the incoming packets still all go through load-balancer 70. This still creates somewhat of a network bottleneck, although certainly not as severe as the prior art since only the smaller incoming packets are routed through load-balancer 70. It is desirable to remove this bottleneck.
FIG. 2 shows a client attempting to connect to a failed server. Similar reference numbers are used as described for FIG. 1. Client 10 attempts to open a connection to load-balancer 70, which is migrated over to server 52. Packets from client 10 are not responded to by server 52 since server 52 has crashed. Client 10 receives no reply from server 52, so client 10 eventually displays an error message to the user such as "Server Not Responding". Alternatively, the client times out while waiting for a response, and displays the error message "Server Down or Unreachable", or "Connection Timed Out".
Another common error is that the client uses a stale reference--a URL to a web page that no longer exists or has been relocated. It is desirable to have the URL automatically translated to the relocated web page to avoid this error.
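Automatic translation of a stale URL could be as simple as a relocation table consulted before the request is issued. The table contents, the `translate` function, and the hop limit below are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical relocation table: stale path -> current path.
RELOCATED = {
    "/products/old-list.html": "/catalog/list.html",
    "/catalog/list.html": "/catalog/index.html",
}


def translate(path: str, table: dict[str, str] = RELOCATED, max_hops: int = 8) -> str:
    """Follow relocation entries until a path with no entry is reached."""
    seen = {path}
    for _ in range(max_hops):
        target = table.get(path)
        if target is None:
            return path               # path is current; nothing to translate
        if target in seen:
            raise ValueError(f"relocation loop at {target}")
        seen.add(target)
        path = target
    raise ValueError("too many relocation hops")
```

A path that was relocated more than once is followed through each entry, and a path with no entry is returned unchanged.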
It is desired to reduce the frequency of "SERVER NOT RESPONDING" and "SERVER TIMED OUT" messages that Internet users often receive. A more efficient and fault-tolerant web-site architecture that avoids the data bottleneck and single point of failure at the load-balancer at the web site is desired. It is desired for the client to perform load-balancing or server assignment and for the TCP connection to be migrated from the client to the assigned server. Client-side server-assignment is desirable.
WAN load balancing is also desirable. To minimize client latency, the client should be served by the best-responding server. Responsiveness is a function of both server load and network latency. Minimizing latency is desirable since minimal-latency paths tend to go around Internet bottlenecks, because server routes through bottlenecks are slower. This improves the overall performance of the Internet and/or company WAN links.
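Choosing the best-responding server can be sketched by combining a network measurement with a load estimate. The text states only that responsiveness depends on both factors; the additive formula, the `Candidate` fields, and the function names below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    rtt_ms: float       # round-trip time measured from the client (network latency)
    queue_depth: int    # requests already waiting at the server (server load)
    service_ms: float   # assumed average time to serve one request


def estimated_response_ms(c: Candidate) -> float:
    """Assumed model: network delay plus time to drain the queue and serve this request."""
    return c.rtt_ms + (c.queue_depth + 1) * c.service_ms


def best_server(candidates: list[Candidate]) -> Candidate:
    """Pick the candidate with the lowest estimated response time."""
    return min(candidates, key=estimated_response_ms)
```

Under this model a nearby but heavily loaded server can lose to a more distant, lightly loaded one, which is the trade-off between server load and network latency described above.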