1. Field of the Invention
The present invention relates to transmission of data in a network environment. More specifically, the present invention relates to a technique for improving load balancing of traffic in a data network.
2. Description of Related Art
With the recent explosive growth of the Internet, some Internet sites have experienced a very high demand for their services. Many busy sites require multiple servers to adequately service their demands. It is not uncommon for 20 or 30 servers to be dedicated to a given site. Additionally, many of today's Internet content providers such as, for example, Yahoo.com, utilize a load-balanced server system in order to quickly provide desired content to a plurality of different users at substantially the same time. A block diagram of a conventional load-balanced server system is illustrated in FIG. 1 of the drawings. As shown in FIG. 1, site 110 on the World Wide Web may be implemented using a load-balanced server system in order to respond to data requests from client devices 102 via the Internet 104. The load-balanced server system includes a load balancing device 106, and a plurality of server devices 108. The load balancing device 106 may be configured to perform the functions of a virtual server. When the virtual server receives a data request from the client device 102, it forwards the request to an appropriate server in the server farm 108.
A common way of implementing a server cluster (108) is to use a virtual IP address (VIP) that uniquely identifies a set of servers for a particular site (110). Typically, a front-end load balancing device 106 is in charge of advertising the VIP to clients 102; receiving service requests (e.g., TCP connection set-up requests) from clients; dispatching service requests based on server load, requested port, etc.; and remembering the selected server for a given connection/service/session so that subsequent traffic of that connection can be serviced by the same selected server.
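By way of illustration only, the dispatching and remembering functions described above may be sketched as follows. The sketch assumes a least-loaded selection policy and a connection table keyed on the client address, client port, VIP, and destination port; the class name `Dispatcher`, its fields, and the server names are hypothetical and do not appear in the conventional system of FIG. 1.

```python
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    # Hypothetical per-server load counters (active connections per server).
    load: dict[str, int]
    # Connection table: flow key -> server already selected for that flow.
    table: dict[tuple, str] = field(default_factory=dict)

    def dispatch(self, client_ip: str, client_port: int,
                 vip: str, vip_port: int) -> str:
        """Return the server for a flow, selecting one on first sight."""
        key = (client_ip, client_port, vip, vip_port)
        server = self.table.get(key)
        if server is None:
            # New connection: pick the server with the fewest connections
            # (an assumed policy; real devices may also weigh the port, etc.).
            server = min(self.load, key=self.load.get)
            self.load[server] += 1
            # Remember the choice so later packets reach the same server.
            self.table[key] = server
        return server
```

Note that the only client-specific inputs available to `dispatch` are the client IP address and port, which reflects the information actually present when the first packet of a connection arrives.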
It will be appreciated, however, that conventional load balancing techniques do not offer an efficient mechanism for a load balancer to select an appropriate server based on the client identity, or to prioritize requests based on the client identity. One reason for this is that such decisions must be performed very early in the service flow (e.g., at the time when the load balancer receives an initial TCP SYN packet of a TCP connection setup), where very little, if any, information is available about the client. In fact, in most cases, the only information which is available to the load balancer at the initial stage of a service request flow is the client IP address.
Unfortunately, in many cases, the client IP address has been allocated dynamically by some remote DHCP server (unknown to the cluster receiving the request). In other cases, the client IP address is replaced with a proxy IP address corresponding to a proxy server which may be sitting between the client and the load balancer. Moreover, even if a client were to obtain a static, global IP address, and were able to communicate with the site 110 without using a proxy server, it would be impractical for the load balancer to associate all client IP addresses with respective user profiles so that, for example, Quality of Service (QoS) decisions and/or server selection may be performed by accessing a global directory to obtain the user profiles associated with the desired global IP addresses.
Accordingly, a common technique used by conventional load balancers to select an appropriate server based on the client identity is to terminate the service request at the load balancer (instead of simply routing the service request), and then collect additional information from the client to be used in performing its load balancing operations such as, for example, server selection. An example of such a technique is described below.
Typically, when a user at client machine 102 desires to access information from a particular website (e.g., site 110), the user will enter the domain name (e.g., www.cisco.com) at the browser of the client machine (“client”) 102. The client 102 will then send a DNS query to a DNS server (e.g., DNS server 112) in order to obtain an IP address associated with the specified domain name. In the example of FIG. 1, assuming that client 102 desires to communicate with site 110, the DNS server 112 will provide client 102 with a virtual IP address associated with site 110. Using the virtual IP address obtained from DNS server 112, client 102 then attempts to establish a TCP connection with site 110, for example, by sending a TCP SYN packet to a destination address corresponding to the virtual IP address.
The TCP SYN packet from client 102 will be received at load balancer 106. Eventually, the load balancer 106 will select an appropriate server from server farm 108 for responding to requests from client 102. However, in order to select an appropriate server, the load balancer 106 will typically require additional information about client 102. For example, the load balancer may base its server selection upon the client identity. However, as described previously, little, if any, information is typically available about the client in the TCP SYN packet. Accordingly, in order to learn more information about client 102, the load balancer 106 accepts the TCP SYN packet and does not forward the TCP SYN packet to any of the servers in the server farm 108. Rather, the load balancer responds with a TCP SYN-ACK packet, which results in a TCP session being established between client 102 and load balancer 106.
The client 102, believing that it is now communicating with a server at site 110, may then transmit an HTTP request to the load balancer 106. When the HTTP request is received at the load balancer, the load balancer may then identify additional information about the client in order to select an appropriate server for servicing the HTTP request. Once an appropriate server has been selected, the load balancer may then forward the HTTP request to the selected server. In this scenario, the load balancer will function as a middle man for relaying communications between client 102 and the selected server. Typically, such a situation is undesirable since, for example, it increases the traffic and processing load on the load balancer, and further increases the response time experienced by the user (since, for example, all of the client's HTTP requests must first be processed by the load balancer before being processed by the server).
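By way of illustration only, the server selection performed after the load balancer has terminated the TCP connection and received the HTTP request may be sketched as follows. The sketch assumes that the additional client information is carried in a cookie named `user`, and that a table of user profiles maps each known user to a server; the cookie name, the `profiles` table, and both function names are hypothetical.

```python
def parse_headers(raw: bytes) -> dict[str, str]:
    """Parse the header block of a raw HTTP/1.x request into a dict."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    headers = {}
    # Skip the request line (e.g. "GET / HTTP/1.1"); the rest are headers.
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def select_server(raw_request: bytes, profiles: dict[str, str],
                  default: str) -> str:
    """Choose a server from the client identity found in the request."""
    cookie = parse_headers(raw_request).get("cookie", "")
    for part in cookie.split(";"):
        name, _, value = part.strip().partition("=")
        # Assumed convention: the "user" cookie identifies the client.
        if name == "user" and value in profiles:
            return profiles[value]
    # No usable identity in the request: fall back to a default server.
    return default
```

The selection cannot run until the complete request headers have arrived, which is why the load balancer in this scenario must first complete the TCP handshake itself and thereafter relay traffic between the client and the selected server.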
Alternatively, once the load balancer has selected an appropriate server for responding to client 102's HTTP request, the load balancer may break the TCP connection with client 102, and send a redirect request so that the client communicates directly with the selected server using an IP address uniquely associated with the selected server. It will be appreciated, however, that this alternate scenario also burdens the resources at the load balancer since, for example, the load balancer is required to establish a TCP connection with the client, and then terminate the TCP connection and send a redirect request to the client once an appropriate server has been selected. Additionally, the wait time on the client end is increased since, for example, the client 102 must go through the process of establishing a TCP connection with the load balancer 106, sending an HTTP request to the load balancer, breaking the TCP connection with the load balancer, and establishing a new TCP connection with the redirected server.
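By way of illustration only, the redirect request of this alternate scenario may be sketched as follows. The sketch assumes that the redirect takes the form of an HTTP 302 response whose Location header carries the IP address of the selected server; the function name `build_redirect` and the example address are hypothetical.

```python
def build_redirect(server_ip: str, path: str = "/") -> bytes:
    """Build an HTTP 302 response pointing the client at the chosen server."""
    location = f"http://{server_ip}{path}"
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        # Ask the client to drop the connection to the load balancer.
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
```

For example, `build_redirect("192.0.2.21", "/index.html")` yields a response that directs the client to open a new TCP connection to 192.0.2.21, which is the second connection set-up contributing to the increased wait time noted above.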
In light of the above, it will be appreciated that there exists a continual desire to improve network load balancing techniques in order to overcome at least a portion of the problems associated with conventional load balancing techniques.