The expansive growth of the Internet has significantly changed the way people communicate and exchange information. Conventional communication tools such as handwritten letters, telephones, and fax machines have gradually been replaced as the primary means of information exchange by widely available and popular Internet-based tools such as e-mail and the World Wide Web. Today, the Internet is a global system of computer networks that connects millions of users worldwide through a common addressing system and communications protocol suite called TCP/IP. People and businesses around the world can use the Internet to retrieve information, correspond with other Internet users, conduct business globally, and access a vast array of online services and resources. Recent reports show that the Internet has more than 200 million users worldwide, and that number is growing rapidly.
Consequently, this incessant growth creates an ever greater need for ways to maximize the user experience. Internet Service Providers (ISPs), search engines, and high-volume websites all have to deal with a growing number of users and rapidly increasing numbers of requests. System administrators grappling with these demands typically respond by purchasing a larger server, but even the most powerful and expensive server can eventually fail or become overloaded. Another option is to create a network server cluster, which consists of a group of servers configured to a common IP address, to handle heavy user traffic. To handle traffic of this nature effectively, it is necessary to employ a methodology known as load balancing, which distributes the traffic evenly across the group, or cluster, of commonly addressed machines that the user is trying to access. In this way, when one machine is handling multiple user requests, new requests are forwarded to another server with more capacity. There are various types of load balancing systems, including hardware-based solutions from vendors such as Coyote Point Systems and Foundry Networks. There are also software-based solutions, such as IBM's eNetwork Dispatcher and Microsoft's Network Load Balancing (NLB), that reside directly on a machine within a network cluster.
To be effective, load balancing must occur within a cluster transparently to the client, and without jeopardizing the client's connection. Conventional load balancing systems utilize various methods, procedures or configuration rules to distribute client traffic effectively throughout the cluster. One such method is known as the Affinity Mode of operation, in which client requests are distributed according to an affinity mode selected by the network administrator of the cluster. In “no affinity” mode, a connection request is distributed amongst the cluster nodes according to the client's source IP address and source port information. In “single affinity” mode, requests are distributed according to only the source IP address. This affinity is based on information contained within an IP packet that is sent by the client in accordance with the Transmission Control Protocol (TCP) or User Datagram Protocol (UDP). Ownership of a particular IP packet is based on the results of a hash algorithm performed over fields determined by the affinity mode being used. The hash value is used to compute which node should handle the request. These current load-balancing schemes enable IP packets to be intelligently distributed to specific nodes within the cluster.
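The affinity-based ownership computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of any particular product: the function name `owner_node`, the use of SHA-256 as the hash, and the modulo mapping from hash value to node index are all hypothetical choices made for the example.

```python
import hashlib


def owner_node(src_ip: str, src_port: int, num_nodes: int, affinity: str) -> int:
    """Return the index of the cluster node that owns a packet.

    Hypothetical helper: the hash (SHA-256) and the modulo mapping are
    assumptions for illustration, not a vendor's actual algorithm.
    """
    if affinity == "none":
        # "No affinity": hash over the source IP address and source port.
        key = f"{src_ip}:{src_port}"
    elif affinity == "single":
        # "Single affinity": hash over the source IP address only.
        key = src_ip
    else:
        raise ValueError(f"unknown affinity mode: {affinity!r}")
    digest = hashlib.sha256(key.encode("ascii")).digest()
    # Reduce the hash value to a node index in the range [0, num_nodes).
    return int.from_bytes(digest[:8], "big") % num_nodes
```

In this sketch, two packets from the same source IP address always map to the same node in "single" mode regardless of port, while in "none" mode the source port also influences which node is chosen.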
However, this intelligence is not without its limitations. Under the present scheme, some network load balancing systems are not able to determine whether one or more connections that are started by the same application, such as a Web Browser, are related. In other words, there is no common identifier among multiple connections started by the same client application. This could result in related connections being load balanced to different cluster nodes. As an example, consider a scenario where an online shopper establishes a connection from a Web Browser application to fill an online shopping cart. Assume further that the shopping cart is managed by a server that is a member of a load balancing cluster. The packet transmitted by the client to establish the connection would specify an IP address that was assigned by the Internet Service Provider (ISP) from its pool of addresses. If for some reason the shopper were to leave the Web Browser open for a considerable amount of time, the connection that the shopper has to the online shopping cart could be terminated. If the shopper were to return after the termination period and attempt to add items to the original shopping cart, a new connection would be established. This new connection may not be directed to the cluster node that held the original shopping cart items, because the client's ISP might assign it a different IP address. This would result in the establishment of a new shopping cart, or session, on a different node. Thus, the previous shopping cart state may be lost because the cluster cannot identify it as being related to the user's most recent connection.
This same problem could occur in situations where multiple clients access a network, such as the Internet, through a proxy service. This type of service or device intercepts packets transmitted and received by clients that are members of a common network, such as a corporate intranet, and then directs the packets to the appropriate destination or source IP address on behalf of the client. Similar to the situation described above, when a client behind a proxy transmits a packet to a destination IP address, the packet is assigned the proxy IP address as its source address. If a cluster receives this packet, the cluster can identify only the proxy IP address and not the address of the client that transmitted the packet. This causes a particular problem when multiple client connections related to a single session, such as when accessing a shopping cart, end up being managed by different proxies. Even though the connections are related, the different proxies would assign them different IP addresses. A destination cluster that receives these connections could then load balance the connections to different nodes based on the different addresses. Currently, most load balancing systems have no easy way of grouping or identifying connections that are all related to the same client application or initiated during the same session.
Another load balancing problem occurs when IP packets are sent from multiple clients connecting from an ISP to a server cluster. As mentioned before, all clients of an ISP share a common pool of IP addresses. When requests are sent from multiple clients at various times, the request packets may be assigned the same IP address. This could be because the client requests are intercepted by a proxy or pass through a Network Address Translation (NAT) box. If the destination cluster performs load balancing based solely on the shared IP address, as in single affinity mode, all requests from the multiple clients could be distributed to one node within the destination cluster, even though the requests may belong to different users. This may result in improper load balancing in the cluster, as that one node would be overloaded with all of the client requests. As an example, consider a scenario where multiple clients are attempting to access www.foobar.com through an Internet Service Provider having a pool of addresses. When a client enters the URL www.foobar.com into a Web Browser application, a TCP packet that specifies the address of the ISP as the source IP address is transmitted to the foobar Web Server cluster that contains the Web page information for www.foobar.com. The foobar cluster, upon receiving the TCP packet, will load balance the packet to a particular node based on the source IP address. Because the foobar cluster sees the same IP address for different users serviced by the ISP, their requests are all directed to the same node in the foobar cluster in accordance with the single affinity mode of operation. The foobar cluster in this case would treat all of the requests coming from the ISP and assigned the same IP address as coming from a single client, when in fact the requests could be from multiple clients sharing that address. The end result is improper load balancing within the foobar cluster.
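The skew described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the helper names `single_affinity_node` and `node_load`, the SHA-256 hash, the four-node cluster, and the example address 203.0.113.7 (taken from a documentation address range) are all hypothetical and do not come from any real load balancer.

```python
import hashlib
from collections import Counter


def single_affinity_node(src_ip: str, num_nodes: int) -> int:
    """Pick a node from the source IP address only (assumed SHA-256 hash)."""
    digest = hashlib.sha256(src_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def node_load(source_ips, num_nodes: int) -> Counter:
    """Count how many requests each node receives under single affinity."""
    return Counter(single_affinity_node(ip, num_nodes) for ip in source_ips)


# 100 requests from different users who all appear behind one shared
# proxy or NAT address (hypothetical documentation-range address).
shared_address_requests = ["203.0.113.7"] * 100
load = node_load(shared_address_requests, num_nodes=4)
```

Because every request carries the same source IP address, `load` contains a single entry holding all 100 requests, while the other three nodes of the assumed four-node cluster receive none.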
A similar load-balancing problem occurs when a destination cluster attempts to communicate with a source cluster. According to some load-balancing schemes, packets sent by the destination cluster in response to requests received from the source cluster would be directed to the virtual IP (VIP) address of the source cluster, and not directly to the node that transmitted the request. This is because the request packets sent from the source cluster would all specify the source VIP address, and not the individual address of the sending node. Thus, the receiving destination cluster member would have no way of responding directly to the node within the source cluster that generated the request. Because there is currently no way for load balancing systems to specify that a response packet belongs to a particular node, the response could be load balanced to the wrong node once received by the source cluster.
The limitations discussed above apply directly to load balancing systems that utilize the “single affinity” mode of operation, in which client requests are distributed according to only a source IP address. However, there also exists a limitation within the “no affinity” mode of operation, particularly in the ability of load balancing systems to properly distribute related connections that come from the same client IP address but from different ports. As an example, consider a scenario in which a client attempts to access a file from an FTP server cluster. FTP connections often involve the downloading or uploading of large files between the client and the server, which could take a considerable amount of time depending on the size of the file. To speed up this process, a client can establish multiple connections to download the file. Some of these related connections could be established through a different port, or pipeline, than other connections, and would therefore carry a different source port number. In this way, the file could be downloaded from the FTP server much more quickly than if it were being accessed through a single pipeline. However, if the FTP server cluster that receives the client request is in the no affinity mode of operation, the FTP connections could end up being load balanced to different nodes within the cluster because of the differing source port numbers of the received packets. Even though the request packets sent by the client would all specify the same client IP address and are all related to the same FTP transaction, the requests having different ports would be treated as separate connections. This problem limits a client's ability to properly access the desired file.
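The FTP scenario above can be illustrated with a deliberately simple model. The hash below is a toy stand-in chosen so the result is easy to verify by hand; it is not the hash used by any real load balancer. The names `no_affinity_node` and `group_by_node`, the client address 198.51.100.23, the port numbers, and the four-node cluster are likewise assumptions made for the example.

```python
from collections import defaultdict


def no_affinity_node(src_ip: str, src_port: int, num_nodes: int) -> int:
    """Toy no-affinity hash over source IP and source port (illustrative only)."""
    octet_sum = sum(int(octet) for octet in src_ip.split("."))
    return (octet_sum + src_port) % num_nodes


def group_by_node(connections, num_nodes: int) -> dict:
    """Map each node index to the source ports of the connections it receives."""
    nodes = defaultdict(list)
    for src_ip, src_port in connections:
        nodes[no_affinity_node(src_ip, src_port, num_nodes)].append(src_port)
    return dict(nodes)


# Three parallel connections from one client for the same FTP transfer,
# differing only in source port (hypothetical values).
ftp_connections = [("198.51.100.23", port) for port in (50000, 50001, 50002)]
placement = group_by_node(ftp_connections, num_nodes=4)
```

With these assumed values, the three related connections land on three different nodes, even though they share one client IP address and belong to one transfer.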