The expansive growth of the Internet has led to a significant transition in the way people communicate and exchange information within our society. Conventional communication tools such as handwritten letters, telephones, and fax machines have been gradually replaced as the primary means of information exchange due to the high availability and popularity of Internet-based tools such as e-mail messaging and the World Wide Web. Today, the Internet is a global system of computer networks connecting millions of users worldwide using a common addressing system and communications protocol known as TCP/IP. People and businesses around the world can use the Internet to retrieve information, correspond with other Internet users, conduct business globally, and access a vast array of online services and resources. Recent reports show that the Internet has more than 200 million users worldwide, and that number is growing rapidly.
Consequently, this incessant growth creates an ever greater need for ways to maximize the user experience. Internet Service Providers (ISPs), search engines, and high-volume websites all have to deal with a growing number of users and rapidly increasing numbers of requests. System administrators grappling with these demands typically respond by purchasing a larger server, but even the most powerful and expensive server can eventually fail or become overloaded. Another option is to create a network server cluster, which consists of a group of servers configured to a common IP address, to handle heavy user traffic. To effectively handle traffic of this nature, it is necessary to employ a methodology known as load balancing to distribute the traffic evenly across the group, or cluster, of commonly addressed machines that the user is trying to access. There are various types of load balancing systems, including hardware-based solutions from vendors such as Coyote Point Systems and Foundry Networks. There are also software-based solutions, such as IBM's eNetwork Dispatcher and Microsoft's Network Load Balancing (NLB), that reside directly on a machine within a network cluster.
To be effective, load balancing must occur within a cluster transparently to the client, and without jeopardizing the client's connection. Conventional load balancing systems utilize various methods, procedures or configuration rules to distribute client traffic effectively throughout the cluster. One such method is known as the Affinity Mode of operation, in which client requests are distributed according to an affinity mode selected by the network administrator of the cluster. In “no affinity” mode, a connection request is distributed amongst the cluster nodes according to the client's source IP address and source port information. In “single affinity” mode, requests are distributed according to only the source IP address. This affinity information is contained within an IP packet that is sent by the client in accordance with the Transmission Control Protocol (TCP) or User Datagram Protocol (UDP). Ownership of a particular IP packet is based on the results of a hash algorithm, in which the affinity information is used to compute which node should handle the request. These current load-balancing schemes enable IP packets to be intelligently distributed to specific nodes within the cluster.
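The two affinity modes described above can be sketched as follows. This is a minimal illustration, not the hash used by any particular product: the function name `owner_node`, its parameters, and the use of CRC-32 as the hash are all assumptions made for the example.

```python
import zlib


def owner_node(src_ip: str, src_port: int, num_nodes: int, affinity: str = "none") -> int:
    """Return the index of the cluster node that owns a packet.

    CRC-32 is a stand-in hash chosen for illustration only; the text
    does not specify which hash algorithm a real cluster uses.
    """
    if affinity == "single":
        key = src_ip                    # source IP address only
    elif affinity == "none":
        key = f"{src_ip}:{src_port}"    # source IP address and source port
    else:
        raise ValueError(f"unknown affinity mode: {affinity!r}")
    # Every node can compute the same result independently, so no
    # central coordinator is needed to decide ownership.
    return zlib.crc32(key.encode("ascii")) % num_nodes
```

In "single affinity" mode, every connection from a given source IP address maps to the same node regardless of port. In "no affinity" mode, connections from the same address can land on different nodes because the source port is part of the hash key.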
However, this intelligence is not without its limitations. Under the present scheme, some load balancing systems are unable to properly load balance client connections that are related to the same client/server transaction, or session, when those connections are managed by a proxy service. This type of service or device intercepts packets transmitted and received by clients that are members of a common network, such as a corporate intranet, and then directs the packets to the appropriate destination or source IP address on behalf of the client. Thus, it is an intermediary device that sits between the client and the server. When a client behind a proxy transmits a packet to a destination IP address, the packet is assigned the IP address of the proxy device as its source IP address. When this packet is received by a load balancing cluster, the cluster performs load balancing according to the source IP address contained within the packet (and optionally the source port). Because the source IP address is that of the proxy, however, the cluster can only identify the proxy IP address and not the address of the client that transmitted the packet. As a result, the cluster is unable to relate the packet to a particular client or transaction. There are two distinct instances in which this problem can arise.
The first instance occurs in situations where multiple client connections are related to a single client session, such as when a client creates multiple connections to perform an e-commerce transaction. In this case, the different connections can end up being managed by different proxies. Even though the connections are related to the same session, each proxy assigns its own IP address to the connections it manages, so related connections end up with different source IP addresses. A destination cluster that receives these connections can erroneously load balance them to different nodes based on the different source IP addresses (single affinity mode), despite the fact that the connections are related.
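A small sketch of this first instance follows. The hash is deliberately a toy (the IPv4 address taken as an integer, modulo the node count) so the outcome is easy to follow by hand. The function name, session label, and proxy addresses are hypothetical.

```python
import ipaddress


def single_affinity_node(src_ip: str, num_nodes: int) -> int:
    """Toy single-affinity hash: the IPv4 address as an integer, modulo node count."""
    return int(ipaddress.IPv4Address(src_ip)) % num_nodes


# Two connections of one client session, each relayed by a different proxy.
# The cluster sees only the proxy address as the source IP address.
connections = [
    {"session": "cart-42", "source_ip": "192.0.2.1"},  # via hypothetical proxy A
    {"session": "cart-42", "source_ip": "192.0.2.2"},  # via hypothetical proxy B
]

nodes = {single_affinity_node(c["source_ip"], 4) for c in connections}
print(sorted(nodes))  # the related connections land on two different nodes
```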
The second instance occurs in situations where a single proxy manages a large number of clients. As before, the proxy intercepts all packets generated by the various clients before they are transmitted to the destination IP address. Once intercepted, the proxy assigns its own IP address as the source IP address of the packet, and then directs the packet accordingly. When the destination IP address that the packet is directed to is that of a load balancing cluster that distributes client traffic according to the source IP address (as in single affinity mode of operation), all requests from the multiple clients are distributed to a single node within the destination cluster, even though the requests may belong to different clients. This is obviously not the desired functionality of a load balancing system, as this causes the single recipient node to become overloaded, and could further result in decreased performance of the entire cluster network system. Ideally, the different clients should be distributed to different nodes within the cluster for faster processing and efficient traffic management.
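The concentration effect in this second instance can be illustrated with a short simulation. It is a sketch under stated assumptions: the toy hash (IPv4 address as an integer, modulo the node count), the client address range, and the proxy address are all made up for the example.

```python
import ipaddress
from collections import Counter


def single_affinity_node(src_ip: str, num_nodes: int) -> int:
    """Toy single-affinity hash: the IPv4 address as an integer, modulo node count."""
    return int(ipaddress.IPv4Address(src_ip)) % num_nodes


NUM_NODES = 4
PROXY_IP = "198.51.100.7"                         # hypothetical proxy address
clients = [f"10.0.0.{i}" for i in range(1, 101)]  # 100 hypothetical clients

# Without a proxy, the cluster sees each client's own address.
direct = Counter(single_affinity_node(ip, NUM_NODES) for ip in clients)

# Behind the proxy, every packet carries the proxy's address instead.
proxied = Counter(single_affinity_node(PROXY_IP, NUM_NODES) for _ in clients)

print("direct :", dict(direct))   # requests spread across all four nodes
print("proxied:", dict(proxied))  # every request lands on one node
```

With these particular inputs, the direct case spreads the 100 clients evenly at 25 per node, while the proxied case sends all 100 to a single node.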
Hardware/firmware load balancing solutions that use a central box as a traffic cop or proxy (central box load balancers, or CBLBs) can deal with the issues stated above because the load balancer in the box can act as an application-level proxy. In other words, CBLBs can determine the session binding of multiple client connections through one or more fields in the session/application layer header of the received packet and then keep these connections together when relaying them to the end server node. The field used to determine the session binding could be a cookie or a URL (Uniform Resource Locator) in the case of HTTP connections, or some other field in the session/application layer header relevant to the particular task initiated during the session. CBLBs thus allow incoming packets to be associated with a particular client session (grouped) before the packet is distributed to the end node.
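As a rough sketch of cookie-based session binding, the class below keeps a table from session identifier to node and consults it before relaying a connection. The class name `SessionBinder`, the cookie name `SESSIONID`, and the round-robin choice for new sessions are assumptions for illustration. They do not describe any vendor's implementation.

```python
from http.cookies import SimpleCookie


class SessionBinder:
    """Keeps all connections that carry the same session cookie on one node."""

    def __init__(self, num_nodes: int, cookie_name: str = "SESSIONID"):
        self.num_nodes = num_nodes
        self.cookie_name = cookie_name   # hypothetical cookie name
        self.bindings = {}               # session id -> node index
        self._next = 0

    def _pick_node(self) -> int:
        # Round-robin selection for new sessions (an assumption).
        node = self._next
        self._next = (self._next + 1) % self.num_nodes
        return node

    def node_for(self, cookie_header: str) -> int:
        """Choose a node from the value of an HTTP Cookie header."""
        cookie = SimpleCookie()
        cookie.load(cookie_header)
        morsel = cookie.get(self.cookie_name)
        if morsel is None:
            return self._pick_node()     # no session information to bind on
        if morsel.value not in self.bindings:
            self.bindings[morsel.value] = self._pick_node()
        return self.bindings[morsel.value]


binder = SessionBinder(num_nodes=3)
first = binder.node_for("SESSIONID=abc")
other = binder.node_for("SESSIONID=xyz")
again = binder.node_for("SESSIONID=abc; theme=dark")
```

Because the binding is looked up from the application-layer field rather than the source IP address, `first` and `again` refer to the same node even if the two connections arrive through different proxies.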
Unfortunately, distributed software load balancers, unlike the central box load balancers, cannot determine the grouping of the connections prior to the connection being formed with an end node. As a result, most software-based load balancing solutions mimic the CBLB by employing a centralized dispatcher model of distribution. U.S. Pat. No. 5,774,660 by Brendel et al. provides a clear example of this model of traffic distribution. As disclosed by the patent, a dedicated node acts as a load balancer or traffic cop that receives all incoming packets to the cluster. The load balancer then determines how the incoming packets are to be distributed, and dispatches the connections to the other nodes within the cluster. This type of operation, however, limits the traffic throughput of the system by introducing an additional node (the dispatcher node) between the client and the desired end node. The dispatcher node is always present to receive incoming client packets, even after the end node is determined and the connection is dispatched. Furthermore, the system disclosed by Brendel et al. requires that each server node within the cluster have a different set of resources. This requirement can cause the load balancing system to suffer performance drawbacks in situations where a resource (e.g., a Web server, custom application, or e-mail server) on one of the server nodes is in high demand. Numerous requests for a particular resource residing on a single node can result in overloading.
Suffice it to say that in distributed software load balancing solutions, there is no convenient means of ensuring that all connections of a session are handled by the same node, or that connections of different sessions are load balanced to different nodes, without incurring the extra overhead of a middleman (e.g., the dispatcher node).