The invention relates to the field of data packet management. More specifically, the invention relates to regulating the flow of data packets to and from a cluster or group of data servers.
The evolution over the past 20 years of digital communications technology has resulted in the current distributed client-server data networks, the most well known of which is the Internet. In these distributed client-server networks, multiple clients are able to access and share data stored on servers located at various points or nodes through a given network. In the case of the Internet, a client computer is able to access data stored on a server located at any point on the planet, as long as the server is also connected to the Internet.
With the rapid proliferation in use of distributed data networks such as the Internet, more and more clients from around the world are attempting to connect to and extract data stored on a finite number of servers. Those establishing and maintaining the servers containing the desired data, such as web pages from popular web sites, are finding it difficult to ensure that all the clients attempting to access data will be able to do so. A given server can only connect with a finite number of clients at the same time. The number of simultaneous connections a given server can support is a function of the server's computational, storage and communications capabilities. In situations where the number of clients attempting to access data stored on a server exceeds the server's capacity, some clients either will not be able to connect or will be dropped by the server. In other cases where a server is overwhelmed by client requests for data, the server may cease to function altogether.
As a partial solution to the situation described above, server operators will typically deploy multiple mirrored servers, each having data identical to that stored on all the other servers. The mirrored servers are typically connected to the same network and are referred to as a server cluster. In conjunction with the multiple mirrored servers, a load balancer is typically used. When a client attempts to connect to and access data from a server cluster, the client's request is first received by the load balancer which determines which of the servers is best suited to handle the client's request. Various load balancing solutions are commercially available and each uses different techniques and criteria for determining to which server to direct a client's request. However, the common characteristic for each of the currently available solutions is that they all attempt to pick a server which is most capable of responding to the client's request.
An inherent drawback with the load balancing solutions available today is that they all require a device generally referred to as a load balancer. As described above, the load balancer is the first point of contact for each client attempting to access data on the server cluster, and therefore, the maximum rate at which the entire server cluster can receive and respond to client requests is limited by the throughput of the load balancer. It is foreseeable that the number of client requests for data may exceed a load balancer's ability to properly route the requests, and requests will be ignored or dropped despite the fact that the server cluster has sufficient capacity to handle all the requests. Another drawback of the currently available load balancers is that when they malfunction, their entire server cluster becomes inoperative.
U.S. Pat. No. 6,006,259 describes an Internet Protocol (IP) network clustering system that attempts to address the above-mentioned drawbacks by distributing the load balancing responsibilities to all of the servers in the server cluster. However, it does not address the problem of balancing the load between non-uniformly loaded servers in the server cluster. That is, the system considers the available capacity (or the current load) of the server cluster as a group, but not the available capacity of the individual servers in the server cluster. For example, if one server is operating at 90% capacity and another server is operating at 60% capacity, it is desirable that the load balancing system direct more traffic to the more lightly loaded server. The present invention advantageously balances load between the servers in the server cluster by directing traffic based on the available capacity of the individual servers in the server cluster.
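The 90%/60% example can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that new traffic is split in direct proportion to each server's spare capacity, which is one plausible weighting and is not stated in the text. The function name `traffic_shares` and the server labels are hypothetical.

```python
def traffic_shares(loads):
    """Split new traffic in proportion to each server's spare capacity.

    loads: mapping of server name -> current load as a fraction in [0, 1].
    Proportional weighting is an assumption made for illustration.
    """
    spare = {name: 1.0 - load for name, load in loads.items()}
    total = sum(spare.values())
    if total <= 0:
        # No server has spare capacity; fall back to an even split.
        return {name: 1.0 / len(loads) for name in loads}
    return {name: s / total for name, s in spare.items()}


# Server A is at 90% load and server B is at 60% load.
shares = traffic_shares({"A": 0.90, "B": 0.60})
for name, share in shares.items():
    print(f"{name}: {share:.0%} of new connections")
```

Under this assumption the server at 60% load has four times the spare capacity of the server at 90% load, so it would receive roughly four fifths of the new connections.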
Therefore, it is an object of the present invention to overcome the disadvantages of the above-described systems by providing a system and technique for balancing or distributing load between the servers in a server cluster. The present invention is a network load balancing system which is highly scalable and optimizes packet throughput by dynamically distributing the load between the servers in a server cluster.
The present invention includes a server cluster comprising a plurality of servers, with all servers having the same network address, and each server having a load balancing module to generate a connection value for each connection request. A particular server in the server cluster accepts and processes the network connection request based on the connection value. That is, each server is associated with a non-overlapping range of connection values and accepts only connection requests having connection values within that range. The range associated with each server is dynamically adjusted based on the available capacity of each server in the server cluster to thereby dynamically balance the load between the servers.
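The range-based acceptance rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the text does not say how the connection value is computed, so hashing the client address and port is an assumption, as are the size of the value space (`VALUE_SPACE`), the proportional sizing of ranges, and all function names.

```python
import hashlib

VALUE_SPACE = 1024  # hypothetical size of the connection-value space


def connection_value(src_ip, src_port):
    """Map a connection request to a value in [0, VALUE_SPACE).

    Hashing the client address and port is an assumption; the text only
    says that each server generates a connection value per request.
    """
    key = f"{src_ip}:{src_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % VALUE_SPACE


def assign_ranges(capacities):
    """Give each server a half-open range [start, end) sized by capacity.

    capacities: mapping of server name -> non-negative available capacity.
    The ranges are contiguous, so they never overlap and together cover
    the whole value space.
    """
    names = sorted(capacities)  # every server derives the same order
    total = sum(capacities[name] for name in names)
    if total <= 0:
        raise ValueError("at least one server must have spare capacity")
    ranges = {}
    start = 0
    cumulative = 0.0
    for index, name in enumerate(names):
        cumulative += capacities[name]
        if index == len(names) - 1:
            end = VALUE_SPACE  # last range always closes the space
        else:
            end = round(VALUE_SPACE * cumulative / total)
        ranges[name] = (start, end)
        start = end
    return ranges


def accepts(server, ranges, value):
    """Return True if this server's range contains the connection value."""
    start, end = ranges[server]
    return start <= value < end


# Every server sees the same request and computes the same value,
# but only the server whose range contains the value accepts it.
ranges = assign_ranges({"A": 10, "B": 40})
value = connection_value("192.0.2.7", 51515)
acceptors = [name for name in ranges if accepts(name, ranges, value)]
print(ranges, value, acceptors)
```

Calling `assign_ranges` again with updated capacities yields new range boundaries, which corresponds to the dynamic adjustment described above. Because each server evaluates the same deterministic function on the same request, no central device has to choose the accepting server.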
The servers are connected to the network in parallel such that each server receives every connection request, such as a synchronizing segment or packet (SYN) for a Transmission Control Protocol/Internet Protocol (TCP/IP) connection, at substantially the same time. The SYN packet is the first segment or packet sent by TCP and is used to synchronize the two ends of a connection in preparation for opening the connection. The load balancing modules on the servers communicate with each other to determine each server's relative ability to accept a new connection (i.e., available capacity).
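For illustration, a load balancing module could recognize a connection request by inspecting the flag bits of the TCP header, where an initial SYN has the SYN bit set and the ACK bit clear. The sketch below assumes the module is handed the raw TCP header bytes; the function names are hypothetical, and `build_header` exists only to produce demonstration input.

```python
import struct

TCP_FLAG_SYN = 0x02  # SYN bit in the TCP flags byte
TCP_FLAG_ACK = 0x10  # ACK bit in the TCP flags byte


def is_connection_request(tcp_header):
    """Return True if the header is an initial SYN (SYN set, ACK clear)."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    flags = tcp_header[13]  # flags occupy the 14th byte of the header
    return bool(flags & TCP_FLAG_SYN) and not (flags & TCP_FLAG_ACK)


def build_header(src_port, dst_port, flags):
    """Build a minimal 20-byte TCP header for demonstration only."""
    data_offset = 5 << 4  # header length of 5 words, no options
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,
        0,            # sequence number
        0,            # acknowledgment number
        data_offset,
        flags,
        65535,        # window size
        0,            # checksum (not computed in this sketch)
        0,            # urgent pointer
    )


syn = build_header(51515, 80, TCP_FLAG_SYN)
syn_ack = build_header(80, 51515, TCP_FLAG_SYN | TCP_FLAG_ACK)
print(is_connection_request(syn), is_connection_request(syn_ack))
```

Only segments that pass this check would need a connection value computed for them; segments belonging to connections that are already established could be handled by whichever server accepted the original SYN.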
Various other objects of the present invention will become readily apparent from the ensuing detailed description of the drawings.