1. Field of the Invention
The present invention relates to load balancing traffic among a plurality of servers, and in particular, to a least connections load balancing method.
2. Background Information
Computer networks typically use file servers which frequently operate under a client-server paradigm. Under this model, multiple clients can make input/output (I/O) requests which are directed to a particular resource on the network. A server on the network receives and carries out the I/O requests. When the server receives multiple I/O requests, the server may choose to service them one at a time. I/O requests which are not being processed typically wait until the server is ready to receive more requests. As a result, the server can become a bottleneck in the network.
Typically, it is desirable to distribute client requests among a plurality of servers. Doing so requires coordination as to how the client requests are to be distributed among those servers, which may be achieved through load balancing. Server load balancing allows a group of real servers (a server farm) to be represented as a single virtual server entity, wherein the traffic is balanced among the plurality of servers. One method of server load balancing is the round-robin method. With this method, the real server for a new client connection is chosen in a circular manner, and the connection is made if the chosen server has capacity to handle it. However, the round-robin method does not ensure that the real servers are in fact effectively load balanced.
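By way of illustration only, the round-robin selection described above might be sketched as follows. The names `Server`, `RoundRobinBalancer`, `capacity`, and `active` are hypothetical and chosen for this sketch, and the capacity check is assumed to be a simple comparison of active connections against a fixed limit.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    capacity: int      # assumed limit on simultaneous connections
    active: int = 0    # currently active connections


class RoundRobinBalancer:
    """Chooses real servers in a circular manner, skipping full servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.next_index = 0  # position at which the next search starts

    def assign(self):
        """Return the server given the new connection, or None if all are full."""
        n = len(self.servers)
        for offset in range(n):
            i = (self.next_index + offset) % n
            server = self.servers[i]
            if server.active < server.capacity:
                server.active += 1
                self.next_index = (i + 1) % n
                return server
        return None
```

In this sketch the selection ignores how heavily each server is already loaded relative to the others, which is why such a scheme does not by itself ensure an even balance.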
Another method for server load balancing is the least connections method, in which a new connection is assigned to the real server with the fewest currently active connections. Compared with the round-robin method, the least connections method balances load among the servers more accurately; however, it is rather complex and consumes a fair amount of processing time on the device that is performing the load balancing. For instance, the method sends a new connection to the server which has the lowest metric, wherein the metric is defined as the number of connections on the server divided by the weight (or capacity) of the server. This metric is kept as a quotient/remainder pair. To keep the quotient and the remainder current, integer division is typically performed for all servers every time a connection is added or removed.
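The bookkeeping described above might be sketched as follows, as an illustration only. All names (`Server`, `refresh_metrics`, `metric_less`, `add_connection`, `remove_connection`) are hypothetical. The way two quotient/remainder pairs are compared when the quotients are equal (cross-multiplying the remainders by the weights) is an assumption of this sketch and is not stated in the description above.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int          # relative capacity; must be a positive integer
    connections: int = 0
    quotient: int = 0    # connections // weight
    remainder: int = 0   # connections % weight


def refresh_metrics(servers):
    """Redo the integer division for every server."""
    for s in servers:
        s.quotient, s.remainder = divmod(s.connections, s.weight)


def metric_less(a, b):
    """True if a's metric (connections / weight) is strictly lower than b's."""
    if a.quotient != b.quotient:
        return a.quotient < b.quotient
    # Assumed tie-break: compare remainder/weight fractions without division.
    return a.remainder * b.weight < b.remainder * a.weight


def pick_least_loaded(servers):
    """Scan the list and return the server with the lowest metric."""
    best = servers[0]
    for s in servers[1:]:
        if metric_less(s, best):
            best = s
    return best


def add_connection(servers):
    chosen = pick_least_loaded(servers)
    chosen.connections += 1
    refresh_metrics(servers)  # division on all servers, on every change
    return chosen


def remove_connection(servers, server):
    server.connections -= 1
    refresh_metrics(servers)
```

The call to `refresh_metrics` on every addition and removal reflects the per-change division cost noted above; a server of weight 3 receives three connections for each connection given to a server of weight 1.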
Separately, it is desirable to flexibly increase the number of real servers as demand for resources at the server farm increases. However, one aspect of this problem is that as greater numbers of real servers are added, the load balancing process slows down, which may require replacing the load balancing device with one that processes at a faster speed. For instance, consider a method in which the load balancing device sequentially tests a list of servers for least connections. One reason for the slowdown may be that as the list of potential servers for connections grows, more time is needed for the device to find the server with the lowest number of active connections.
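The sequential test described above might be sketched as follows, purely as an illustration. The function name `find_least_connections` and the `inspected` counter are hypothetical; the counter is included only to show that the work per new connection grows in proportion to the number of servers in the list.

```python
def find_least_connections(active_counts):
    """Sequentially scan per-server active connection counts.

    Returns (index of the server with the fewest connections,
             number of list entries inspected).
    """
    if not active_counts:
        raise ValueError("at least one server is required")
    best_index = 0
    inspected = 1
    for i in range(1, len(active_counts)):
        inspected += 1
        if active_counts[i] < active_counts[best_index]:
            best_index = i
    return best_index, inspected
```

Under these assumptions, a list of four servers requires four inspections for each new connection, and a list of one thousand servers requires one thousand.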