Servers perform various tasks for clients, such as furnishing web pages, hosting content, storing data and hosting applications. It was known to arrange two or more similar servers in a cluster for redundancy; in case one server failed, there would still be at least one other server to handle the work requests of the clients. It was also known to arrange similar servers in a cluster to increase the overall computing resources available to the clients. In both cases, a load balancing computer, separate from the servers in the cluster, determined the server to which each new client request was routed. The load balancing computer was located between the clients and the cluster of servers. Different load balancing algorithms are currently known. These include a simple round robin algorithm in which the different servers in the cluster are sequentially assigned work requests as they are received by the load balancer. Other load balancing algorithms are more sophisticated and consider the length of the current work queue of each server and/or the current response time of each server. In these more sophisticated algorithms, new client requests are routed to the server with the shortest current work queue or shortest current response time. In those clusters where the servers are not all identical, a known load balancer may consider the type of client request and which server has the best configuration or resources to handle that type of request.
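By way of illustration only, the two families of known algorithms described above can be sketched as follows. The function names, server names and queue lengths are hypothetical and are not taken from any particular load balancer; the sketch assumes the load balancer already holds a current queue length for each server.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in fixed rotation."""
    return cycle(servers)


def shortest_queue(queue_lengths):
    """Return the server whose current work queue is shortest.

    queue_lengths maps a (hypothetical) server name to its queue length.
    """
    return min(queue_lengths, key=queue_lengths.get)


# Round robin: each new request goes to the next server in sequence.
rotation = round_robin(["server_a", "server_b", "server_c"])
first_four = [next(rotation) for _ in range(4)]

# Shortest queue: the request goes to the least-loaded server.
chosen = shortest_queue({"server_a": 7, "server_b": 2, "server_c": 5})
```

In this sketch the round robin selection ignores server state entirely, whereas the shortest-queue selection depends on how recently the queue lengths were reported to the load balancer.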
It was also known for the servers in the cluster to periodically send to the load balancer a “hello” message or “heartbeat” to indicate that the server is still running and not severely overloaded. If the load balancer does not receive the hello message within a predetermined time interval (or “time-out”) of the previous hello message, then the load balancer may assume that the server is “down” and then remove the server from the cluster. In such a case, the load balancer will not assign any new client requests to this server until the server reestablishes its viability with the load balancer.
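The hello message and time-out behavior described above may be sketched as follows. This is a minimal illustration under stated assumptions: the class name `HeartbeatMonitor`, its methods, and the use of caller-supplied timestamps in seconds are all hypothetical, and a real load balancer would typically read a clock rather than accept the time as an argument.

```python
class HeartbeatMonitor:
    """Tracks the last hello message per server (hypothetical sketch)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_hello = {}  # server name -> time of last hello message

    def record_hello(self, server, now):
        """Note that a hello message arrived from `server` at time `now`."""
        self.last_hello[server] = now

    def active_servers(self, now):
        """Return servers whose last hello is within the time-out."""
        return sorted(
            server
            for server, seen in self.last_hello.items()
            if now - seen <= self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=10)
monitor.record_hello("server_a", now=0)
monitor.record_hello("server_b", now=0)
monitor.record_hello("server_a", now=8)

# server_b has been silent for 12 seconds and is treated as down.
after_timeout = monitor.active_servers(now=12)

# server_b sends a new hello and reestablishes its viability.
monitor.record_hello("server_b", now=13)
after_recovery = monitor.active_servers(now=14)
```

A server that misses the time-out is simply omitted from the set eligible for new client requests, and it rejoins that set once a later hello message is recorded.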
It was also known for the load balancer to monitor the work load of the cluster and of individual servers in the cluster. The work load was measured by the number of message packets received in the client requests over a time interval. The length of the message packets in the client requests is loosely correlated to the work required to handle the request. If the work load exceeded a threshold for the cluster or for any server in the cluster, then the load balancer could request help from additional resources on the network by sending the request for help to servers that were configured to lend available resources. A successful request for additional resources would start the process defined in the Join-Request Process.
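The threshold check described above may be sketched as follows. The function name, the per-second rate units and the two limit parameters are assumptions made for illustration; the text specifies only that packets are counted over a time interval and compared against a threshold for the cluster or for any server. The sketch decides whether help should be requested and does not model the request itself.

```python
def needs_help(packet_counts, interval_seconds, server_limit, cluster_limit):
    """Decide whether additional resources should be requested.

    packet_counts maps a (hypothetical) server name to the number of
    message packets received during the interval. Both limits are
    expressed in packets per second (an assumed unit).
    Returns (help_needed, overloaded_servers).
    """
    rates = {
        server: count / interval_seconds
        for server, count in packet_counts.items()
    }
    overloaded = sorted(s for s, rate in rates.items() if rate > server_limit)
    cluster_overloaded = sum(rates.values()) > cluster_limit
    return cluster_overloaded or bool(overloaded), overloaded


# server_a handles 60 packets/s, above the per-server limit of 50,
# while the cluster total of 70 packets/s stays under its limit of 100.
help_needed, overloaded = needs_help(
    {"server_a": 600, "server_b": 100},
    interval_seconds=10,
    server_limit=50,
    cluster_limit=100,
)
```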
Currently, the parameters defining the clustering, i.e., which servers are in the cluster to receive and handle client requests, what load balancing algorithm to use for the cluster, what hello message interval to use for each server, and what time-out to use for each server, are specified in the configuration of the load balancer. Some of these parameters are default parameters of the load balancing program, and others are input by an administrator of the load balancer. (As noted above, however, if a load balancer was initially configured to include a certain server as part of the cluster, and this server goes down, then the load balancer will remove this server from the cluster until such time as the server reestablishes its viability to the load balancer.) While the foregoing technique to define the parameters was effective, it relied too heavily on the knowledge and foresight of the administrator of the load balancer. Also, in some cases, it was not sufficiently dynamic to account for changes in circumstances.
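For illustration, the statically configured parameters enumerated above might be represented as follows. The class name `ClusterConfig`, its field names, the default algorithm and all values are hypothetical; the sketch shows only that every parameter is fixed in the load balancer's configuration, either by default or by the administrator.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterConfig:
    """Static cluster parameters held by the load balancer (hypothetical)."""

    servers: list                       # servers eligible for client requests
    algorithm: str = "round_robin"      # program default unless overridden
    hello_interval_seconds: dict = field(default_factory=dict)  # per server
    timeout_seconds: dict = field(default_factory=dict)         # per server


# Values entered by an administrator; nothing here adapts at run time.
config = ClusterConfig(
    servers=["server_a", "server_b"],
    hello_interval_seconds={"server_a": 5, "server_b": 5},
    timeout_seconds={"server_a": 15, "server_b": 15},
)
```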
Therefore, an object of the present invention is to improve the process of defining parameters of a cluster of servers.