The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In a client-server environment, clients send requests for services and information to servers located on a network. The servers are often grouped into clusters so that large numbers of clients can access data and services without overloading any single server. Server load balancers are placed between the clients and servers, such that a load balancer receives requests from clients and distributes the requests across a server cluster.
The network shown in FIG. 1 illustrates such an environment. In FIG. 1, Server Load Balancer (SLB) 140 accepts requests sent from clients 162, 164, 166, through network 150, to a “virtual server.” The term “virtual server” is used to describe the Internet Protocol (IP) address to which a client connects in order to access resources and services from a server cluster. A virtual server is a single server instance that acts as a middleman between clients and the actual physical servers in the server cluster represented by the virtual server.
In FIG. 1, server cluster 110 includes a group of “real servers” 112, 114, 116, 118. Real servers are the physical devices that provide the services. The addition of new real servers and the removal or failure of existing real servers in the server cluster can occur at any time without affecting the availability of the virtual server.
SLB 140 selects a particular real server in server cluster 110 for processing a request. Typically, when a client 162, 164 or 166 initiates a connection to the virtual server representing server cluster 110, SLB 140 chooses a real server 112, 114, 116 or 118 for the connection based on a load balancing algorithm.
“Load balancing” refers to techniques used to distribute incoming service requests evenly among multiple servers such that the load distribution is transparent to users. Various load balancing algorithms and techniques can be used to determine which server should handle a given client request. As is known to those skilled in the art, many types of load balancing algorithms can direct traffic to individual servers in a server cluster; for example, a round robin, weighted round robin, or weighted least connections algorithm may be used.
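By way of illustration only, two of the algorithms named above might be sketched as follows. The `RealServer` class, its fields, and both function names are hypothetical and are not taken from any particular load balancer; the sketch assumes that each real server carries a configured weight and a count of currently active connections.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class RealServer:
    # Hypothetical record for one real server in the cluster.
    name: str
    weight: int = 1               # relative capacity; assumed to be >= 1
    active_connections: int = 0   # connections currently assigned


def round_robin(servers):
    """Return an iterator that hands out servers in a fixed rotating order."""
    return cycle(servers)


def weighted_least_connections(servers):
    """Pick the server with the lowest ratio of active connections to weight."""
    return min(servers, key=lambda s: s.active_connections / s.weight)
```

In this sketch, round robin ignores the current load entirely, whereas weighted least connections favors the server that is least busy relative to its configured capacity.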
There are many factors that can be considered by a load balancing algorithm when selecting a real server to process a request. In particular, the “health” of a server can be considered as a factor in the load balancing analysis. For example, if a server is known to be currently unresponsive, it may be taken out of consideration as an available server in the load balancing analysis and subsequent real server selection. To determine the health of a server, out-of-band probes that attempt to connect to specific protocol destination ports have been used. The response (or lack of response) from these probes can be used to determine whether the server is healthy. For example, an Internet Control Message Protocol (ICMP) probe may be sent to determine if a server is responding normally.
Feedback tests, such as out-of-band health probes, can be proactively initiated. These types of probes may be configured to run at specific intervals, for example. If a probe does not receive a response from a server, or receives an abnormal response, the server is considered to have failed, and can be removed from the server cluster and from load balancing consideration.
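As a minimal sketch of such an out-of-band probe, the following assumes a probe that simply attempts a TCP connection to a destination port and treats any connection error or timeout as a failure. The function names, the default timeout, and the representation of servers as (host, port) pairs are assumptions made for illustration; a practical probe might also inspect the content of the response.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def run_probe_round(servers, probe=tcp_probe):
    """Probe each (host, port) pair once and return only those that responded."""
    return [(host, port) for host, port in servers if probe(host, port)]
```

A scheduler could call `run_probe_round` at a configured interval and pass only the returned servers to the load balancing algorithm, so that a server failing its probe drops out of consideration.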
Although configuration of a small number of out-of-band probes should have minimal impact on the performance of the system, such probes do not scale well. If the processor handling the probes is already running close to maximum capacity, a linear increase in the number of probes could lead to a greater-than-linear increase in processing burden. In addition, the speed at which out-of-band probes can detect a failure is bounded by the probe frequency.
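These two limits can be made concrete with a simplified model. The sketch below assumes that every server receives the same number of probes per interval and that a failure is declared after a single unanswered probe; both function names and both assumptions are illustrative only. The model captures only the linear growth in probe count and does not model the greater-than-linear processing burden that may arise on a processor already near capacity.

```python
def probes_per_second(num_servers, probes_per_server, interval_s):
    """Aggregate probe rate the probing processor must sustain."""
    return num_servers * probes_per_server / interval_s


def worst_case_detection_delay(interval_s, timeout_s):
    """Longest time a failure can go unnoticed under this model.

    The worst case is a server that fails immediately after answering a
    probe: the next probe is sent one full interval later and must then
    wait out its timeout before the failure is declared.
    """
    return interval_s + timeout_s
```

For example, under these assumptions, 100 servers with 3 probes each at a 5-second interval require 60 probes per second, and with a 2-second timeout a failure may go undetected for up to 7 seconds. Shortening the interval reduces the detection delay but raises the probe rate proportionally.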
Therefore, there is a need for alternative techniques for monitoring server health that do not rely on out-of-band probes, and do not increase the processing burden on a server cluster.