1. Statement of the Technical Field
The present invention relates to workload distribution in network computing and more particularly to weighted load balancing in a computer communications network.
2. Description of the Related Art
In the prototypical distributed system, content or logic services can be delivered from an origin server to a community of consuming clients. Services and content typically can be delivered according to a request-response paradigm in which the consuming clients initiate a request for services to which one or more origin servers can respond with the requested services. To accommodate a high volume of requests and the workloads resulting from those requests, groups of servers have been clustered together to act as a unified server to external clients. In this regard, any given request could be handled by any of several servers, thereby improving scalability and fault-tolerance.
The decision to route requests to different servers in a server farm can involve a variety of request routing methodologies. In particular, server selection methodologies can be chosen in order to maximize throughput and minimize response latency. For instance, methodologies oriented toward server load balancing monitor server status and direct requests to lightly loaded servers. Notably, load balancing methodologies intend to distribute incoming workloads to achieve a strategic objective such as ensuring the high availability of any one server in a cluster to support subsequent incoming workload requests.
Logic which distributes workloads within a network often does so according to “weights” which are applied to the homogeneous endpoints in a cluster from among which load balancing algorithms select to handle incoming workloads. The weights describe the desirability of one endpoint over another, and often describe the ratio of workloads which should be routed to each endpoint in order to optimally load balance incoming workloads. For example, if a load balancer can select from among three homogeneous endpoints (A, B and C) and the weights assigned to these endpoints are 1, 2, and 3, respectively, then one-sixth of the workloads are to be routed to endpoint A, one-third of the workloads are to be routed to endpoint B, and one-half of the workloads are to be routed to endpoint C.
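By way of illustration only, the relationship between weights and routed fractions in the foregoing example can be sketched as follows. The function name `workload_shares` and the dictionary representation of the endpoints are assumptions introduced for this sketch and are not part of any particular load balancer.

```python
from fractions import Fraction


def workload_shares(weights):
    """Map each endpoint to the fraction of workloads implied by its weight.

    `weights` is a hypothetical mapping of endpoint name to positive integer weight.
    """
    total = sum(weights.values())
    return {name: Fraction(weight, total) for name, weight in weights.items()}


# Weights 1, 2 and 3 for endpoints A, B and C, as in the example above.
shares = workload_shares({"A": 1, "B": 2, "C": 3})
for name, share in shares.items():
    print(name, share)
```

Exact fractions are used rather than floating point values so that the ratios among the endpoints can be compared without rounding.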
The weights of a set of endpoints can be applied by assigning a number of incoming workloads to each different endpoint such that the aggregate assignments satisfy the specified weights for the set of endpoints. To that end, the raw weighting of a set of endpoints can be amplified by a factor such that, proportionally speaking, different endpoints can satisfy a different number of workloads in a portion of the total workloads processed in the load balancer over a period of time. For instance, in a three endpoint cluster in which the raw weights are one, two and three, respectively, and the amplification factor is ten, the first ten workloads can be assigned to the first endpoint, the following twenty workloads can be assigned to the second endpoint and the following thirty workloads can be assigned to the third endpoint.
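The amplified assignment described above can be sketched, under stated assumptions, as a single cycle of consecutive assignments. The function name `amplified_schedule` and the `factor` parameter are hypothetical names chosen for this illustration, and the sketch assumes that the endpoints are visited in a fixed order.

```python
def amplified_schedule(weights, factor):
    """Return the endpoint chosen for each workload in one full cycle.

    Each endpoint receives (weight * factor) consecutive workloads before the
    next endpoint is selected. Both arguments are illustrative assumptions.
    """
    schedule = []
    for name, weight in weights.items():
        schedule.extend([name] * (weight * factor))
    return schedule


# Raw weights 1, 2 and 3 amplified by a factor of ten.
cycle = amplified_schedule({"A": 1, "B": 2, "C": 3}, factor=10)
print(len(cycle), cycle.count("A"), cycle.count("B"), cycle.count("C"))
```

In this sketch, an endpoint late in the order receives no workload until every earlier endpoint has consumed its full amplified share, which is the behavior that motivates the reduction of amplitude discussed next.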
Load balancers often engage in a process known as “normalization”. Normalization involves the reduction in amplitude of the number of workloads assigned to any given endpoint while maintaining the raw weights of the endpoints in a cluster. Normalization can be important where the rate of incoming workloads varies and lags. In this circumstance, one endpoint can be assigned a disproportionate number of workloads because its amplified pro rata share of workloads has not yet been exhausted, leaving the other endpoints in the round-robin rotation without any workload assignments. To remediate this condition, normalization can reduce the amplitude of workloads to be assigned to each endpoint proportionately as between the different endpoints.
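One possible way to reduce amplitude while maintaining the ratios exactly is to divide every assignment count by the greatest common divisor of the counts. The following is a minimal sketch of that approach; it is offered as an assumption for illustration and is not asserted to be the normalization performed by any particular load balancer. The function name `normalize_by_gcd` is hypothetical.

```python
from functools import reduce
from math import gcd


def normalize_by_gcd(counts):
    """Divide every count by the greatest common divisor of all counts.

    `counts` is a hypothetical mapping of endpoint name to a non-negative
    integer number of workloads per cycle. The ratios are preserved exactly.
    """
    divisor = reduce(gcd, counts.values())
    if divisor == 0:  # every count is zero, so there is nothing to reduce
        return dict(counts)
    return {name: count // divisor for name, count in counts.items()}


# Amplified counts of 10, 20 and 30 reduce to the raw weights 1, 2 and 3.
normalized = normalize_by_gcd({"A": 10, "B": 20, "C": 30})
print(normalized)
```

This approach lowers the counts only as far as the common divisor permits. Counts such as 10, 25 and 35, for example, reduce only to 2, 5 and 7.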
When normalizing weights, it can be important to preserve the ratios among the weights as much as possible. Otherwise, workloads will not be optimally distributed to the proper endpoints, thereby thwarting the purpose of the load balancing exercise. It also can be important that the weights are lowered sufficiently so as to permit proper workload distribution when workload requests arrive at a varied and low rate. Nevertheless, modern load balancers cannot both normalize the weights of endpoints in a cluster and preserve optimal weights. Principally, this is because oftentimes the amplitude of workloads to be assigned to one endpoint cannot be reduced proportionately with the amplitudes for the other endpoints while maintaining an integer value for each of the weights. In consequence, modern load balancers must round up or round down, thereby introducing inaccuracies into the optimal weighting scheme.
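The inaccuracy introduced by rounding can be illustrated with a short sketch. The weights 10, 25 and 35, the cap of 3 on the largest weight, and the function names `scale_and_round` and `shares` are all assumptions chosen for this illustration. The sketch scales the weights so that the largest equals the cap, rounds each result to the nearest integer, and then compares the resulting shares with the original shares.

```python
from fractions import Fraction


def shares(weights):
    """Fraction of workloads implied by each endpoint's weight."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


def scale_and_round(weights, max_weight):
    """Scale weights so the largest equals `max_weight`, then round to integers.

    A floor of 1 keeps every endpoint eligible for workloads. This is an
    illustrative reduction, not the method of any particular load balancer.
    """
    largest = max(weights.values())
    scaled = {n: Fraction(w * max_weight, largest) for n, w in weights.items()}
    return {n: max(1, round(s)) for n, s in scaled.items()}


original = {"A": 10, "B": 25, "C": 35}
reduced = scale_and_round(original, max_weight=3)

# Difference between the share after rounding and the intended share.
drift = {n: shares(reduced)[n] - shares(original)[n] for n in original}
print(reduced)
print(drift)
```

In this example the reduced weights are 1, 2 and 3, so endpoint A receives one-sixth of the workloads rather than the intended one-seventh, and endpoint B receives one-third rather than the intended five-fourteenths.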