Many modern computing applications are implemented in a distributed and layered fashion, with numerous worker nodes (often forming a back-end layer which is not directly accessible to application users for security and other reasons) responsible for executing the application's business logic. In many cases, the workload level of a distributed application may vary substantially over time, with worker nodes being added or removed as needed based on various scaling policies of the application and/or the network-accessible services being employed by the application. Often, the workload may comprise units of work which can be performed largely independently of one another by individual worker nodes. In some cloud computing environments, hundreds or thousands of worker nodes may be configured for a given application.
For many such applications, front-end load balancers may be set up for receiving work requests and distributing the corresponding work units equitably among the back-end worker nodes. For very large applications, a group of load balancers may sometimes be employed, whose membership may also be adjusted from time to time based on the changing workload trends of the application. In such scenarios, at least some of the worker nodes to which a load balancer assigns work requests may also be the destinations of work requests sent by other load balancers. In order to make intelligent decisions regarding assignment of work units to worker nodes, e.g., so as to avoid overloading any particular worker node and to even out resource utilization levels as much as possible, a load balancer may use information about the current workload levels of the available worker nodes. However, especially in large dynamically-changing distributed environments in which network messages used to convey workload information can be delayed or lost, and in which worker nodes and other application components may join or leave at random times, the workload information available to a given load balancer may often be incomplete or inexact.
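By way of illustration only, the following minimal sketch shows one way a load balancer might select a worker node using workload information that may be out of date. It is not drawn from any particular embodiment. The names `WorkloadReport`, `pick_worker`, and `max_age`, the use of a single numeric workload level, and the fall-back to a random choice when no recent report is available are all assumptions introduced for this example.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class WorkloadReport:
    """Last workload level a load balancer heard from one worker node (hypothetical)."""
    level: float        # assumed metric, e.g. number of outstanding work units
    received_at: float  # time the report arrived, in seconds


def pick_worker(reports: Dict[str, WorkloadReport], now: float,
                max_age: float = 5.0,
                rng: Optional[random.Random] = None) -> str:
    """Return the worker with the lowest recently reported workload level.

    Reports older than max_age seconds are treated as unknown. If no report
    is recent enough, a worker is chosen uniformly at random instead.
    """
    if not reports:
        raise ValueError("no worker nodes are known")
    if rng is None:
        rng = random.Random()
    # Keep only reports that arrived within the last max_age seconds.
    fresh = {w: r for w, r in reports.items() if now - r.received_at <= max_age}
    if not fresh:
        # Every report is stale, so the load balancer has no usable information.
        return rng.choice(sorted(reports))
    return min(fresh, key=lambda w: fresh[w].level)


reports = {
    "worker-a": WorkloadReport(level=12, received_at=100.0),
    "worker-b": WorkloadReport(level=3, received_at=99.0),
    "worker-c": WorkloadReport(level=1, received_at=80.0),  # stale at now=101
}
print(pick_worker(reports, now=101.0))  # worker-b
```

In this example, worker-c reported the lowest level, but its report is too old to be trusted, so worker-b is selected. Because several load balancers may assign work to the same worker node, even a recent report may understate that node's actual workload level.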
While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that embodiments are not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include,” “including,” and “includes” mean including, but not limited to. When used in the claims, the term “or” is used as an inclusive or and not as an exclusive or. For example, the phrase “at least one of x, y, or z” means any one of x, y, and z, as well as any combination thereof.