A load balancer distributes load across a collection of processing resources, such as computers configured to perform data processing tasks, communication/networking tasks, and/or data storage tasks. Example loads processed by the processing resources may include service requests (also referred to as “processing requests”) that cause one or more computing tasks to be performed by a processing resource. These service requests can include, by way of example and without limitation, requests to write data (e.g., a social media post, a write to storage), requests to read data (e.g., accessing a social media post, requesting a timeline from a social media service, a read from storage), search requests, compute requests, data download/upload requests, data display requests, and the like. In some example embodiments, the “load” may include a volume of data transferred from/to storage and/or a volume of network traffic.
Load balancing is an important consideration in any processing system, and helps ensure the performance, scalability, and resilience of high-transaction-volume processing systems that have multiple processing resources. When the processing of service requests can be distributed over multiple servers in a system, a load balancer may operate to control the distribution of the service requests across the multiple servers in order to reduce latency and/or increase the proportion of successfully serviced requests.
The various types of processing resources to which the load is distributed are sometimes collectively referred to as “servers” in this disclosure. Various techniques and algorithms have been proposed for load balancing among a set of servers. These techniques include, for example, round-robin load balancing, in which requests are sent to each server in turn, and least-loaded load balancing, in which each request is sent to the server currently carrying the least load.
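By way of illustration and without limitation, the two conventional techniques named above can be sketched as follows. This is a minimal sketch rather than an implementation taken from this disclosure: the class names `RoundRobinBalancer` and `LeastLoadedBalancer`, the `pick` and `release` methods, and the use of outstanding-request counts as the measure of load are all assumptions made for the example.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical balancer that hands out servers in a fixed rotation."""

    def __init__(self, servers):
        # itertools.cycle repeats the server list endlessly, in order.
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastLoadedBalancer:
    """Hypothetical balancer that picks the server with the fewest
    outstanding requests (an assumed measure of load)."""

    def __init__(self, servers):
        self.outstanding = {server: 0 for server in servers}

    def pick(self):
        # Ties are broken in favor of the server listed first.
        server = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[server] += 1
        return server

    def release(self, server):
        # Called when a request sent to `server` has completed.
        self.outstanding[server] -= 1


rr = RoundRobinBalancer(["a", "b", "c"])
print([rr.pick() for _ in range(4)])  # ['a', 'b', 'c', 'a']

ll = LeastLoadedBalancer(["a", "b"])
print(ll.pick(), ll.pick())  # a b
```

In both sketches the client must be able to reach every server in the list, which is the property the following paragraphs address.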
However, when a set of clients uses these conventional load distribution techniques to distribute load to a plurality of servers that perform computing tasks in response to the received load, each client may need to establish and maintain a connection to every server in the set, and the resulting overhead for establishing and maintaining connections between the clients and the servers can be high.
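The scale of this overhead can be illustrated with simple arithmetic. The sketch below assumes that every client holds one connection to every server; the function name `full_mesh_connections` and the deployment sizes are hypothetical and chosen only for the example.

```python
def full_mesh_connections(num_clients: int, num_servers: int) -> int:
    """Connections needed when every client connects to every server."""
    if num_clients < 0 or num_servers < 0:
        raise ValueError("counts must be non-negative")
    return num_clients * num_servers


# Hypothetical deployment sizes, chosen only for illustration.
print(full_mesh_connections(1_000, 500))  # 500000
```

Because the total is the product of the two counts, doubling either the number of clients or the number of servers doubles the number of connections that must be established and maintained.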
“Deterministic subsetting” enables each client to be configured to maintain connections to only a subset (also referred to as an “aperture”) of the servers to which it sends load such as service requests. With deterministic subsetting (“deterministic aperture”) load balancing, a client is not required to establish connections with every server in a large set of servers that services a particular type of service request, and instead is only required to send its load to the smaller number of servers in the subset with which the client establishes connections.
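One simple way a client might derive such a subset is sketched below, by way of example only. The sketch is not the specific method of this disclosure. It assumes that each client knows its own index among the clients, the total number of clients, and the full server list, and it picks a contiguous window of servers on a sorted ring. The function name `deterministic_subset` and all of its parameters are hypothetical.

```python
def deterministic_subset(client_index, num_clients, servers, subset_size):
    """Return the servers a client connects to (hypothetical scheme).

    The same inputs always yield the same subset, so each client can
    compute its subset independently of the other clients.
    """
    if not 0 <= client_index < num_clients:
        raise ValueError("client_index must be in [0, num_clients)")
    ring = sorted(servers)  # every client sees the same ordering
    n = len(ring)
    size = min(subset_size, n)
    # Spread the clients' starting points evenly around the ring.
    start = (client_index * n) // num_clients
    # Take `size` consecutive servers, wrapping around the end of the ring.
    return [ring[(start + j) % n] for j in range(size)]


servers = [f"s{i}" for i in range(8)]
print(deterministic_subset(0, 4, servers, 3))  # ['s0', 's1', 's2']
print(deterministic_subset(3, 4, servers, 3))  # ['s6', 's7', 's0']
```

In this example each of the four clients maintains three connections rather than eight, and the four windows together still cover all eight servers.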