In computer networking, load balancing is commonly used to distribute workload across two or more resources. For example, a group of servers, storage devices, or other resources may collectively provide services to a number of clients. Each client that connects to the group of resources may be initially allocated to one of the resources.
Hashing-based server load balancing is widely employed. Each incoming client may be associated with a key that is mapped, through a hashing function, to the resource that the client is to use. Ideally, a high-performance load balancing scheme should achieve the following goals simultaneously, without trading one off against another: (1) distribute client requests evenly across resources; (2) minimize traffic interruption when a resource is removed or added; (3) handle a weight associated with each resource (e.g., a first server resource may have twice the capacity of a second server resource, and the first server resource should thus be assigned twice the weight of the second); and (4) optimize the hash lookup operation to reduce its cost, as speed is a major concern for many load balancing applications.
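By way of illustration only, the key-to-resource mapping described above may be sketched as a plain hash-modulo scheme. The sketch is an assumption made for this example and is not a technique prescribed by the text; the names `stable_hash`, `pick_resource`, and `remapped_fraction` are hypothetical. It is included because it makes goal (2) concrete: under a modulo mapping, removing a single resource may reassign most clients.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a key.

    Python's built-in hash() is salted per process for strings, so a
    digest is used here to keep the mapping stable across runs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_resource(key: str, resources: list[str]) -> str:
    """Map a client key to a resource by hashing modulo the resource count."""
    return resources[stable_hash(key) % len(resources)]


def remapped_fraction(keys: list[str], before: list[str], after: list[str]) -> float:
    """Fraction of keys whose assigned resource differs between two resource sets."""
    moved = sum(1 for k in keys if pick_resource(k, before) != pick_resource(k, after))
    return moved / len(keys)


if __name__ == "__main__":
    # Hypothetical client keys and server names, for illustration only.
    clients = [f"client-{i}" for i in range(10_000)]
    servers = ["server-0", "server-1", "server-2", "server-3"]
    # Remove the last server and measure how many clients change resource.
    print(remapped_fraction(clients, servers, servers[:-1]))
```

In this sketch, going from four resources to three leaves a client in place only when its hash gives the same index modulo 4 and modulo 3, so roughly three quarters of clients may be expected to move. The sketch also ignores weights, so it does not address goal (3).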
Existing hashing-based load balancing techniques include the known “consistent hashing” and “distributed hashing” techniques. Both techniques may support the dynamic addition and removal of resources. The consistent hashing technique, however, can have a worst-case lookup cost of O(n), where n represents the number of resources or modules. This worst case may be experienced when most of the resources fail. The distributed hashing technique may not support weighted hashing and may have an average lookup cost of O(n) or, in some variations, O(log(n)).
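The worst-case behavior attributed to consistent hashing above may be pictured with a minimal hash-ring sketch. The sketch assumes one particular formulation, chosen for this example and not taken from the text: each resource occupies one point on a ring, failed resources remain on the ring and are skipped, and a lookup walks clockwise until it reaches a live resource. The class `HashRing` and its methods are hypothetical names.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash used for ring positions and client keys."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Illustrative ring with one point per resource (assumed formulation)."""

    def __init__(self, resources: list[str]) -> None:
        # Sort resources by their hash position on the ring.
        self._points = sorted((stable_hash(r), r) for r in resources)
        self._positions = [pos for pos, _ in self._points]
        self._failed: set[str] = set()

    def mark_failed(self, resource: str) -> None:
        """Record a resource as failed; its ring point is kept but skipped."""
        self._failed.add(resource)

    def lookup(self, key: str) -> tuple[str, int]:
        """Return (resource, probes) for a key.

        The starting point is found by binary search; the walk past failed
        resources is linear, so probes can reach n when most resources fail.
        """
        n = len(self._points)
        start = bisect.bisect_right(self._positions, stable_hash(key)) % n
        for step in range(n):
            _, resource = self._points[(start + step) % n]
            if resource not in self._failed:
                return resource, step + 1
        raise LookupError("no live resource on the ring")
```

In this formulation, a failure only reassigns the clients that were mapped to the failed resource, which reflects the support for dynamic removal noted above, while the number of probes per lookup grows as more resources fail. Other formulations of consistent hashing may have different cost characteristics.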