The approaches described in this section could be pursued but are not necessarily approaches that have previously been conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
In a typical load balancing scenario, a service hosted by a group of servers is front-ended by a load balancer (LB) (also referred to herein as an LB device), which represents this service to clients as a virtual service. Clients needing the service can address their packets to the virtual service using a virtual Internet Protocol (IP) address and a virtual port. For example, www.example.com:80 may be a load-balanced service hosted by a group of servers. An LB can be configured with a virtual IP (VIP), e.g., 100.100.100.1, and a virtual port (VPort), e.g., port 80, which, in turn, are mapped to the IP addresses and port numbers of the servers handling this service. The Domain Name System (DNS) server handling this domain can be configured to resolve the domain name to the VIP associated with this LB, so that clients send their packets to the VIP and VPort.
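The mapping from a VIP and VPort to the servers behind it can be pictured as a simple lookup table. The following is a minimal sketch only, not a description of any particular LB device: the class names, the `lookup_backends` helper, and the backend addresses (10.0.0.11 and 10.0.0.12 on port 8080) are hypothetical and chosen for illustration, while the VIP 100.100.100.1 and port 80 come from the example above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    """An IP address and port pair (hashable, so usable as a dict key)."""
    ip: str
    port: int


@dataclass
class VirtualService:
    """A virtual service and the real servers that host it."""
    name: str
    vip: Endpoint
    servers: list = field(default_factory=list)


# Table keyed by (VIP, VPort). The backend addresses are hypothetical.
virtual_services = {
    Endpoint("100.100.100.1", 80): VirtualService(
        name="www.example.com:80",
        vip=Endpoint("100.100.100.1", 80),
        servers=[Endpoint("10.0.0.11", 8080), Endpoint("10.0.0.12", 8080)],
    )
}


def lookup_backends(dst_ip, dst_port):
    """Return the servers mapped to a destination, or [] if it is not a VIP."""
    service = virtual_services.get(Endpoint(dst_ip, dst_port))
    return list(service.servers) if service else []
```

In this sketch, a packet addressed to 100.100.100.1:80 resolves to the two hypothetical servers, while a packet addressed to any other destination resolves to an empty list.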
The LB inspects each incoming packet, chooses a particular server from the group of servers based on its configured policies or algorithms, modifies the packet if necessary, and forwards the packet towards the server. Optionally, return traffic from the server also passes through the LB, which modifies the packet if necessary and forwards it back towards the client.
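The forward and return paths can be illustrated with a short sketch. It assumes a round-robin selection policy and destination address rewriting, neither of which is mandated by the description above; they merely stand in for "policies/algorithms" and "modify the packet if necessary". The `Packet` and `LoadBalancer` names, the per-client flow table, and the backend addresses are hypothetical.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


VIP = ("100.100.100.1", 80)
SERVERS = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]  # hypothetical backends


class LoadBalancer:
    def __init__(self, vip, servers):
        self.vip = vip
        self._next_server = cycle(servers)  # assumed round-robin policy
        self._flows = {}                    # (client ip, port) -> chosen server

    def forward(self, pkt):
        """Client-to-server path: pick a server and rewrite the destination."""
        if (pkt.dst_ip, pkt.dst_port) != self.vip:
            return None  # not addressed to the virtual service
        client = (pkt.src_ip, pkt.src_port)
        server = self._flows.get(client)
        if server is None:
            # First packet of this flow: choose a server and remember it.
            server = next(self._next_server)
            self._flows[client] = server
        return replace(pkt, dst_ip=server[0], dst_port=server[1])

    def reverse(self, pkt):
        """Optional server-to-client path: restore the VIP as the source."""
        client = (pkt.dst_ip, pkt.dst_port)
        if self._flows.get(client) != (pkt.src_ip, pkt.src_port):
            return None  # no matching flow for this reply
        return replace(pkt, src_ip=self.vip[0], src_port=self.vip[1])
```

The flow table keeps all packets of one client connection on the same server, and the reverse path makes the reply appear to come from the virtual service rather than from the individual server.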
There is often a need to scale the LB service up or down. For example, the LB service may need to be scaled based on the time of day or the day of the week, e.g., days vs. nights or weekdays vs. weekends. As another example, fixed-interval software updates may result in predictable network congestion, and the LB service may therefore need to be scaled up to handle the resulting flash crowd and scaled down subsequently. The growing popularity of a service may also require scaling up. These situations can be handled within a single LB device as long as its performance characteristics can accommodate the required scaling adjustments.
However, in many cases the performance needs to be increased beyond what a single load balancing device can handle. Typical approaches include physical chassis-based solutions, where cards can be inserted and removed to meet the service requirements. These approaches have many disadvantages, including the need to pre-provision space, power, and cost for a chassis to cover future needs. Additionally, a single chassis can only scale up to the maximum capacity of its cards. To cure this deficiency, one can attempt to stack LB devices and send traffic between the devices as needed. However, this approach may also have disadvantages, such as the link between the devices becoming the bottleneck and latency increasing because packets have to traverse multiple LBs to reach the entity that will eventually handle the requests.
Another existing solution is to add multiple LB devices, create an individual VIP on each device for the same backend servers, and use the DNS to distribute the load among them. When another LB needs to be added, another entry is added to the DNS database. When an LB needs to be removed, the corresponding entry is removed from the DNS database. However, this approach has the following issues. First, DNS records are cached, so the addition or removal of an LB may take time to become effective. This is especially problematic when an LB is removed, as data directed to that LB can be lost. Second, the distribution across the LBs is very coarse and not traffic aware, e.g., one LB may be overwhelmed while other LBs are idle, or some clients may be heavier users and end up sending requests to the same LB. Third, the distribution between LBs may not be LB capacity aware, e.g., LB1 may be a much more powerful device than LB2. Thus, each of the existing solutions has its disadvantages.
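The coarseness of DNS-based distribution can be seen in a small worked example. The sketch below assumes the DNS simply rotates through the LB entries, handing the next one to each client in turn; the capacities and per-client loads are hypothetical numbers chosen only to make the imbalance visible, and the function names are illustrative.

```python
from itertools import cycle


def dns_round_robin(lb_names, client_loads):
    """Assign each client to the next LB in rotation.

    The rotation looks at neither the client's load nor the LB's capacity,
    which is the limitation being illustrated.
    """
    rotation = cycle(lb_names)
    totals = {name: 0 for name in lb_names}
    for load in client_loads:
        totals[next(rotation)] += load
    return totals


def utilization(totals, capacities):
    """Offered load divided by capacity for each LB (above 1.0 is overload)."""
    return {name: totals[name] / capacities[name] for name in totals}


# Hypothetical figures in requests per second: LB1 is far more powerful than LB2.
capacities = {"LB1": 1000, "LB2": 100}
# Light and heavy clients happen to alternate, so heavy users all land on LB2.
client_loads = [10, 90, 10, 90]

totals = dns_round_robin(list(capacities), client_loads)
util = utilization(totals, capacities)
```

With these assumed numbers, LB1 receives 20 requests per second against a capacity of 1000, while LB2 receives 180 against a capacity of 100, so the weaker device is overloaded while the stronger one is nearly idle.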