In communication networks, such as the Internet, a stateful scale-out network service (SSNS) refers to a service that can be scaled by adding or removing individual service instances to meet changes in traffic demand. For example, a network service may be implemented by multiple servers and the service load may be distributed among the multiple servers. When demand for the service increases, a new server can be added to handle the increased demand without having to reconfigure existing servers. Similarly, when demand decreases, servers can be removed or re-allocated to other uses without reconfiguring the remaining servers. Such events are referred to as “scaling events.”
Load balancing is used to distribute service loads across multiple service instances as evenly as possible. The load balancing mechanism should be deterministic, meaning that it directs all packets belonging to the same network flow to the same service instance. The load balancing mechanism also needs to adapt to scaling events. When a scaling event occurs, the load balancing mechanism should adapt its distribution rules to redistribute the load.
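The deterministic property described above can be illustrated with a minimal sketch. It assumes a flow is identified by its IP 5-tuple and that instances are numbered from 0; the function name `pick_instance` and the use of SHA-256 are illustrative choices, not part of any approach described here.

```python
import hashlib

def pick_instance(flow, num_instances):
    """Map a flow (hypothetical 5-tuple) to an instance index.

    The result depends only on the flow fields and the instance count,
    so every packet of the same flow is sent to the same instance.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_instances

# Two packets of the same flow (same 5-tuple) are mapped identically.
flow = ("10.0.0.1", "192.0.2.10", 40000, 80, "tcp")
first = pick_instance(flow, 4)
second = pick_instance(flow, 4)
print(first == second)
```

Note that the mapping is deterministic only for a fixed instance count; the sketch says nothing about what happens when that count changes.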
In many cases the service instances are stateful and independent, meaning that each service instance maintains some form of state information for each individual network flow, and that this state information is not shared between different service instances. When distributing traffic for stateful services, the load balancing mechanism should ensure that packets belonging to network flows that were active prior to the scaling event will still be forwarded to the same service instance after the scaling event. This forwarding property is referred to as “flow affinity.”
Dedicated load balancing appliances offered by some companies are able to maintain flow affinity while being transparent to end users (i.e., without DNS redirection), but they achieve this goal by maintaining state information for every flow. Because many of these appliances act as proxies, the amount of per-flow state information that must be maintained can be substantial. Due to the need to maintain a large amount of state information for every flow, load balancing appliances either fail to scale to large numbers of flows, or they require a large amount of memory and processing power, making them expensive to implement.
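The trade-off of per-flow state can be sketched as follows. This is a simplified model, not the behavior of any particular appliance: the class `PerFlowBalancer`, its round-robin assignment of new flows, and the instance names are all assumptions made for illustration.

```python
class PerFlowBalancer:
    """Hypothetical balancer that remembers the instance chosen for every flow."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.flow_table = {}   # one entry per flow: this is the per-flow state
        self.next_index = 0

    def forward(self, flow):
        # Existing flows reuse their recorded instance (flow affinity).
        if flow not in self.flow_table:
            # New flows are assigned round-robin (an illustrative policy).
            chosen = self.instances[self.next_index % len(self.instances)]
            self.flow_table[flow] = chosen
            self.next_index += 1
        return self.flow_table[flow]

    def add_instance(self, name):
        # Scaling event: only flows seen for the first time afterwards
        # can be assigned to the new instance.
        self.instances.append(name)

lb = PerFlowBalancer(["s0", "s1"])
flows = [("10.0.0.1", "192.0.2.10", 10000 + i, 80, "tcp") for i in range(1000)]
before = {f: lb.forward(f) for f in flows}
lb.add_instance("s2")
after = {f: lb.forward(f) for f in flows}
print(before == after, len(lb.flow_table))
```

Affinity survives the scaling event, but the table holds one entry per flow, so memory grows linearly with the number of flows.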
Another load balancing approach is described in “OpenFlow-Based Server Load Balancing Gone Wild.” This approach, based on the OpenFlow protocol, requires that per-flow state information be maintained only during a predetermined period of time after a scaling event. However, this technique requires an understanding of network protocol semantics to differentiate new flows from existing ones in order to maintain flow affinity after scaling events. It also does not scale easily, because it relies on a centralized controller to examine the first packet of every flow.
A third load balancing technique for stateful services is to distribute flows using hash-based techniques as described in RFC 2992 and used in Equal Cost Multi-Path (ECMP) routing. This approach provides deterministic scale-out but does not maintain flow affinity in response to scaling events. A related approach is to use wildcard rules based on the IP 5-tuple (source IP address, destination IP address, source port, destination port, protocol) and to divide the tuple space into N regions, where N is the number of service instances. Other subsets of the tuple can be used to define the space (for example, source and destination IP address, or source IP address alone). However, this approach also fails to maintain flow affinity in response to scaling events.
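The loss of flow affinity under hash-based distribution can be seen in a small experiment. The sketch below uses plain modulo-N hashing with CRC32, which is a simplification chosen for illustration and is not necessarily the exact scheme analyzed in RFC 2992; the function name `hash_mod` and the synthetic flows are assumptions.

```python
import zlib

def hash_mod(flow, num_instances):
    """Stateless mapping: hash the flow fields, then take the result modulo N."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_instances

# Synthetic flows that differ only in source port.
flows = [("10.0.0.1", "192.0.2.10", 10000 + i, 80, "tcp") for i in range(1000)]

before = {f: hash_mod(f, 4) for f in flows}   # four instances
after = {f: hash_mod(f, 5) for f in flows}    # scaling event: a fifth is added

# Count the ongoing flows that would now reach a different instance.
moved = sum(1 for f in flows if before[f] != after[f])
print(f"{moved} of {len(flows)} flows changed instance")
```

A flow keeps its instance only when its hash yields the same remainder modulo 4 and modulo 5, which for a uniformly distributed hash is expected for roughly one flow in five. The remaining flows arrive at an instance that holds no state for them.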
U.S. Pat. No. 7,877,515 B2, titled “Identity Assignment for Software Components,” describes different ways of assigning, distributing, and re-distributing functional (logical) addresses to service instances, but does not address the flow affinity problem. According to the '515 patent, all flows, new and existing, are distributed across service instances based on logical addresses, which can be re-distributed during a scaling event. If flow affinity needs to be maintained, it is up to the service instances themselves to redirect ongoing flows to their original service instance.
Accordingly, there remains a need for a new approach to load balancing that maintains flow affinity after a scaling event without the need to maintain state information for each network flow.