A Client is defined as any combination of hardware and software (e.g., including operating systems and client applications) that is capable of accessing services over a network connection.
A Server is defined as any combination of hardware and software (e.g., operating systems and server applications) that is capable of providing services to clients.
A Blade is defined herein as any combination of hardware and software (e.g., operating systems and client and/or server applications software) which is capable of acting not only as a server but also as a client. A Blade Server is an instance of a server on a blade, whereas a Blade Client is an instance of a client on a blade. A blade can both be a client and a server at the same time. The terms blade server and server may be used interchangeably herein.
A Blade ID is a unique value that identifies a blade among other blades.
A Load Balancer is a network device that receives requests/packets coming from clients and distributes the requests/packets among the blade servers.
Server-Side Load Balancing is a technology whereby service requests are distributed among a pool of blade servers in a relatively transparent manner. Server-side load balancing may introduce advantages such as scalability, increased performance, and/or increased availability (e.g., in the event of a failure or failures).
As shown in FIG. 1, a system (i.e., Load Balancing Site) may include one or more load balancers and multiple blades, with the load balancer(s) providing coupling between the blades and outside clients/servers through a network (such as the Internet). The pool of blades may seem to be a single virtual server or client to the outside world (e.g., multiple blades at a load balancing site may use a same IP address for outside communications). Multiple load balancers (e.g., two in FIG. 1) can be used to provide resiliency/redundancy such that when one of the load balancers fail, the other load balancer takes over server load balancing. As discussed herein, a load balancing site may be realized in different ways. For example, a load balancing site can be a single network node (on a single or on multiple chassis) where the load balancers are realized using line cards of the network node, and the blade servers are realized using other service/server cards of the same network node. In this case, connections between the load balancers and the blades can be realized using a backplane of the network node. Alternatively, a load balancing site may be realized using multiple network nodes where a load balancer can be realized by a separate network node and may or may not be co-located with the blades/servers.
An Outside Node (e.g., an outside server and/or client) is defined as a network node which is located outside the load balancing site. An outside node can be a client requesting a service from one of the blade servers, or an outside node can be an outside server which is serving a blade client inside the load balancing site.
As used herein, a data flow is defined as network traffic made of up data packets and transmitted between a client and a server that can be identified by a set of attributes. Sample attributes may include 5 tuple parameters (e.g., Src/Dest IP addresses, Protocol, Src/Dest TCP/UDP Port), Src/Dest (Source/Destination) Mac address, or any other set of bits in the data packets (e.g., PCP bits, VLAN IDs, etc) of the data flow, or simply source and destination nodes of the network traffic. For example, over a certain link (e.g., from node a to node b) in a network, a packet passing through with a specific source IP address (e.g., IP1) is part of a flow identified by the source IP address over that link with the attributes (IP1, a, b). As another example, in an access network, traffic originated from a subscriber can also be considered as a flow where that flow can be identified as the traffic passing through a UNI/NNI/ANI port of a RG (Residential Gateway). Such subscriber flows in access and edge networks can be also identified by subscriber IP addresses. Upstream/downstream subscriber flow (e.g., flow from the subscriber/network side to the network side/subscriber) may have the IP address of the subscriber as the source/destination IP address respectively.
A flow ID is an ID or tag used to identify a flow. For example, the set of attributes used to identify a flow may be mapped to natural numbers to construct Flow IDs. Also, a Flow ID uniquely identifies a flow.
An incoming Flow is network traffic that enters the Load Balancing site and that originated from outside the Load Balancing site. An incoming flow includes not only data traffic that is destined to the load balancing site to be terminated at the load balancing site but also data traffic that is to be forwarded by the load balancing site after corresponding processing. An Incoming Packet is a packet belonging to an incoming flow.
Outgoing Flow is the network traffic that is about to leave the load balancing site. The outgoing flow includes not only the network traffic that is originated by the load balancing site (e.g., by the blade servers/clients) but also network traffic (originated by an outside node) that is forwarded by the load balancing site (after further processing at load balancer and/or the blade servers) to another location. An outgoing packet is a packet belonging to an outgoing flow.
Granularity of a flow refers to the extent to which a larger (coarse-grained) flow is divided into smaller (finer-grained) sub-flows. For example, an aggregate flow passing thorough a link (from node a to node b) with multiple destination IP addresses may have a coarser granularity than a sub-flow passing through the same link with a certain destination IP address. The former flow can be referred to as link flow and the latter flow can be referred to as link, destination IP flow.
A flow can be made up of many flows. Accordingly, the Flow ID that can be derived from a packet of an arbitrary incoming flow at the load balancing site may be a random variable. A probability distribution of the Flow ID may depend on what and how the packet header fields are associated with the Flow ID. (The header fields of an incoming packet that include the Flow ID can also be random variables, and mapping between the header fields and the Flow ID may govern the probability distribution of the Flow ID of incoming packets). For example, assuming that the Flow ID simply depends on a respective source IP address, then the probability distribution of the Flow ID of incoming packets will depend on factors such as how a DHCP server allocates the IP addresses, demographic distribution in case of correlation between geography and IP addresses, etc.
A connection is an example of a flow that can be identified using 5-tuple parameters (Src/Dest IP address, Protocol, Src/Dest TCP/UDP Port). TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) connections can be considered as an example. As used herein, Src means source, Dest means destination, PCP means Priority Code Point, UNI means User Network Interface, NNI means Network Network Interface, ANI means Application Network Interface, and HA means High Availability.
A Type-1 Flow is a type of flow for which it is possible to detect the start of the flow or the first data packet of the flow by considering only the bits in the first packet of the flow (without consulting other information). For example, an initial data packet of a connection can be identified using a SYN (sequence) flag in TCP packets and an INIT (initial) flag in SCTP (Stream Control Transmission Protocol) packets. Many connection oriented protocols have a way of telling the server about the start of a new connection. For example, a subscriber level flow can be identified by subscriber IP address (e.g., Source/Destination IP address of the upstream/downstream traffic). In such a case, a RADIUS start request or DHCP (Dynamic Host Configuration Protocol) request may indicate the start of the subscriber level flow. Because the Flow ID (identification) is based on the source IP address, a new flow for a subscriber can be detected by sensing the RADIUS packet or DHCP packet which is generated to establish the subscriber session.
A Type-2 Flow is a flow that may be defined arbitrarily such that it may be difficult to determine initial packets of a flow by considering only packet headers.
Load Balancer Traffic Management
When client sends a request to a load balancing site, a load balancer of the load balancing site forwards the request to one of the available blade servers. Once the data flow is established, the load balancer is in charge of distributing subsequent data packets of the data flow to the appropriate blade server(s). In this case, the blade server may be the flow/connection end point where, for example, the corresponding TCP connection has an end point at the blade server.
In an alternative, one of the blade clients in the load balancing site may initiate a data flow to an outside node. In this latter case, load balancer may still be responsible for forwarding all the response packets of the connection to the original blade client.
In addition, an outside client node can originate/initiate a connection which is destined to an outside node but which needs to traverse the load balancing site for further processing. As an example, subscriber management nodes and/or nodes/sites for deep packet inspection can be considered. In such scenarios, it is possible that certain flows may need to be associated with specific blade servers so that the processing can be performed consistently. In other words, it is possible that all the data packets of some flows may need to travel to the same blade server during the life time of the flow.
In summary, regardless of the origin of a data flow, the traffic of the data flow may need to be forwarded by the load balancer in a convenient fashion.
Flow Aware Server Load Balancing: Maintaining the Flow Stickiness
In flow level load balancing, the load balancer first allocates a new flow to a blade. That is, the initial data packet of an incoming data flow (e.g., SYN packet of a TCP connection, INIT packet of an SCTP connection) is forwarded to an available blade server with respect to a scheduling mechanism (e.g., weighted round robin, etc). All of the subsequent data packets associated with the flow are then processed by the same blade. In other words, the flow ‘stickiness’ to a particular blade should be maintained by the load balancer.
Most transport protocols such as TCP and SCTP may require connection level load balancing such that data packets belonging to a same connection are handled by a same blade server. On the other hand, UDP can sometimes cope with packet level load balancing where each individual packet can be handled by a different blade server.
A requirement/goal of sending subsequent packets associated with a data flow to the previously assigned blade server may make load balancing more challenging. Such Load Balancing may be referred to as Flow Aware, Session-Aware, and/or Connection-Aware Load Balancing.
Load Balancers
As discussed in greater detail below, requirements/goals of load balancers may include: flexible, deterministic, and dynamic load distribution; hitless support of removal and addition of servers; simplicity; support for all traffic types; and/or Load Balancer HA.
An ideal load balancer may distribute the incoming traffic to the servers in a flexible, deterministic and dynamic manner. Determinism refers to the fact that the load on each server can be kept at a targeted/required level (throughput this specification uniformity and deterministic load balancing are used interchangeably) whereas dynamicity refers to the fact that as load indicators over the servers change over time, the load balancer should be able to dynamically change the load distribution accordingly to keep the targeted/required load levels (e.g., a lifetime of each data flow/connection may be arbitrary so that load distributions may change).
Flexibility in load distribution refers to the granularity of load balancing such that the load balancer should support per flow load balancing where the flow can be defined in a flexible manner such as 5-tuple connection level (e.g., relatively fine granular load balancing) or only source IP flows (e.g., coarser granular load balancing). Flow-level load balancing refers to the fact that all data packets of the same data flow may need to be sent over the same server/blade during the lifetime of a connection (this may also be called as flow-aware load balancing preserving the flow stickiness to the servers/blades).
Hitless Support of Removal and Addition of Servers: The load balancer should be able to support dynamicity of resources (e.g., servers) without service disruption to the existing flows. Dynamicity refers to planned/unplanned additions and/or removals of blades. That is, whenever the number of active blades changes, the load balancer should incorporate this change in a manner to reduce disruptions in existing data flows. (Support for dynamicity of resources may be a relatively trivial requirement as most of server pools operate in a high availability configuration. Moreover, graceful shutdowns as well as rolling upgrades may require planned removal(s)/restart(s) of blades/servers).
The Load Balancer should be as simple as possible. Simplicity is a goal/requirement which, for example, may provide TTM (Time To Market) and/or cost effectiveness advantages.
As a goal, the load balancer should support all kinds of traffic/protocols (i.e., the load balancer may be Traffic-type agnostic).
Load Balancer HA: In cases where load balancer level redundancy is provided, it may be desirable that no state/data replication is required on the back up load balancer for each flow, and the switch over to back up should take place rapidly and/or almost immediately in the event that the primary load balancer fails.
Table-based flow level server load balancing is considered as a stateful mechanism such that the scheduling decision of each data flow is maintained as a state in the load balancer.
Table-based flow level load balancing is a stateful approach which uses a look-up table at the load balancer to record previous load balancing decisions so that subsequent packets of an existing data flow follow a same blade server assignment decision. Accordingly, the load balancer may need to keep the state of each active flow in the form of a Flow ID (e.g., 5 tuple parameters for connections) to Blade ID (e.g., IP address of the blade) mapping table. The first packet of the flow is scheduled/assigned to a blade server by the load balancer with respect to a scheduling algorithm.
As shown in FIG. 2, a first packet of a new data flow (e.g., including SYN for TCP or including INIT for SCTP) is scheduled to an available blade server, the state of the connection is recorded in the look-up table of the load balancer, and the packet is sent to the scheduled blade server. For the subsequent data packets of the same connection, the scheduled blade ID is identified in the look-up table, and the data packet is sent to the identified blade server.
Using load balancing operations of FIG. 2, the load balancer communicates with the blade servers regarding their availability for server load balancing.
Stateful Load Balancing with Scheduling Offload
As shown in FIG. 3, the load balancer (LB) may only keep the Flow ID to Blade ID mapping table, with intelligence/control being provided at a separate controller. The scheme of FIG. 3 basically offloads scheduling tasks from the load balancer to the controller to allow potentially more sophisticated scheduling schemes to be used, depending on a capacity of the controller.
As a data packet arrives at the load balancer, the load balancer extracts the Flow ID and performs a table look up operation. If the load balancer finds a match with the Flow ID in the table, the data packet is forwarded to the indicated Blade (having the Blade ID matching the Flow ID as stored in the mapping table). If the data packet is from type 1 data flow, then the load balancer can use the packet header to identify the data packet as an initial data packet of a new flow without using the mapping table. Otherwise, if no match is found for the Flow ID in the mapping table, the load balancer can determine that the data packet is an initial packet belonging to a new data flow.
The load balancer then either sends the new packet to the controller or otherwise communicates with the controller regarding this new data flow. Responsive to this communication from the load balancer, the controller instructs the load balancer to add a Flow ID to Blade ID mapping entry to the mapping table, and the controller forwards the data packet to the corresponding Blade. Hereafter, all data packets belonging to the data flow will be forwarded by the load balancer to the corresponding Blade because the table now has a mapping entry for the data flow.
Accordingly, the controller is responsible for communicating with the blade servers for their availability and load, and the controller performs the scheduling and updates the load balancer mapping table. The load balancer in this case may be a dumb device which only receives the commands from the controller to add/delete/modify the mapping entries in the mapping table and perform data packet forwarding according to the mapping table.
Stateless Load Balancing
For stateless load balancing algorithms/operations, there may be no need to maintain any sort of state. By considering only a packet header, a scheduling decision can be made whether it is the first packet of the flow or not. In other words, no state is kept with respect to the Flow and Blade IDs to make the scheduling decision of the later packets of a flow.
Static Mapping: Hash-Based Flow-Aware Server Load Balancing
A Hash-based approach is a stateless scheme that maps/hashes any Flow ID (e.g., 5 tuple parameters) to a Blade ID. In that respect, hash based scheduling maintains flow stickiness as the Flow ID to Blade ID mapping is static.
As an example, in a load balancing site with N (e.g., N=10) blade servers and a single load balancer, a hash function may take the last byte of the source IP address (e.g., 192.168.1.122) of a packet as an integer (e.g., 122) and take modulo N. The resulting number (e.g., 2) is in the range of 0 to N−1 which points to the blade server to which the packet is to be forwarded. In the event of a server failure, the hash function (i.e., in this case the module N function) needs to be updated.
In a more sophisticated example, a set of fields in the packet header is first hashed/mapped to a large number of M buckets such that M>>N (N being the number of servers with the number of buckets M being much greater than the number of servers N). Then, a second level mapping from the M buckets to the N servers is performed. This two-level hashing/mapping based mechanism may provide an ability to cope with server failures, such that in the event of a failure, only second level (i.e., M bucket to N Server) mapping needs to be updated without any change in the original hash function.
A good hashing function may generate substantially uniformly distributed outputs. Weighted uniformity can also be achieved using hash functions. In other words, weights can be assigned to each blade server with respect to its capacity, and the distribution of the traffic may be expected to assume a (weighted) substantially uniform distribution with respect to capacity over the blade servers.
Non Static Mapping Schemes (Per Packet Server Load Balancing)
In some cases, a flow may be comprised of a single packet, and/or per packet server load balancing may be required. Transport protocols such as UDP (User Datagram Protocol) may tolerate such per packet server load balancing. In such cases, the load balancer may uniformly selects a blade server to schedule an incoming packet. By doing so, the Flow ID to Blade ID mapping is not necessarily maintained meaning that the same Flow ID may not necessarily map to the same Blade ID each time the scheduling algorithm (e.g., Random Blade Selection, Round Robin, Weighted Round Robin, etc.) is executed.
As mentioned above, alternatively if a data flow is only consists of a single data packet (which is both the first and the last packet of the flow), then even flow based stateful load balancing may becomes a non-static mapping scheme, because there is no need to keep a Flow-ID to blade ID table.
Protocol Specific Load Balancing (Stateless)
Some load balancing schemes have exploited the nature of protocol specific handshakes, acknowledgements, and/or other mechanisms to leverage flow aware load balancing in a stateless manner.
For protocols like GTP and SCTP, the information about the assigned blade can be embedded in the packet headers which can be used for flow stickiness as briefly explained below.
New Connection Assignment:
A new flow is identified by considering the packet header fields (e.g., an INIT flag of a SCTP packet). The new flow is then assigned to a blade server using a scheduling algorithm such as a Round Robin, Hash based method that may exploit the random nature of bits (if any) in the header, etc.
Maintaining Flow Stickiness:
For protocols like SCTP and GTP, information about the blade assigned to the first packet of a data flow can be embedded in the headers of the subsequent flow packets. For example, a V_tag field in SCTP packets can store information about the Blade ID of the assigned blade server. Once a flow has been identified as an existing one (e.g., reading SYN flag which is 0 for the subsequent packets of a SCTP connection), the information about the assigned blade server can be extracted from the packet header and the packets can then be forwarded to the correct blades.
Other Flow Aware Load Balancing Techniques
In DNS (Domain Name System) based server load balancing mechanisms, each server/blade has a unique IP address which is know by the DNS servers. When a client/outside node initiates a connection, the corresponding DNS request is sent to the DNS servers which choose one of the IP addresses of the blade servers and sends the response back. The client/outside node than directs the connection/flow towards the specific blade server. In other words, the destination IP address of the packets of the flow belongs to the blade server in question. The load balancer in this case performs a route lookup (e.g., a FIB lookup or Forwarding Information Base lookup based on the destination IP address) for all the flows to be forwarded to the correct blade server. No scheduling may need to be performed at the load balancer. Accordingly, various client requests are load balanced over the blade servers.
Table-Based Server Load Balancing (Stateful)
Centralized Scheduling and Table Look up at the Load Balancer
As discussed above, flow aware server side load balancing may require all the packets belonging to the same flow to be forwarded to the same blade server. For a table-based approach, a state for each flow (e.g., Flow ID to Blade ID mapping) existing on the blades may need to be maintained.
Table-based load balancing may be compatible with server load aware (dynamic) load balancing techniques. As an example, using weighted round robin scheduling in conjunction with table based server load balancing, weights can be changed for each blade server dynamically based on the load on each blade server. As the number of flows increases, however, the size of the table as well as the time it takes to search the table also increases. Also, the table search/lookup has to be performed for every incoming packet which may significantly increase processing overhead on the load balancer as the size of the table increases. With this approach, the load balancer may become more vulnerable to resource exhaustion (both cpu and memory resource exhaustion) in conditions of high traffic load.
Moreover, for every new flow/connection, the load balancer may need to perform scheduling operations and update the table accordingly. As a rate of new connections/flows increases, scale problems may arise because there is a single processing entity.
In addition, in standard deployments of a load balancing system, multiple load balancers may be deployed in parallel to provide increased availability in the event of a load balancer failure(s). In such deployments, flow replication mechanisms may be deployed to provide failover for active flows on a failed load balancer. Flow replication mechanisms may require all active flow state information to be replicated and distributed among all participating load balancers (i.e., synchronizing the table providing the flow ID to Blade ID mappings for the participating load balancers). Synchronization among the load balancers for such session state information may introduce significant communication and processing overhead.
In addition, the time it takes for a new flow (e.g., the first packet of a session) to be redirected to one of the load balancers until the other load balancer is ready for the failover for that session (called Peering Delay) can be very high in event of a high incoming flow rate. Peering delay is a known issue for resilient load balancing such that the state of the flows with lifetimes less than or equal to the peering delay would not be replicated at the other load balancer.
In summary, stateful server side load balancing may suffer from resource exhaustion, and also from the memory and processing overhead, inefficiency, and other issues (e.g., peering delay) of standard state replication mechanisms.
Stateful Load Balancing with Scheduling Offload
This scheme may share disadvantages of the Stateful scheme discussed above because the load balancer keeps the state table (e.g., a Flow ID to Blade ID mapping table). This state table may become very large when handling large numbers of data flows, thereby increasing memory requirements and table lookup times on the load balancer.
In addition, the controller node may be responsible for scheduling and updating the mapping table on the load balancer and may thus have the same/similar scale issues as discussed above with respect to other load balancing implementations.
Stateless Load Balancing
Static Mapping: Hash-Based Server Load Balancing
Hash-based server load balancing may depend on the arbitrariness of the traffic parameters (e.g., Flow ID, 5-tuple parameters, etc.) to provide a desired/required (e.g., substantially uniform) load distribution. If the probability distribution of the Flow ID of incoming data packets is known a priori, then it may be possible to design a Flow ID to Blade ID mapping (stateless, e.g., Hash) with Flow IDs as keys, that can substantially guarantee a desired/required (e.g., substantially uniform) flow distribution across the blades over a sufficiently large period of time. In many of the cases, however, a probability distribution of the Flow ID may not be known in advance and may change over time. Another challenge is that even if the statistical characteristics of the Flow IDs can be estimated accurately, lifetimes of the connections/flows are generally arbitrary. Accordingly, loads on the servers may change overtime even with a hash function aligned with the Flow ID pattern of the traffic. Any hash based scheme or static mapping may thus not guarantee uniformity at all times.
Also, hash-based server load balancing approaches may not sufficiently support load aware (e.g., dynamic, adaptive) load balancing. Considering dynamic traffic load characteristics (e.g., lifetimes of each connection and arrival rates), techniques in question may result in asymmetric load balancing among the blade servers. Changing the weights on the fly with respect to the load on the blade servers may be a challenge in this approach, because with the new weights, the existing flow can be reassigned to a new blade server which may terminate the flow.
Similarly, adding/removing blades to/from the load balancing site dynamically may be complicated in hash-based server load balancing, because any change in one of the hashing function parameters (i.e., number of blade servers to be load balanced) has the risk of spoiling the existing flows to blade server associations which may terminate the existing flows.
To be more precise, removal of a blade may be easier than addition of a blade. As discussed above with respect to static mapping, when a blade is removed, the flows mapped to the blade can be re-mapped to the other active blades while keeping the mapping on the active blades intact. Hence, there may be no disruption in flows existing on previously active blades (assuming the hash bucket size is much larger than the number of servers, otherwise the uniformity of the load balancing may become an issue).
When a blade is added, however, some of the flows mapped to the previously active blades should be re-mapped to the added blade (for uniformity/resource utilization) which may cause disruption of existing flows. Otherwise, the load balancer may need to identify which connection is added before and after the blade/server addition which may require state keeping in the load balancer.
If a backup load balancer is used for purposes of redundancy and/or resiliency, there is no need for flow state (e.g., table entries for flowIDs and server IDs) replication between the active and standby load balancers. The only thing to be synchronized between active and backup load balancers is the hash function itself
Non Static Mapping Schemes (Per Packet Server Load Balancing)
As discussed above, with non-static mapping schemes a load balancer does not keep a flow ID to blade/server ID mapping table because load balancing decisions (i.e., scheduling decisions) are made per packet.
A disadvantage may be that these schemes alone cannot be used for load balancing with flow awareness. For example, if 5-tuple parameter connection level load balancing is required and per packet server load balancing is performed, all the connections may eventually be terminated because the packets of a single connection may end up with several blades/servers, only one of which has the connection state information.
For flow aware scheduling, these schemes may be used in conjunction with stateful schemes (e.g., Table based), which may have other disadvantages as discussed above.
Protocol Specific Load Balancing
Protocol specific load balancing techniques may have a disadvantage of being non-generic as these techniques may only be applied to specific protocols such as SCTP and/or GTP. For example, such techniques may not apply to IP traffic.
Other Techniques
As discussed above, different load balancing techniques are provided for different data applications. In general, however, each of these load balancing techniques may only be suited to specific respective applications and may have limited scope for usage in other applications. A summary of characteristics of stateful, stateless static, stateless per packet, and stateless protocol specific load balancing schemes is provided in the table of FIG. 4.
Hash Based Implementation of the Load Balancer
Architecture of the Load Balancer
FIG. 5 illustrates a hash based implementation of a load balancing architecture. As shown, incoming traffic is first segmented into Hash Buckets, each of which is in turn assigned to a blade for processing. The architecture includes two stages in the load balancer. A first stage is a hash module that maps the a portion of the header field (e.g., flow ID, connection ID, and/or session ID) of the data packet to a Hash Bucket. As used herein, the terms hash bucket, and bucket may be used interchangeably. Each Hash Bucket is identified by an ID known as or Bucket ID. A bucket ID, for example, can be selected from a set including 1 up to a fixed number B. A second stage includes a table that maps Buckets to Blades. This table is known as or Bucket-to-Blade (B2B) Mapping Table. With Bucket IDs varying 1 through B, this table would have B rows.
For every incoming packet, the load balancer first computes the hash of the packet header (e.g., a hash of a Flow ID included in the packet header) to obtain a corresponding Bucket ID. The load balancer then maps the computed Bucket ID to a Blade ID using a look-up over the B2B (Bucket to Blade) Mapping Table. The load balancer then forwards the packet to the corresponding blade.
The first stage hash is a static mapping from Flow IDs to Buckets or Bucket IDs. Also, Bucket-to-Blade mapping can be considered static over a period of time. Therefore, this scheme may have an ability to maintain flow-level granularity. Determinism and dynamicity may be provided by modifying the B2B mapping table. In fact, it can be shown that a reasonably good algorithm to map Buckets to Blades may allow this scheme to have improved uniformity relative to a one-stage static hash from Flow IDs to Blades.
The load balancer of FIG. 5 may be relatively easy to implement. Because the table size is fixed to a distinct number of Bucket IDs, the table size may be fixed irrespective of the number of connections being supported. Accordingly, there is reduced risk that the load balancer may run out of memory as a number of connections increases. Because Bucket IDs are finite and ordered, table look up may be simplified.
Load balancing schemes of FIG. 5, however, may not provide sufficiently hitless support for addition and removal of servers (e.g., addition/removal of servers without effecting existing traffic flows). In general, data flows may be interrupted/lost when adding a blade, when removing a blade, and/or when remapping of Bucket ID to blade ID to provide load balancing (dynamicity), for example, to provide automatic load correction. Automatic load correction, for example, may occur in a VM (Virtual Machine) based service deployment where the VMs can be migrated from one Service Card SC (e.g., a Smart Service Card or SSC) to another for may purposes. For example, VMs can be consolidated in certain SSC cards so that other cards can be turned off when the traffic demand is lower for the purpose of saving energy.
In a scenario when a new blade is added, for example, a “Blade n+1” may be added to the original system of n blades illustrated in FIG. 5. To provide more uniform load balancing, Bucket IDs may be remapped from existing mapping to the “Blade n+1”. For example, Bucket ‘g’ may be one of the remapped buckets (i.e., Bucket ID ‘g’) is remapped to Blade n+1 from its initial mapping to Blade K). Packets for Bucket ‘g’ from previously existing flows/sessions which were destined to Blade K, however, will now be sent to Blade n+1, thereby disrupting the flow stickiness for these previously existing flows/sessions.
Similarly, a random blade K may be removed abruptly or as a pre-planned downtime. In this situation, all group IDs that were mapped to Blade K will now be remapped to some other Blade. The packets from existing flows/sessions that were being forwarded to Blade K will now be forwarded to the other blade and may thus be subsequently dropped (thereby disrupting the flow stickiness).
Buckets-to-Blades (B2B) mapping may thus be changed to provide better load balancing (uniformity). This situation is not unlikely because Bucket ID is nothing but a hash of flow ID of the packet, the distribution of which is unknown and may be sufficiently arbitrary that it causes uneven loads and/or numbers of connections to each bucket. In such a scenario, when a bucket is remapped from an initial blade ID to a new blade ID, all the existing flows which were destined towards the original blade will now be directed to the new blade and may therefore be disrupted.
In summary, the current implementations of load balancing may cause flow disruptions when B2B (Bucket to Blade) mapping is changed. Moreover, hash based load balancing may not support sufficiently hitless addition/removal of blades and/or remapping of blades. Stated in other words, existing connections through a bucket may be affected/lost when a mapping of a bucket is changed from one server/blade to another.