Load balancers are typically used in a computer communication network to distribute traffic load across clustered CPUs and network infrastructure in order to increase network reliability and performance while introducing the benefits of redundancy. In traditional networks employing load balancers, incoming packets carrying client requests were addressed to a Virtual IP (“VIP”) on the load balancer itself, and the load balancer would then pass each request to the appropriate server with negligible modification to the packets. The server would respond to the load balancer with the required data, which the load balancer would relay to the client.
This type of network configuration, however, has a major drawback. Incoming requests are typically small, e.g., 20 Mbits, but their associated replies are typically up to ten times larger, e.g., 200 Mbits. Because both requests and replies must pass through the load balancer, on high-traffic networks the risk of the load balancer acting as a bottleneck rises considerably and network performance consequently suffers. Direct Server Return (“DSR”) was introduced into the feature set of load balancers to address this drawback.
DSR modifies the traffic flow by permitting the server to respond directly to the client, thereby relieving the network load balancer of the need to handle the heavy reply traffic. FIG. 1 illustrates a conventional communication network with a load balancer employing Layer-2 DSR (“L2DSR”). With L2DSR, the client 105 with exemplary IP address 1.1.1.1 sends a request out through the Internet 150 to a VIP 2.2.2.2 served by load balancer 115. The load balancer 115 determines the real server destination (e.g., server 130) to forward the request to and also performs the MAC Address Translation (“MAT”) necessary for this operation, e.g., translation of MAC address AAAA.AAAA.AAAA to the server's MAC address of BBBB.BBBB.BBBB. The source and destination IP addresses are preserved because the server 130 needs both addresses to be able to communicate with the client directly. It needs the client's original IP address to know where to transmit the response data packets, and it needs the client's original destination VIP to use as the loopback IP, so that the client 105 can recognize the source of the packets it receives from server 130. Using the VIP 2.2.2.2 as the loopback IP, the server 130 can then respond directly to client 105, thereby advantageously bypassing the load balancer 115.
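The L2DSR flow of FIG. 1 can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not an implementation of any particular load balancer: the `Frame` type, the function names, and the gateway MAC address `CCCC.CCCC.CCCC` are hypothetical, while the IP and MAC values are the exemplary ones given above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified view of an Ethernet frame carrying an IP packet."""
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: str


def l2dsr_forward(frame: Frame, server_mac: str) -> Frame:
    # MAC Address Translation: only the destination MAC changes.
    # The source and destination IP addresses are left untouched.
    return replace(frame, dst_mac=server_mac)


def server_reply(request: Frame, loopback_vip: str, gateway_mac: str, payload: str) -> Frame:
    # The server accepts the packet only because the VIP is configured
    # on its loopback interface.
    if request.dst_ip != loopback_vip:
        raise ValueError("packet is not addressed to the configured VIP")
    # The reply is sourced from the VIP and sent straight to the client,
    # so it never passes through the load balancer.
    return Frame(dst_mac=gateway_mac, src_ip=loopback_vip,
                 dst_ip=request.src_ip, payload=payload)


# Exemplary values from FIG. 1; the gateway MAC is an assumption.
request = Frame(dst_mac="AAAA.AAAA.AAAA", src_ip="1.1.1.1", dst_ip="2.2.2.2", payload="GET /")
forwarded = l2dsr_forward(request, server_mac="BBBB.BBBB.BBBB")
reply = server_reply(forwarded, loopback_vip="2.2.2.2",
                     gateway_mac="CCCC.CCCC.CCCC", payload="200 OK")
```

Because the load balancer rewrites nothing above Layer 2, the server sees the same IP header the client sent, which is what lets it answer as the VIP.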
One significant drawback of L2DSR is that, because the load balancer 115 forwards incoming packets from the client 105 to the server 130 by changing the destination MAC addresses of the incoming packets on the fly, both the load balancer 115 and the server 130 need to be on the same L2 network segment. In other words, the network between the load balancer 115 and the server 130 is limited to being a Layer 2 network that operates on MAC addresses. As a result, the networks that can be constructed using L2DSR are relatively constrained: the physical location of the servers is restricted and flexibility within the data center is greatly limited. This creates instability on very large networks, e.g., networks with more than 10k hosts.
Layer 3 DSR (“L3DSR”) addresses the above stated constraints of networks using L2DSR. L3DSR dispenses with the requirement of performing a MAC Address Translation at the Layer 2 level. Instead, the load balancer sends the request received from the client to the server using a destination IP different from the VIP initially requested by the client. FIG. 2 illustrates an exemplary packet flow using L3DSR. As illustrated in FIG. 2, the load balancer changes the destination IP in the packets received from the client explicitly to reflect the server's real IP (74.80.1.1). However, the load balancer in an L3DSR network still needs to transmit (to the server) the source address of the client and also the original destination address requested by the client (the VIP address for which the request was made).
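The following minimal Python sketch illustrates why the destination rewrite alone is not sufficient. The `IPv4Packet` type and `l3dsr_rewrite` function are hypothetical names introduced for illustration. The server IP 74.80.1.1 comes from FIG. 2; the client address 1.1.1.1 is borrowed from FIG. 1, and 198.18.0.250 is assumed to be the VIP of the FIG. 2 example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IPv4Packet:
    """Hypothetical, simplified IPv4 packet: addresses and payload only."""
    src_ip: str
    dst_ip: str
    payload: str


def l3dsr_rewrite(packet: IPv4Packet, server_real_ip: str) -> IPv4Packet:
    # Replace the VIP with the server's real IP so that ordinary Layer 3
    # routing can deliver the packet; no MAC Address Translation is needed.
    return replace(packet, dst_ip=server_real_ip)


request = IPv4Packet(src_ip="1.1.1.1", dst_ip="198.18.0.250", payload="GET /")
forwarded = l3dsr_rewrite(request, server_real_ip="74.80.1.1")

# The client's source address survives the rewrite, but the VIP does not
# appear in either address field any more.
vip_still_visible = "198.18.0.250" in (forwarded.src_ip, forwarded.dst_ip)
```

After the rewrite the VIP is absent from the address fields, so it has to reach the server by some other means.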
FIG. 3 illustrates the IPv4 header format. L3DSR uses the Differentiated Services Code Point (DSCP) field 302 to communicate the VIP address for which the request was made to the server. The 6 bits in the DSCP field are used by the load balancer to encode the VIP address, typically by performing a known mapping. The server then needs to derive the full VIP address from the information relayed by the load balancer by performing a look-up in a look-up table. For example, as shown in FIG. 2, the server uses the DSCP bits as set by the load balancer (0x4) to derive the IP address that it uses as the source IP (e.g., 198.18.0.250) to communicate with the client. In other words, once the server receives the packets from the load balancer, the server checks the DSCP bits and uses the mapping to determine the IP address which it will use to communicate back with the client. As with L2DSR, the VIP may be configured as the loopback IP and the server responds to the client using the client's original source IP as the new destination IP and the client's original destination IP (the VIP as determined by the mapping) as the new source IP.
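The encode and look-up steps described above might be sketched as follows. The type and function names are hypothetical, the client address 1.1.1.1 is borrowed from FIG. 1, and the single-entry mapping table simply mirrors the FIG. 2 example (DSCP 0x4 to 198.18.0.250). The sketch relies on the standard layout in which the DSCP occupies the upper six bits of the second IPv4 header byte and the lower two bits carry ECN.

```python
from dataclasses import dataclass, replace
from typing import Dict

# Mapping shared by the load balancer and every server (FIG. 2 example values).
DSCP_TO_VIP: Dict[int, str] = {0x4: "198.18.0.250"}
VIP_TO_DSCP: Dict[str, int] = {vip: code for code, vip in DSCP_TO_VIP.items()}


@dataclass(frozen=True)
class IPv4Packet:
    """Hypothetical, simplified IPv4 header."""
    src_ip: str
    dst_ip: str
    tos: int = 0  # second header byte: DSCP in the upper 6 bits, ECN in the lower 2


def set_dscp(tos: int, dscp: int) -> int:
    if not 0 <= dscp < 64:
        raise ValueError("DSCP must fit in 6 bits")
    return (dscp << 2) | (tos & 0b11)  # keep the ECN bits, replace the DSCP


def get_dscp(tos: int) -> int:
    return (tos >> 2) & 0x3F


def lb_forward(packet: IPv4Packet, server_real_ip: str) -> IPv4Packet:
    # Load balancer: encode the requested VIP in the DSCP bits, then
    # rewrite the destination to the server's real IP.
    dscp = VIP_TO_DSCP[packet.dst_ip]
    return replace(packet, dst_ip=server_real_ip, tos=set_dscp(packet.tos, dscp))


def server_reply(packet: IPv4Packet) -> IPv4Packet:
    # Server: look up the VIP from the DSCP bits and answer the client
    # directly, using the VIP as the source address.
    vip = DSCP_TO_VIP[get_dscp(packet.tos)]
    return IPv4Packet(src_ip=vip, dst_ip=packet.src_ip)


request = IPv4Packet(src_ip="1.1.1.1", dst_ip="198.18.0.250")
forwarded = lb_forward(request, server_real_ip="74.80.1.1")
reply = server_reply(forwarded)
```

Note that both tables must agree on every device; a server holding a stale `DSCP_TO_VIP` entry would reply from the wrong source address.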
While L3DSR improves on L2DSR by not requiring the server and the load balancer to be on the same network segment, it is also restrictive because it relies on, and thereby consumes, the DSCP field of an IPv4 header to communicate the destination VIP address. First, the DSCP field is a narrow field and, accordingly, only a limited number of mappings can be created. Second, use of the DSCP bits for L3DSR prevents them from being used for other purposes, e.g., to provide Quality of Service (QoS) information. Third, storage and computing resources are utilized on both the load balancer and server to encode and decode the DSCP bits. Lastly, all the servers and load balancers across the network need to keep track of the DSCP-to-VIP mappings, and any updates need to be reflected across all devices to ensure consistency.
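The first two limitations can be made concrete with a short Python sketch. The function names and the generated VIP addresses are hypothetical, and the sketch ignores any code points that a real deployment might reserve. Expedited Forwarding (DSCP 46) is used only as one familiar example of a QoS marking.

```python
from typing import Dict, List

DSCP_BITS = 6
MAX_CODE_POINTS = 2 ** DSCP_BITS  # a 6-bit field yields only 64 distinct values


def assign_code_points(vips: List[str]) -> Dict[str, int]:
    """Give each VIP its own DSCP value, failing once the field is exhausted."""
    if len(vips) > MAX_CODE_POINTS:
        raise ValueError(
            f"{len(vips)} VIPs requested, but only {MAX_CODE_POINTS} code points exist"
        )
    return {vip: code for code, vip in enumerate(vips)}


def overwrite_dscp(tos: int, dscp: int) -> int:
    # Replace the upper 6 bits (DSCP) and keep the lower 2 bits (ECN).
    return (dscp << 2) | (tos & 0b11)


# Limitation 1: the 65th VIP cannot be mapped.
sixty_four_vips = [f"198.18.0.{i}" for i in range(64)]  # hypothetical addresses
mapping = assign_code_points(sixty_four_vips)

# Limitation 2: a packet that arrives marked Expedited Forwarding loses
# that marking as soon as the load balancer writes a VIP code into the field.
EF = 46
tos_in = EF << 2
tos_out = overwrite_dscp(tos_in, 0x4)
qos_preserved = (tos_out >> 2) == EF
```

In this sketch `qos_preserved` evaluates to `False`, since the six DSCP bits cannot carry a VIP code and a QoS class at the same time.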