An application delivery controller (ADC) is a network device installed in a datacenter or multi-datacenter system to distribute load between servers in the datacenter. That is, an ADC typically distributes clients' requests between the servers in a datacenter to balance the load. As a network device, the ADC includes computing resources, such as memory, one or more central processing units (CPUs), storage, network connectivity, and so on.
A virtual machine (VM) is a software implementation of a computer that executes programs like a physical machine. Virtualization technology decouples the software from the hardware, thus allowing the underlying physical hardware resources to be shared between different virtual machines, each running its own operating system (called a guest operating system). Thus, virtualization, which is typically performed by a hypervisor, allows multiple operating systems to run concurrently on a host computer. The hypervisor presents the guest operating systems with a virtual operating platform and monitors the execution of the guest operating systems. Further, the hypervisor defines the allocation of computing resources (e.g., CPU power, memory, network bandwidth, etc.) for each guest operating system.
Another technique for achieving software virtualization is by means of software containers (also known as container-based or operating system virtualization). Container-based virtualization is an approach to virtualization in which the virtualization layer runs as an application within the operating system. In this approach, the operating system's kernel runs on the hardware node with several isolated guests installed on top of it. The isolated guests are called containers.
Virtualization of an ADC device can simplify the network in the datacenters and reduce costs and overhead to the service providers. A disadvantage of the current virtualized ADC solution is that physical ADCs are designed to make and execute load balancing decisions in a physical location close to the servers, while a virtualized ADC may be placed anywhere on the virtual network. This deficiency is less of a limitation when distributing north-south traffic (e.g., client-server traffic, where the client is connected to an external network), but can be a limitation, in particular, when handling east-west traffic. East-west traffic refers to the case in which both clients and servers are virtual machines executed in the virtualized data center, on a virtual network. In the context of the data center, east-west traffic is the traffic that goes between servers in a given data center or between servers in different data centers. The virtualized ADC may reside anywhere in the data center and is not necessarily physically close to the server VMs. As a result, the current deployment and utilization of virtual ADCs in cloud-based platforms, datacenters, or software defined data centers is inefficient in the sense that it wastes network and compute resources. The limitations of the conventional solutions are further demonstrated in FIGS. 1, 2, 3, and 4.
FIG. 1 shows an architecture of a cloud-based platform 100. Virtual ADC instances (vADC) 110-1 and 110-2 are deployed between an external network 120 and a virtual network 130. A virtual network can host one or more tenants. A “tenant” is a group of one or more virtual machines hosted in physical machines and provisioned to provide services to a particular customer based on, for example, a service-level agreement (SLA). The isolation and independence of VMs and virtual networks allow for creating “tenants” and for providing multi-tenancy support in a datacenter. An external network includes a WAN, a LAN, the Internet, an Internet service provider (ISP) backbone, a corporate network, and the like.
A vADC 110 is configured with a virtual IP (VIP) address and port. Any client that requests services from servers 131 sends the requests to the VIP address. For example, a request from a client 140, connected outside the datacenter via the network 120, is addressed to the VIP of one of the vADCs 110, which distributes the requests to one of the servers 131 in the virtual network 130.
Therefore, all requests to the servers 131 are directed through one of the vADCs 110, and responses from the servers 131 are sent through the vADC 110 to the client. Directing all traffic via a central entity causes data to travel across the network at least twice (for example, a server's response must travel to the client via the vADC 110). This results in an inefficient packet flow and increased latency.
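The request flow through a VIP described above can be sketched in a few lines of Python. This is a minimal illustration only: the class name `VirtualADC`, the addresses, and the server names are hypothetical, and round-robin selection is assumed merely as one example of a load balancing decision.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Iterator, List


@dataclass
class VirtualADC:
    """Toy model of a vADC 110 fronting a pool of servers 131 (hypothetical)."""
    vip: str                 # virtual IP address that clients send requests to
    port: int                # port configured together with the VIP
    servers: List[str]       # illustrative server names in the virtual network
    _next: Iterator[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Assumption: round-robin stands in for any load balancing decision.
        self._next = cycle(self.servers)

    def handle(self, client: str, dst_ip: str, dst_port: int) -> List[str]:
        """Return the nodes a request and its response pass through, in order."""
        if (dst_ip, dst_port) != (self.vip, self.port):
            raise ValueError("request is not addressed to this vADC's VIP")
        server = next(self._next)
        # Request: client -> vADC -> server; response: server -> vADC -> client.
        return [client, "vADC", server, "vADC", client]


vadc = VirtualADC(vip="203.0.113.10", port=80, servers=["server-1", "server-2"])
path = vadc.handle("client-140", "203.0.113.10", 80)
print(path)  # ['client-140', 'vADC', 'server-1', 'vADC', 'client-140']
```

The returned path makes the inefficiency visible: the vADC appears twice, once for the request and once for the response, so both directions cross the network an extra time.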
Another limitation of the conventional deployment depicted in FIG. 1 is that a pair of vADCs 110-1 and 110-2 is utilized for backup/failover purposes. That is, one vADC 110 serves as a backup for the other. As a result, the standby vADC 110 cannot be utilized to serve requests when additional capacity is required.
Furthermore, as the vADCs 110 are virtual instances, the hypervisor of a physical machine is required to perform additional processing tasks when handling requests and responses. That is, when traffic is load-balanced by a vADC 110, all requests and responses travel over the virtual network and via the hypervisor to the vADC 110. The limitations of the conventional vADC deployment when handling east-west traffic are further demonstrated in FIGS. 2-4.
FIG. 2 illustrates the distribution of an east-west request from a client 210 to one of the servers 220-1 through 220-3 that are part of the same tenant within a virtual network 130. All clients and servers are virtual machines hosted in physical machines (hosts) 200-A through 200-C. In this example, the host 200-D hosts an instance of a virtual ADC (vADC) 230. An intra-tenant request (201) from a client 210 is directed to the VIP address of the vADC 230. The vADC 230 forwards the request to the server 220-1 for processing (202). The decision of which server 220 would serve the request may be based on any known load balancing technique. Then, the server 220-1 responds with a response (203) directed to the vADC 230, which returns the response to the requesting client 210 (204).
Thus, to serve an intra-tenant request, a hypervisor (HV) of each participating host is traversed (i.e., a request or response is processed by the hypervisor) at least one time, for a total of 4 times to complete a transaction. That is, in the above example, there are 4 “hypervisor hops”: host 200-A (1 hop), host 200-C (1 hop), and host 200-D (2 hops). Each such hop requires forwarding and processing of packets by the hypervisor, context switching performed by the hypervisor, and so on. Further, as packets travel between hosts 200-A through 200-D, the network becomes congested, because each request and each response travels the virtual network twice. It should be noted that any request from any client, regardless of the physical location of the client and/or server, would require at least 4 hypervisor hops. For example, a request from a client 210 to the server 220-3, both residing in the same host, also requires sending the request and the corresponding response through the host 200-D.
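The hop count for FIG. 2 can be reproduced with a short tally, sketched here in Python. The sketch rests on two labeled assumptions: each message leg is charged as one hypervisor hop to the receiving host (a modeling convention that happens to reproduce the per-host figures given above), and the client 210 is placed on host 200-A with the server 220-1 on host 200-C. The helper name `tally_hops` is hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Each leg is (step label, sending host, receiving host).
Leg = Tuple[str, str, str]


def tally_hops(legs: List[Leg]) -> Counter:
    """Count one hypervisor hop per leg, charged to the receiving host (assumption)."""
    return Counter(dst for _, _, dst in legs)


# FIG. 2 with assumed placement: client 210 on 200-A, server 220-1 on 200-C,
# vADC 230 on 200-D.
fig2 = [
    ("201", "200-A", "200-D"),  # client -> vADC
    ("202", "200-D", "200-C"),  # vADC -> server
    ("203", "200-C", "200-D"),  # server -> vADC
    ("204", "200-D", "200-A"),  # vADC -> client
]
hops = tally_hops(fig2)
print(sorted(hops.items()), "total:", sum(hops.values()))

# Client and server on the same host (assumed here to be 200-A): the detour
# through host 200-D still costs 4 hops.
same_host = [
    ("request to vADC", "200-A", "200-D"),
    ("request to server", "200-D", "200-A"),
    ("response to vADC", "200-A", "200-D"),
    ("response to client", "200-D", "200-A"),
]
print("same-host total:", sum(tally_hops(same_host).values()))
```

Under these assumptions the tally gives 1 hop for host 200-A, 1 for host 200-C, and 2 for host 200-D, and the same-host case still totals 4 because every leg passes through the host of the vADC 230.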
As further demonstrated in FIG. 3, an inter-tenant request (i.e., a request from a client that resides in one tenant to a server located in another tenant) requires at least 6 hypervisor hops. In the exemplary FIG. 3, a client 310 does not belong to the tenant of the servers 220-1 through 220-3. The client 310 sends a request (301) for a service provided by one of the servers 220. The request is directed to the VIP address of the vADC 230. However, in this case, the request is first received at a virtual router 320 that resides in the host 200-B. The virtual router 320 routes the request to the vADC 230 (302), which directs the request to the server 220-1 (303). The decision of which server 220 would serve the request may be based on any known load balancing technique. Then, the server 220-1 responds with a response (304) directed to the vADC 230, which returns the response to the virtual router 320 (305). The virtual router 320 relays the response to the client 310 (306). Thus, to serve an inter-tenant request, a hypervisor of each participating host is traversed (i.e., a request or response is processed by the hypervisor) at least one time, for a total of 6 times to complete a transaction. That is, in the above example, there are 6 hypervisor hops: host 200-A (1 hop), host 200-C (1 hop), host 200-B (2 hops), and host 200-D (2 hops). In addition, each request and each response travels three times over the virtual network, further reducing available bandwidth.
The virtual ADC deployments illustrated in FIGS. 1-3 can be scaled up, but not scaled out. Scaling up of performance can be achieved by using high capacity hosts that can host high capacity vADCs (having many CPU and memory resources assigned to them). Scaling out is required to achieve performance above the limits that a single host machine can provide, without using special high performance host machines. That is, in contrast to a scale-up approach, where high capacity components (which suffer from higher per-unit-of-capacity prices) are used, a scale-out deployment utilizes the combined capacity of a large number of low-cost components. Essentially, the scale-out approach treats hardware as a resource pool from which individual components can be allocated, on demand, to workloads without manual intervention; thus, the operational costs per unit of capacity are lower relative to those in a traditional enterprise deployment.
As the conventional vADC deployment mandates that all requests be distributed by a central vADC (addressed by a VIP address) and further requires a high number of hypervisor hops, such a deployment cannot be scaled out efficiently. That is, the conventional vADC deployment involves a virtual IP address to which clients should address requests in order to reach a service. Therefore, a vADC is an endpoint in the virtual network, and requests should be addressed to this endpoint to access the required service. Thus, scaling out such a service is difficult, as it implies using multiple instances of the vADC all sharing the same VIP. There are mechanisms to distribute network traffic (e.g., equal cost multi path). However, such mechanisms impose severe limitations and cannot be efficiently used for such a purpose.
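To make the idea of distributing traffic across several vADC instances that share one VIP more concrete, the following Python sketch hashes each flow onto one of N instances, in the spirit of equal cost multi path. It is illustrative only: the helper `pick_instance`, the flow tuples, and the use of SHA-256 are assumptions (real routers use their own hash functions), and the limitation it demonstrates, namely that a simple hash-modulo-N mapping reassigns most flows when N changes, is one commonly observed drawback rather than a limitation enumerated in the text above.

```python
import hashlib
from typing import Tuple

# Flow key: source IP, source port, destination IP (the VIP), destination port, protocol.
FlowKey = Tuple[str, int, str, int, str]


def pick_instance(flow: FlowKey, instance_count: int) -> int:
    """Map a flow to one of `instance_count` instances by hashing (hypothetical helper)."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:8], "big") % instance_count


# 200 illustrative flows, all addressed to the same VIP and port.
flows = [(f"10.0.0.{i}", 40000 + i, "10.0.1.1", 80, "tcp") for i in range(1, 201)]

before = [pick_instance(f, 3) for f in flows]  # three vADC instances share the VIP
after = [pick_instance(f, 4) for f in flows]   # a fourth instance is added

moved = sum(b != a for b, a in zip(before, after))
print(f"{moved} of {len(flows)} flows map to a different instance after scaling out")
```

A flow that lands on a different instance mid-connection reaches a vADC that holds no state for it, which is one reason a stateless hash alone is a poor fit for scaling out a stateful load balancer.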
A distributed vADC discussed in the related art suggests having a vADC with a lower capacity at some hosts in addition to a central virtual ADC that manages the overall distribution of traffic. The distributed vADC deployment is illustrated in FIG. 4. The vADCs 410, 420, and 430 are respectively hosted in hosts 200-B, 200-C, and 200-D, where the vADC 430 can serve as the central traffic distribution point.
In this case, a client 210 sends a request (401) for a service provided by one of the servers 220. The request is directed to a VIP on the vADC 430, which serves as a first distribution point. The vADC 430 directs the request to the vADC 420 (402), which directs the request to the server 220-2 in the host 200-B (403). The decision of which server 220 would serve the request may be based on any known load balancing technique. Then, the server 220-2 responds with a response (404) directed to the vADC 420, which returns the response to the vADC 430 (405). The vADC 430 relays the response to the client 210 (406).
Thus, in the distributed vADC deployment, serving an intra-tenant request requires at least 6 hypervisor hops. That is, in the above example, the hypervisor of host 200-A is traversed 1 time, the hypervisor of host 200-B is traversed 1 time, the hypervisor of host 200-C is traversed 2 times, and the hypervisor of host 200-D is traversed 2 times, for a total of 6 hypervisor hops. Therefore, even when more vADCs are added to achieve a limited scale-out performance (limited by the capacity of the vADC 430), handling a request from a client still requires a large number of hypervisor hops, with the disadvantages discussed above. Further, the request and the response each travel over the network three times.
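The three deployments of FIGS. 2-4 can be compared side by side with a short Python sketch. It models each transaction as the ordered list of hosts visited and counts one hypervisor hop per leg. The host placements of the clients 210 and 310 on host 200-A are assumptions inferred from the per-host hop figures in the text, and the names `PATHS` and `summarize` are hypothetical.

```python
from typing import Dict, List

# Ordered hosts visited by one transaction, starting and ending at the
# client's host (client placement on 200-A is an assumption).
PATHS: Dict[str, List[str]] = {
    "FIG. 2 intra-tenant, central vADC": [
        "200-A", "200-D", "200-C", "200-D", "200-A",
    ],
    "FIG. 3 inter-tenant, via virtual router": [
        "200-A", "200-B", "200-D", "200-C", "200-D", "200-B", "200-A",
    ],
    "FIG. 4 intra-tenant, distributed vADC": [
        "200-A", "200-D", "200-C", "200-B", "200-C", "200-D", "200-A",
    ],
}


def summarize(path: List[str]) -> Dict[str, int]:
    """Return total hypervisor hops and network traversals per direction."""
    legs = len(path) - 1  # one hop per leg between consecutive hosts
    # The request and the response each account for half of the legs.
    return {"hypervisor_hops": legs, "traversals_per_direction": legs // 2}


for name, path in PATHS.items():
    print(name, summarize(path))
```

Under these assumptions the sketch yields 4 hops and 2 traversals per direction for FIG. 2, and 6 hops and 3 traversals per direction for both FIG. 3 and FIG. 4, matching the counts stated in the text.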
Therefore, it would be advantageous to provide a solution that overcomes the deficiencies noted above.