Cloud computing and cloud-based services occupy a fast-growing slice of the market. One of the main benefits of a cloud service is the ability to “consume exactly what you need”, i.e., services are consumed on demand. As a result, there is no need to purchase dedicated IT infrastructure (e.g., servers, network devices, and storage) sized for the maximum foreseeable utilization. Such infrastructure incurs fixed operational expenses in addition to the initial capital expenditure for the equipment, yet sits idle most of the time.
One of the common cloud business models is payment for the actual utilization of computing resources at each point in time (also known as “pay as you go”). In implementation, the number of virtual machines (VMs) that are actually in use, their computing capacity (memory, CPU, and the like), and the volume of traffic flowing between the “cloud” and the outside world (WAN/Internet) are measured to determine the utilization of computing resources and, from it, the actual cost to each customer.
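The metering described above can be illustrated with a minimal sketch. The unit prices, the field names, and the linear pricing formula are assumptions made purely for illustration; they are not taken from any particular provider.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    """Resources one tenant consumed during a single metering interval."""
    hours: float             # length of the metering interval
    vm_count: int            # VMs actually in use during the interval
    vcpus_per_vm: int        # CPU capacity of each VM
    memory_gb_per_vm: float  # memory capacity of each VM
    wan_traffic_gb: float    # traffic between the cloud and the WAN/Internet


# Hypothetical unit prices, chosen only to make the arithmetic concrete.
PRICE_PER_VCPU_HOUR = 0.02
PRICE_PER_GB_RAM_HOUR = 0.005
PRICE_PER_GB_TRAFFIC = 0.08


def interval_cost(s: UsageSample) -> float:
    """Cost of one interval: compute and memory scale with time, traffic with volume."""
    compute = s.vm_count * s.vcpus_per_vm * PRICE_PER_VCPU_HOUR * s.hours
    memory = s.vm_count * s.memory_gb_per_vm * PRICE_PER_GB_RAM_HOUR * s.hours
    traffic = s.wan_traffic_gb * PRICE_PER_GB_TRAFFIC
    return compute + memory + traffic


def total_cost(samples: list[UsageSample]) -> float:
    """Pay-as-you-go bill: the sum of the per-interval costs."""
    return sum(interval_cost(s) for s in samples)
```

Under these assumed prices, a tenant that runs ten VMs for a single peak hour and none for the remaining 23 hours of the day is billed only for that one hour.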
The main requirements in such services are support for a large number of concurrent tenants and, for each tenant, high variability in the required capacity, i.e., between different services and especially within the same service at different points in time. Tenants are customers that deploy services in the same cloud. For example, a service that deals with last-minute sales or events at specific points in time requires very high capacity at peak times, and almost zero resources at other times.
As a consequence, such cloud services must be highly scalable and dynamically adaptive to match the requirements of all cloud customers at any given point in time, as the resources for each cloud customer are allocated to fit the actual, current needs of that tenant. The combination of these requirements is also known as elasticity.
One of the computing resources utilized in datacenters, and hence in cloud-computing environments, is an application delivery controller (ADC). An ADC is a network device installed in a datacenter or multi-datacenter system to remove load from web servers in the datacenter. That is, an ADC typically distributes clients' requests between the web servers in a datacenter to balance the load. In a multi-datacenter system, an ADC is deployed in each datacenter to redirect clients' requests to a datacenter that would best serve such requests. Typically, the redirection decision is based on the location of the client relative to the datacenter. The ADC is a network device and, as such, includes computing resources, such as memory, one or more central processing units (CPUs), storage, network connectivity, and so on.
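The load-balancing role of an ADC can be sketched as follows. The text above does not name a distribution policy, so the least-connections policy, the class name, and the server names used here are assumptions chosen only for illustration.

```python
class LeastConnectionsBalancer:
    """Toy model of an ADC distributing client requests across web servers."""

    def __init__(self, servers):
        # Number of requests currently being served by each web server.
        self.active = {server: 0 for server in servers}

    def pick(self):
        """Send the next request to the server with the fewest active requests.

        Ties are broken in favor of the server that was listed first.
        """
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Record that a request previously sent to `server` has completed."""
        self.active[server] -= 1
```

For example, with three idle servers the first three requests go to three different servers, and a server that finishes a request early becomes the next one selected.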
Virtual instances of an ADC device can improve the performance of datacenters and reduce costs and overhead for service providers. As with any other datacenter application, the ADC devices of different customers or applications can be consolidated as multiple virtual ADC instances running on a single hardware device.
Although a virtual ADC provides flexibility in terms of the number of virtual instances, as well as the computing resources that can be allocated or de-allocated within a physical device to provide ADC services, the capacity of a virtual ADC is typically limited by the computing resources of the physical device hosting the virtual instances. The computing resources include computing power (e.g., number of CPU cores), bandwidth, memory, and the like.
In order to increase the capacity when providing ADC services, prior art solutions suggest clustering a number of physical ADC devices. An exemplary diagram of such a cluster 100 is provided in FIG. 1. In this example, three physical ADC devices 110, 111, and 112 are connected to a client network through a switch 120, and to the server network (in the datacenter) through a switch 130. The cluster illustrated in FIG. 1 requires a backplane switch 140 to allow communication between all the devices 110-112. The switches 120, 130, and 140 are standard network switches. The traffic distribution in the cluster 100 is performed based on link aggregation (LAG) provisioning of the switches 120 and 130.
The cluster 100 suffers from a number of limitations, one of which is that all ADC devices 110-112 should have the same capacity. Another limitation is that any change to the initial configuration of the cluster 100 requires implementing an ADC persistence correction process using the backplane switch 140, which is mainly added for this purpose. Furthermore, any ADC device added to the initial cluster must be connected to the backplane switch 140, and any traffic routed to such an ADC device must first pass through one of the ADC devices 110-112. This is performed in order to comply with the initial LAG distribution function with which the cluster is configured. Yet another limitation of the cluster 100 is that the switches 120, 130, and 140 should be physically co-located with the ADC devices 110-112.
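The need for persistence correction can be illustrated with a minimal sketch. It assumes that the LAG distribution function is a hash of the flow identifier taken modulo the number of cluster members; actual switches use vendor-specific hash functions, and the device names, the hypothetical fourth device "adc-113", and the helper functions below are illustrative only.

```python
import hashlib


def lag_member(flow, members):
    """Map a flow (src IP, src port, dst IP, dst port) to one cluster member.

    Assumed distribution function: hash(flow) modulo the number of members.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(members)
    return members[index]


def remapped_fraction(flows, before, after):
    """Fraction of flows that land on a different device after a cluster change."""
    moved = sum(1 for f in flows if lag_member(f, before) != lag_member(f, after))
    return moved / len(flows)


def sample_flows(count):
    """Generate distinct synthetic client flows toward a single virtual IP."""
    return [
        (f"10.0.{i // 256}.{i % 256}", 40000 + i % 1000, "192.0.2.10", 443)
        for i in range(count)
    ]
```

Under this assumption, growing the cluster from three devices to four leaves a flow on its original device only when its hash yields the same index modulo 3 and modulo 4, so roughly three quarters of the existing flows would be redirected to a device that holds no state for them. This is the kind of disruption that a persistence correction process has to repair.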
As a result of the limitations of conventional ADC clustering techniques, current solutions cannot efficiently support elasticity of ADC services. That is, the cluster 100 may not be dynamically adapted to provide more or less capacity for ADC services on demand. In cloud-computing environments, high availability and elasticity of the supplied services are essential.
In an attempt to cure such a deficiency, a DNS-based elasticity technique can be utilized. In such a technique, a DNS server is utilized to distribute the load among the ADC devices in the cluster. However, this technique also suffers from drawbacks, including, for example, poor adaptability, performance, and latency. The poor adaptability results from the fact that DNS responses are cached, so it takes a long time for changes in the cluster configuration to take effect. Thus, an ADC may have become overloaded or completely unavailable in the interim since the DNS server last returned its IP address to the client. Trying to solve the poor adaptability issue by lowering the time-to-live (TTL) of the DNS responses would create a performance bottleneck around the DNS server due to the ensuing flood of requests, and would also require more computing capacity. An attempt to solve the performance issue would increase the latency per connection setup, because of the DNS round trip prior to the connection setup.
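The trade-off between adaptability and DNS load can be illustrated with a minimal sketch of a caching resolver. The class, its fields, and the explicit `now` timestamp (in seconds) are assumptions made for illustration; a real resolver is considerably more involved.

```python
class CachingResolver:
    """Toy client-side DNS cache that honors a fixed TTL."""

    def __init__(self, authoritative, ttl_s):
        self.authoritative = authoritative  # callable returning the current ADC IP
        self.ttl_s = ttl_s                  # time-to-live of a cached answer, in seconds
        self.cached_ip = None
        self.expires_at = float("-inf")     # forces a query on first use
        self.upstream_queries = 0           # load placed on the DNS server

    def resolve(self, now):
        """Return the ADC address, querying the DNS server only after expiry."""
        if now >= self.expires_at:
            self.cached_ip = self.authoritative()
            self.expires_at = now + self.ttl_s
            self.upstream_queries += 1
        return self.cached_ip
```

With a TTL of 300 seconds, a client that resolves the name once per second keeps using the old address for up to 300 seconds after the cluster changes, but issues a single query in that period. With a TTL of 5 seconds, the same client sees the change within 5 seconds but issues 60 queries over the same 300 seconds, and this multiplies across all clients of the DNS server.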
Therefore, the conventional ADC clustering techniques are not optimized for cloud-computing environments and for providing elasticity in ADC services.