Typically, a cloud operator employs a cloud orchestrator to manage the lifecycle of virtual resources in a cloud. The functions of the cloud orchestrator include: implementing user requests to deploy new virtual resources, including virtual applications and virtual machines (VMs); tracking the virtual resources that have been deployed; monitoring the status of the virtual resources and of the physical resources (hardware) that are needed to support them; and taking action when faults or performance degradations occur.
Virtual applications are implemented using a group of one or more VMs that work together to perform certain high-level functions. The telecommunications industry uses the term Virtualized Network Functions, or VNFs, to refer to virtual applications used in that context. Virtual applications are commonly engineered to include redundant resources, both to provide high availability and to provide additional capacity to process user requests. For example, a web application might have multiple VMs (functioning as web servers) to which user requests can be seamlessly directed for processing. In times of high user volume, the number of VMs might be increased to handle the additional workload (known as scaling out the virtual application), and the number of VMs might be decreased when the workload subsides (known as scaling in the virtual application).
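The scale-out and scale-in behavior described above can be illustrated with a minimal sketch. The function name `desired_vm_count`, its parameters, and the sizing rule (one VM per fixed block of request capacity, bounded by a redundancy floor and a ceiling) are assumptions made for illustration only; they are not taken from any particular orchestrator.

```python
import math


def desired_vm_count(request_rate, per_vm_capacity, min_vms, max_vms):
    """Return how many VMs a virtual application should run.

    Hypothetical sizing rule: enough VMs to carry the current request
    rate, never fewer than min_vms (kept for redundancy) and never
    more than max_vms.
    """
    if per_vm_capacity <= 0:
        raise ValueError("per_vm_capacity must be positive")
    needed = math.ceil(request_rate / per_vm_capacity)
    return max(min_vms, min(max_vms, needed))


# Illustrative numbers only: each web-server VM handles 200 requests/s.
busy = desired_vm_count(950, 200, min_vms=2, max_vms=10)   # scale out
quiet = desired_vm_count(100, 200, min_vms=2, max_vms=10)  # scale in
print(busy, quiet)
```

Keeping `min_vms` above one reflects the redundancy mentioned in the text: even under light load, the application retains a spare VM for availability.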
The cloud orchestrator typically keeps track of the virtual resources that have been deployed. This may be achieved by maintaining an inventory of the virtual resources (e.g., within a database) or via dynamic queries to the virtual infrastructure managers that manage those virtual resources.
The cloud orchestrator also monitors and measures the performance of the physical resources. Performance can be measured in the form of individual metrics such as Central Processing Unit (CPU) utilization, memory utilization, and network bandwidth utilization. The cloud orchestrator may gather these metrics, compare them against configured thresholds, and generate an alarm or other notification when a threshold is violated.
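As a rough illustration of the threshold comparison described above, the following sketch checks a set of gathered metrics against configured limits and produces one alarm per violation. The `Alarm` type, the `check_thresholds` function, the metric names, and the threshold values are all hypothetical; a real orchestrator would obtain metrics from its own monitoring interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    resource: str     # e.g., the physical host the metric came from
    metric: str       # e.g., "cpu", "memory", "network"
    value: float      # measured utilization, as a fraction of capacity
    threshold: float  # configured limit that was exceeded


def check_thresholds(resource, metrics, thresholds):
    """Return an Alarm for each metric whose value exceeds its threshold.

    Metrics without a configured threshold are ignored.
    """
    alarms = []
    for metric, value in metrics.items():
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            alarms.append(Alarm(resource, metric, value, limit))
    return alarms


# Illustrative values only.
thresholds = {"cpu": 0.85, "memory": 0.90, "network": 0.80}
sample = {"cpu": 0.93, "memory": 0.71, "network": 0.80}
alarms = check_thresholds("host-1", sample, thresholds)
for alarm in alarms:
    print(f"{alarm.resource}: {alarm.metric} at {alarm.value:.0%} "
          f"exceeds {alarm.threshold:.0%}")
```

In this sketch a value equal to its threshold does not raise an alarm; whether the boundary counts as a violation is a policy choice.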
When an alarm is generated indicating that a threshold has been violated, action must be taken, or the performance of the virtual applications may suffer. These actions might include scaling in some virtual applications to reduce the overall utilization of physical resources. By selecting the appropriate virtual applications to scale in, resources can be freed to meet the demands of more critical virtual applications. Other actions might include shutting down some processes running on the VMs (without completely shutting down the VMs) to reduce resource utilization.
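One way to picture the selection of virtual applications to scale in is a priority-ordered plan: VMs are removed from the least critical applications first, until enough of the physical resource has been freed. The sketch below is only an assumption about how such a policy might look. The `VirtualApp` fields, the numeric priority scheme, and the per-VM CPU figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class VirtualApp:
    name: str
    priority: int        # hypothetical scale: higher means more critical
    vm_count: int        # VMs currently running
    min_vms: int         # floor below which the app must not be scaled in
    cpu_pct_per_vm: int  # host CPU consumed per VM, in percentage points


def plan_scale_in(apps, cpu_pct_to_free):
    """Choose VMs to remove, least critical applications first.

    Returns (plan, freed), where plan maps an application name to the
    number of VMs to remove and freed is the CPU released, in
    percentage points. If the target cannot be met, freed is less than
    cpu_pct_to_free.
    """
    plan = {}
    freed = 0
    for app in sorted(apps, key=lambda a: a.priority):
        removable = app.vm_count - app.min_vms
        while removable > 0 and freed < cpu_pct_to_free:
            plan[app.name] = plan.get(app.name, 0) + 1
            freed += app.cpu_pct_per_vm
            removable -= 1
        if freed >= cpu_pct_to_free:
            break
    return plan, freed


# Illustrative inventory only.
apps = [
    VirtualApp("billing", priority=9, vm_count=4, min_vms=2, cpu_pct_per_vm=10),
    VirtualApp("batch-reports", priority=1, vm_count=3, min_vms=1, cpu_pct_per_vm=5),
    VirtualApp("web", priority=5, vm_count=5, min_vms=2, cpu_pct_per_vm=4),
]
plan, freed = plan_scale_in(apps, cpu_pct_to_free=15)
print(plan, freed)
```

With these numbers the critical `billing` application is left untouched, which is the intent of choosing the applications to scale in rather than reducing all of them uniformly.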
One of the major tenets of cloud computing is the ability to share resources. Often, cloud operators will choose to allocate the same physical resource to multiple virtual resources (sometimes referred to as “overallocation” of a physical resource) on the assumption that the physical resource will not be utilized by all (or a substantial number) of the virtual resources at the same time. When that assumption fails, there may be a spike in the utilization of a particular physical resource (e.g., a physical host's CPU or memory, or the bandwidth of a physical network). The VMs sharing that physical resource (and, by extension, the virtual applications associated with those VMs) can suffer, meaning that they may not be able to take advantage of the physical resources that they believe are available to them. This may negatively affect their operation and the services that they can provide to their end users.
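The arithmetic behind overallocation can be made concrete with a small worked example. The helper names and all figures below are hypothetical; they only illustrate how allocations can exceed physical capacity without harm until simultaneous demand exceeds it as well.

```python
def overallocation_ratio(allocated, capacity):
    """Ratio of the total allocated to VMs over the physical capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return sum(allocated) / capacity


def is_contended(demands, capacity):
    """True when simultaneous demand exceeds the physical capacity."""
    return sum(demands) > capacity


# Illustrative host: 16 physical CPU cores shared by five VMs.
physical_cores = 16
allocated_vcpus = [8, 8, 4, 4, 8]  # 32 vCPUs promised in total

ratio = overallocation_ratio(allocated_vcpus, physical_cores)

# Typical load: the VMs use only part of their allocation.
typical = is_contended([2, 3, 1, 1, 2], physical_cores)

# Spike: several VMs use most of their allocation at the same time.
spike = is_contended([7, 8, 3, 1, 2], physical_cores)

print(ratio, typical, spike)
```

Here the host is overallocated by a factor of two, yet contention arises only in the second case, when the sharing assumption fails.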
Cloud operators need a strategy to deal with such a situation. Conventional techniques may rely on manual analysis and intervention by an operator. In addition, conventional techniques may use blunt methods that do not take into account the individual requirements and capabilities of the affected virtual applications.