In recent years, cloud computing has evolved as a general concept for providing users with access to remote computing and storage resources. Cloud computing generally involves providing an abstraction of a computing infrastructure that comprises large groups of networked remote servers, typically (but not necessarily) located in a data center.
Cloud computing services, such as Software as a Service (SaaS), are commonly realized using virtualization technologies. Virtualization allows one or more virtual machines to be executed on a single physical computing unit, such as a server or server blade. Virtual machines are logically separated from each other and share processor, memory and/or network resources of the underlying computing unit. So-called hypervisors, also known as Virtual Machine Managers (VMMs), are employed on the computing units to allocate hardware resources to the virtual machines executed thereon. An operating system may be installed on each virtual machine, just as on a physical machine, and may be used to execute application programs.
A cloud infrastructure management system typically decides on the allocation of virtual machines to the physical computing units available in the cloud computing environment. If an allocation needs to be changed, virtual machines may be migrated from one computing unit to another. Migration may be performed on virtual machines while their execution continues (so-called “live migration”), or may require shutting down (so-called “cold migration”) or suspending (so-called “warm migration”) a respective virtual machine on one computing host and restarting the virtual machine on another computing host. Cold migration and warm migration may collectively be referred to as “offline migration”.
Virtualization of physical hardware resources provides numerous benefits. For example, virtualization allows multiplexing of resources across applications and enables application isolation. Another advantage of employing virtualization is the ability to map resources to a virtual machine or a group of virtual machines in a flexible manner in order to handle dynamically changing workloads. A workload increase, for example, may be handled by increasing the resources allocated to a virtual machine (so-called “scale up”) or by increasing the number of virtual machines while the virtual machines themselves continue to work on the same workload (so-called “scale out”).
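The difference between the two approaches may be illustrated by a minimal Python sketch. The `VirtualMachine` record, the `scale_up` and `scale_out` helpers, and all names and sizes are hypothetical and chosen for illustration only; they do not correspond to any particular cloud management interface.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class VirtualMachine:
    """Hypothetical record of the resources allocated to one virtual machine."""
    name: str
    vcpus: int
    memory_gb: int


def scale_up(vm: VirtualMachine, extra_vcpus: int, extra_memory_gb: int) -> VirtualMachine:
    """Scale up: keep the same virtual machine, allocate more resources to it."""
    return replace(
        vm,
        vcpus=vm.vcpus + extra_vcpus,
        memory_gb=vm.memory_gb + extra_memory_gb,
    )


def scale_out(pool: List[VirtualMachine], template: VirtualMachine, count: int) -> List[VirtualMachine]:
    """Scale out: keep per-machine resources, add more machines to the pool."""
    start = len(pool)
    new_vms = [replace(template, name=f"{template.name}-{start + i}") for i in range(count)]
    return pool + new_vms


# Illustrative values only (assumed, not taken from any real deployment).
pool = [VirtualMachine("web-0", vcpus=2, memory_gb=4)]
bigger = scale_up(pool[0], extra_vcpus=2, extra_memory_gb=4)
wider = scale_out(pool, template=VirtualMachine("web", vcpus=2, memory_gb=4), count=2)
```

In this sketch, `bigger` is a single virtual machine with more resources, whereas `wider` is a larger pool of equally sized virtual machines.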
For the scale out approach, load balancers are typically employed to distribute the workload among the virtual machines. The scale out approach is best suited to well-parallelizable workloads that can be divided into independent work units and that require little or no communication between the respective virtual machines.
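By way of illustration, the distribution of independent work units may be sketched as a simple round-robin assignment. Round-robin is only one possible policy and is assumed here for simplicity; the function `distribute_round_robin` and the work unit and virtual machine names are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterable, List


def distribute_round_robin(work_units: Iterable[str], vm_names: List[str]) -> Dict[str, List[str]]:
    """Assign each independent work unit to the next virtual machine in turn."""
    if not vm_names:
        raise ValueError("at least one virtual machine is required")
    assignment: Dict[str, List[str]] = {name: [] for name in vm_names}
    # zip() stops when the work units are exhausted; cycle() repeats the VM list.
    for unit, vm in zip(work_units, cycle(vm_names)):
        assignment[vm].append(unit)
    return assignment


# Hypothetical example: five independent requests spread over two virtual machines.
assignment = distribute_round_robin([f"req-{i}" for i in range(5)], ["vm-a", "vm-b"])
```

Such a scheme works well only because the work units are independent; if the virtual machines had to exchange intermediate results, the communication overhead could outweigh the benefit of adding machines.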
For the scale up approach, current techniques typically apply basic virtual machine and/or operating system level performance metrics to handle workload dynamics.
According to one such scale up technique, an optimization agent running on a virtual machine monitors resource usage. The agent may suggest configuration changes to a user and, based on a user response, the agent may inform the cloud management system to adjust resource allocation to the virtual machines accordingly.
According to another technique, the workload is measured by external observation of the virtual machines and, additionally, by observing operating system level statistics. The decision to reconfigure a virtual machine is based on the observed statistics, and the reconfiguration is handled by live migration if the resources on the underlying computing host are not sufficient.
According to yet another technique, a host controller obtains virtual machine configuration data and, in response to an update of the computing host or an update of the operating system installed thereon, the host controller identifies the supported virtual hardware resources and updates the corresponding virtual machine with an optimized configuration.
In a Network Functions Virtualization (NFV) system, reconfiguration is handled by workload measurement-based scaling that supports three scaling categories: auto-scaling, on-demand scaling and management request. In auto-scaling, a manager entity monitors a Virtual Network Function (VNF) and triggers the scaling when certain conditions are met. In on-demand scaling, the VNF itself triggers the scaling through an explicit request to the manager entity. In management request, the reconfiguration is initiated by a cloud operator.
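The three scaling categories differ mainly in which entity initiates the scaling. A minimal Python sketch of this distinction is given below; the enumeration `ScalingCategory`, the mapping `INITIATOR`, the function `auto_scaling_due`, and the utilization threshold of 0.8 are all assumptions made for illustration and are not taken from any NFV specification.

```python
from enum import Enum


class ScalingCategory(Enum):
    """The three scaling categories described above."""
    AUTO_SCALING = "auto-scaling"
    ON_DEMAND_SCALING = "on-demand scaling"
    MANAGEMENT_REQUEST = "management request"


# Which entity initiates the scaling in each category.
INITIATOR = {
    ScalingCategory.AUTO_SCALING: "manager entity",
    ScalingCategory.ON_DEMAND_SCALING: "VNF",
    ScalingCategory.MANAGEMENT_REQUEST: "cloud operator",
}


def auto_scaling_due(measured_utilization: float, threshold: float = 0.8) -> bool:
    """Example condition for auto-scaling: the manager entity triggers scaling
    when the monitored utilization exceeds a threshold (assumed value)."""
    return measured_utilization > threshold


# Hypothetical monitoring samples for one VNF.
decisions = [auto_scaling_due(u) for u in (0.55, 0.79, 0.93)]
```

In this sketch only the last sample would cause the manager entity to trigger scaling; in the other two categories the trigger would instead arrive as an explicit request from the VNF or from the cloud operator.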
In each of these approaches, measured workload data only provides information about past resource usage and, accordingly, future resource requirements can only be estimated. Past and estimated future resource usage then form the basis for the decision on how scaling is to be performed. In some cases, however, such estimation does not adequately reflect future resource usage and, therefore, current techniques are not always capable of appropriately handling dynamic workloads.
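This limitation may be illustrated with a minimal Python sketch in which future usage is estimated as the average of recent measurements. The moving-average estimator `forecast_next`, the window size, and all usage figures are hypothetical and serve only to show how an estimate derived from past data can miss an abrupt workload change.

```python
from statistics import mean
from typing import Sequence


def forecast_next(usage_history: Sequence[float], window: int = 3) -> float:
    """Estimate the next usage value as the mean of the last `window` measurements."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not usage_history:
        raise ValueError("usage history must not be empty")
    return mean(usage_history[-window:])


# Illustrative, made-up CPU utilization values in percent.
history = [20.0, 22.0, 21.0]
predicted = forecast_next(history)

# Assumed sudden workload increase that the past data gives no hint of.
actual_next = 80.0
shortfall = actual_next - predicted
```

Here the estimate stays close to the recent average, so a scaling decision based on it would provision far too few resources for the assumed jump in workload.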