1. Technical Field
The embodiments herein generally relate to energy-efficient management of distributed computing resources and data centers, and more particularly to cloud computing.
2. Description of the Related Art
Within this application several publications are referenced by Arabic numerals within brackets. Full citations for these and other publications may be found at the end of the specification immediately preceding the claims. The disclosures of all these publications in their entireties are hereby expressly incorporated by reference into the present application for the purposes of indicating the background of the invention and illustrating the general state of the art.
Cloud computing has revolutionized the information and communications technology (ICT) industry by enabling on-demand provisioning of computing resources based on a pay-as-you-go model. An organization can either outsource its computational needs to the Cloud, avoiding high up-front investments in a private computing infrastructure and the consequent maintenance costs, or implement a private Cloud data center to improve its resource management and provisioning processes. However, a major problem of data centers is their high energy consumption, which rose by 56% from 2005 to 2010 and in 2010 was estimated to account for between 1.1% and 1.5% of global electricity use [20]. Apart from high operating costs, this results in substantial carbon dioxide (CO2) emissions, which are estimated to be 2% of global emissions [14]. The problem has been partially addressed by improvements in the physical infrastructure of modern data centers. As reported by the Open Compute Project, Facebook's Oregon data center achieves a Power Usage Effectiveness (PUE) of 1.08, which means that ≈93% of the data center's energy is consumed by the computing resources. It is therefore now important to focus on the resource management aspect, i.e., ensuring that the computing resources are efficiently utilized to serve applications.
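The ≈93% figure follows from the standard definition of PUE as total facility energy divided by the energy delivered to the computing (IT) equipment, so the IT share is the reciprocal of the PUE. A minimal sketch of that arithmetic is given below; the function name is hypothetical and chosen only for illustration.

```python
def it_energy_fraction(pue: float) -> float:
    """Fraction of total facility energy that reaches the IT equipment.

    Assumes PUE = total facility energy / IT equipment energy,
    so the IT share is simply 1 / PUE.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 / pue


fraction = it_energy_fraction(1.08)
print(f"{fraction:.1%}")  # prints "92.6%", i.e. roughly 93%
```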
One method to improve the utilization of data center resources, which has been shown to be efficient [25, 32, 40, 15, 16, 33, 19, 39, 21, 17, 7, 4], is dynamic consolidation of Virtual Machines (VMs). This approach leverages the dynamic nature of Cloud workloads: the VMs are periodically reallocated using live migration according to their current resource demand in order to minimize the number of active physical servers, referred to as hosts, required to handle the workload. The idle hosts are switched to low-power modes with fast transition times to eliminate the static power and reduce the overall energy consumption. The hosts are reactivated when the resource demand increases. This approach has two main objectives, namely minimization of energy consumption and maximization of the Quality of Service (QoS) delivered by the system, which together form an energy-performance trade-off.
Prior approaches to host overload detection for energy-efficient dynamic VM consolidation proposed in the literature can be broadly divided into three categories: periodic adaptation of the VM placement (no overload detection), threshold-based heuristics, and decision-making based on statistical analysis of historical data. One of the first works in which dynamic VM consolidation was applied to minimize energy consumption in a data center was performed by Nathuji and Schwan [25]. They explored the energy benefits obtained by consolidating VMs using migration and found that the overall energy consumption can be significantly reduced. Verma et al. [32] modeled the problem of power-aware dynamic VM consolidation as a bin-packing problem and proposed a heuristic that minimizes the data center's power consumption, taking into account the VM migration cost. However, the authors did not apply any algorithm for determining when it is necessary to optimize the VM placement: the proposed heuristic is simply invoked periodically to adapt the placement of VMs.
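To illustrate what modeling consolidation as a bin-packing problem means, the following is a minimal sketch of the generic first-fit-decreasing heuristic, in which hosts are bins and VM resource demands are item sizes. It is not the heuristic of Verma et al. [32], which additionally accounts for power and migration cost; the function name, the single CPU dimension, and the uniform host capacity are assumptions made for illustration.

```python
def first_fit_decreasing(vm_demands, host_capacity):
    """Pack VMs onto as few hosts as the FFD heuristic manages.

    vm_demands: dict mapping a VM name to its CPU demand (hypothetical units).
    host_capacity: CPU capacity of every host (assumed identical).
    Returns a list of hosts, each a list of the VM names placed on it.
    """
    hosts = []  # VM names placed on each active host
    free = []   # remaining capacity of each active host
    # Consider the largest demands first.
    ordered = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm, demand in ordered:
        if demand > host_capacity:
            raise ValueError(f"{vm} does not fit on any host")
        for i, remaining in enumerate(free):
            if demand <= remaining:
                # Place the VM on the first host with enough room.
                hosts[i].append(vm)
                free[i] -= demand
                break
        else:
            # No active host can take the VM, so activate a new one.
            hosts.append([vm])
            free.append(host_capacity - demand)
    return hosts


placement = first_fit_decreasing(
    {"a": 50, "b": 40, "c": 30, "d": 30, "e": 20}, host_capacity=100
)
print(placement)  # prints [['a', 'b'], ['c', 'd', 'e']]
```

A periodically invoked controller of the kind described above would rerun such a packing at fixed intervals, regardless of whether any host is actually overloaded.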
Zhu et al. [40] studied the dynamic VM consolidation problem and applied a heuristic that sets a static CPU utilization threshold of 85% to determine when a host is overloaded. The host is assumed to be overloaded when the threshold is exceeded. The 85% utilization threshold was first introduced and justified by Gmach et al. [15] based on their analysis of workload traces. In their more recent work, Gmach et al. [16] investigated the benefits of combining periodic and reactive threshold-based invocations of the migration controller. VMware Distributed Power Management [33] operates on the same idea, with the utilization threshold set to 81%. However, static threshold heuristics may be unsuitable for systems with unknown and dynamic workloads, as these heuristics do not adapt to workload changes and do not capture the time-averaged behavior of the system.
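A static threshold heuristic of this kind can be sketched in a few lines. The example below is illustrative only: the function name and the sample utilization values are hypothetical, and only the 85% threshold is taken from the text above.

```python
STATIC_THRESHOLD = 0.85  # 85% CPU utilization, as in the heuristic described above


def is_overloaded(cpu_utilization, threshold=STATIC_THRESHOLD):
    """Flag a host as overloaded when utilization exceeds the threshold."""
    return cpu_utilization > threshold


# Hypothetical utilization samples for one host, one value per interval.
samples = [0.60, 0.70, 0.90, 0.65, 0.62]
flags = [is_overloaded(u) for u in samples]
mean_utilization = sum(samples) / len(samples)

print(flags)  # prints [False, False, True, False, False]
```

In this made-up trace a single short spike triggers the detector even though the mean utilization stays well below the threshold, which illustrates why an instantaneous check does not capture time-averaged behavior.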
Jung et al. [19] investigated the problem of dynamic consolidation of VMs running multi-tier web-applications to optimize a global utility function while meeting service level agreement (SLA) requirements. The approach is workload-specific, as the SLA requirements are defined in terms of the response time pre-computed for each transaction type of the applications. When the request rate falls outside an allowed interval, the system adapts the placement of VMs and the states of the hosts. Zheng et al. [39] proposed automated experimental testing of the efficiency of a reallocation decision prior to its application, once the response time specified in the SLAs is violated. In the approach proposed by Kumar et al. [21], the resource allocation is adapted when the application's SLAs are violated. Wang et al. [34] applied control loops to manage resource allocation under response time QoS constraints at the cluster and server levels. If the resource capacity of a server is insufficient to meet the applications' SLAs, a VM is migrated from the server. All these works are similar to threshold-based heuristics in that they rely on instantaneous values of performance characteristics and do not leverage the observed history of system states to estimate the future behavior of the system and optimize the time-averaged performance.
Guenter et al. [17] implemented an energy-aware dynamic VM consolidation system focused on web-applications, whose SLAs are defined in terms of the response time. The authors applied weighted linear regression to predict the future workload and proactively optimize the resource allocation. This approach is in line with the Local Regression (LR) algorithm proposed in [3], which is used as one of the benchmark algorithms. Bobroff et al. proposed a server overload forecasting technique based on time-series analysis of historical data [7]. Unfortunately, the algorithm is described at too high a level to be readily implemented and compared with previous approaches. Weng et al. [35] proposed a load-balancing system for virtualized clusters. A cluster-wide cost of the VM allocation is periodically minimized to detect overloaded and underloaded hosts and to reallocate VMs. This is a related work but with the opposite objective: the VMs are deconsolidated to balance the load across the hosts.
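The general idea of predicting workload by weighted linear regression can be sketched as follows. This is a generic weighted least-squares line fit extrapolated one step ahead, not the method of Guenter et al. [17] nor the LR algorithm of [3]; the function name and the linearly increasing weights, which favor recent observations, are assumptions made for illustration.

```python
def predict_next(history):
    """Extrapolate the next value with a weighted least-squares line.

    history: list of past utilization samples, oldest first.
    Weights grow linearly with recency (an assumed weighting scheme).
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    xs = list(range(n))
    ws = [i + 1 for i in range(n)]  # the newest sample gets the largest weight
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, history)) / w_sum
    # Weighted covariance and variance give the slope of the fitted line.
    cov = sum(w * (x - x_mean) * (y - y_mean)
              for w, x, y in zip(ws, xs, history))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return intercept + slope * n  # value predicted at the next time step


# Hypothetical steadily rising utilization trace.
trace = [0.50, 0.55, 0.60, 0.65, 0.70]
forecast = predict_next(trace)
```

A proactive controller could compare such a forecast against an overload limit and begin migrating VMs before the limit is actually reached.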
As mentioned above, the common limitations of the prior works are that, due to their heuristic basis, they lead to sub-optimal results and do not allow the system administrator to explicitly set a QoS goal. Accordingly, there remains a need for new and improved energy-efficient, SLA-based management of data centers for cloud computing.