Currently, many public cloud providers operate datacenters that are geographically distributed across many locations. Such datacenters host thousands of services, which typically run on virtual machines. These virtual machines in turn run on server end stations (e.g., computing devices), and each datacenter may have many hundreds of these server end stations. These services are provided to cloud customers on a pay-as-you-go model. Examples of services include content distribution services, cloud-based email, render farm services, etc. One large issue that cloud providers face regularly is resource fragmentation and imbalance. These resources may include processor (CPU), memory, network, storage, and other resources. Resource fragmentation and load imbalance arise when existing services are stopped, whether by cloud customers, due to server end station downtime, due to errors, or for other reasons. For example, in the event of a server end station failure, services previously running on that server end station are terminated and/or restarted on another active server end station. When the failed server end station becomes available again, it does not host any services, and thus its resources are wasted. Similarly, if cloud customers stop individual services, server end station resources are freed, creating resource fragmentation, whereby some server end stations have excess residual resource capacity and some do not. Resource fragmentation is especially problematic for cloud providers because it prevents a provider from accommodating more customers and thus reduces the provider's overall revenue.
One way to achieve improved resource utilization and load balance in the presence of stopped services and/or server failures is to perform load rebalancing. Load rebalancing aims at achieving a better load distribution by migrating already running services between servers. By performing load rebalancing, Service Level Agreement (SLA) violations caused by a shortage of resources on an over-utilized server may be avoided. SLAs define various performance guarantees between a provider and a customer. Load rebalancing may at the same time achieve a better utilized cloud environment.
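The migration-based rebalancing described above can be illustrated with a minimal greedy sketch. All names (`rebalance`, the `threshold` parameter, the representation of a server as a list of service loads) are illustrative assumptions, not part of any specific provider's method; real systems would also weigh migration costs, which this sketch ignores.

```python
# Greedy rebalancing sketch: repeatedly move the smallest service from the
# most-loaded server to the least-loaded one until the utilization gap is
# within a threshold or no move would shrink it.
def rebalance(servers, threshold=0.2):
    """servers: dict mapping server id -> list of service loads
    (each load a fraction of server capacity). Mutated in place."""
    def util(sid):
        return sum(servers[sid])

    moved = []
    while True:
        hot = max(servers, key=util)
        cold = min(servers, key=util)
        gap = util(hot) - util(cold)
        if gap <= threshold or not servers[hot]:
            break
        svc = min(servers[hot])        # cheapest candidate to migrate
        if 2 * svc >= gap:             # moving it would not shrink the gap
            break
        servers[hot].remove(svc)
        servers[cold].append(svc)
        moved.append((svc, hot, cold))
    return moved

# Example: one migration narrows the gap, then the guard stops oscillation.
servers = {'a': [0.4, 0.3, 0.2], 'b': [0.1]}
moves = rebalance(servers, threshold=0.2)
# moves -> [(0.2, 'a', 'b')]; server 'a' ends near 0.7, 'b' near 0.3
```

The `2 * svc >= gap` guard is the key design choice: it stops the loop before a migration would merely swap which server is over-utilized.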
Several approaches have been proposed for load rebalancing. Because computing an optimal load rebalancing is computationally intractable (the underlying problem is non-deterministic polynomial-time hard, or NP-hard), most related methods rely on heuristics and/or greedy algorithms. Specifically, load rebalancing algorithms can be classified as static or dynamic.
Static algorithms use prior knowledge of resources and are suitable for stable environments in which the services' resource consumption does not vary over time. The round-robin algorithm is a well-known static algorithm that allocates services on a first-come, first-served basis to (under-utilized) servers. The Central Load Balancing Decision Model is an improved round-robin algorithm in which the server end station response time is measured. If the response time is above a threshold, the server end station is considered over-utilized and the service is allocated to the next server end station.
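A short sketch of the threshold-checked round-robin idea above may clarify it. The function name, the fixed `threshold` value, and the shape of the `response_time` input are illustrative assumptions, not details of the Central Load Balancing Decision Model itself.

```python
# Round-robin placement that skips servers whose measured response time
# exceeds a threshold (i.e., servers considered over-utilized).
def place_round_robin(services, servers, response_time, threshold=0.5):
    """services: list of service ids; servers: list of server ids;
    response_time: dict server id -> measured response time (seconds).
    Returns a dict service id -> chosen server id (or None)."""
    placement = {}
    i = 0
    for svc in services:
        for _ in range(len(servers)):
            candidate = servers[i % len(servers)]
            i += 1
            if response_time[candidate] < threshold:
                placement[svc] = candidate
                break
        else:
            placement[svc] = None  # every server is above the threshold
    return placement

# Example: 's2' is over-utilized, so the rotation skips past it.
rt = {'s1': 0.1, 's2': 0.9, 's3': 0.2}
plan = place_round_robin(['a', 'b', 'c'], ['s1', 's2', 's3'], rt)
# plan -> {'a': 's1', 'b': 's3', 'c': 's1'}
```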
Dynamic algorithms take into account the changing demands of the services during execution time. One example of a dynamic algorithm is Weighted Least Connections, which takes into account the number of services already on a server when placing a new service. Another example is Exponential Smooth Forecast based on Weighted Least Connection, which uses a single exponential smoothing forecasting mechanism to predict demand in real time from the historical demand of the service. Yet another example is the Load Balance Min-Min algorithm, which uses opportunistic load balancing to allocate tasks in an attempt to keep each server end station busy; it considers the execution time of each task and allocates tasks based on remaining server end station CPU capacity, remaining memory, and transmission rate.
However, performing load rebalancing comes at the expense of service migration costs. Thus, many of the previous methods, which may not consider migration costs, may not be optimal. These previous methods may also assume that the destination server end stations (to which services are to be moved) are empty, and may not consider migration restrictions on individual services.
Thus, a better solution is desired that respects such migration restrictions and minimizes migration costs.