One of the most attractive features of cloud computing is elasticity: the ability to dynamically acquire or release resources based on demand. In other words, applications pay only for the resources they use and can make scaling decisions without human intervention.
However, achieving efficient auto-scaling poses three key challenges. First, cloud systems cannot respond quickly to increased demand, because launching new instances incurs a significant start-up delay (on the order of tens of minutes). Second, many applications exhibit bursty arrivals and non-uniform task execution times, which make resource demand hard to anticipate. Third, unexpected delays and provisioning failures, which arise from the distributed execution environment of the cloud, can reduce the efficiency of auto-scaling operations.