The term “cloud” refers to a collective computing infrastructure that implements a cloud computing paradigm. For example, as per the National Institute of Standards and Technology (NIST Special Publication No. 800-145), cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
The elastic nature of a cloud computing environment (a cloud infrastructure) thus enables resources to be acquired and released on demand in response to varying incoming workloads. As a result, tenants (e.g., customers or end users of cloud infrastructure or services) should need to pay only for the virtual resources they actually need, while their quality-of-service (QoS) requirements remain satisfied.
A set of scaling rules determines when scaling operations are triggered and how many resources are allocated or de-allocated. The operator can determine whether the system takes automatic action for these operations or simply triggers notifications (e.g., where a separate sequence is then necessary to perform the recommended operation). The service provider must first estimate the capacity required by each specific workload so that the QoS requirements specified in the service level agreement (SLA) between the tenant and the service provider are not violated, and subsequently request exactly the resources needed.
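By way of illustration only, the following minimal sketch shows one way a set of threshold-based scaling rules could be evaluated. The names ScalingRule, Decision, evaluate, and RULES, as well as the utilization thresholds, step sizes, and instance bounds, are hypothetical assumptions introduced for this example and are not taken from any particular cloud platform.

```python
from dataclasses import dataclass
from typing import Literal, Optional, Sequence


@dataclass(frozen=True)
class ScalingRule:
    """One hypothetical threshold rule: when to scale and by how much."""
    name: str
    direction: Literal["out", "in"]  # "out" allocates, "in" de-allocates
    threshold: float                 # average utilization in [0, 1]
    step: int                        # number of instances to add or remove
    automatic: bool                  # False: only notify the operator


@dataclass(frozen=True)
class Decision:
    rule: str
    action: Literal["scale", "notify"]
    target_instances: int


def evaluate(
    rules: Sequence[ScalingRule],
    utilization: float,
    current: int,
    min_instances: int = 1,
    max_instances: int = 10,
) -> Optional[Decision]:
    """Return the decision of the first rule that fires, or None."""
    for rule in rules:
        if rule.direction == "out" and utilization > rule.threshold:
            target = min(current + rule.step, max_instances)
        elif rule.direction == "in" and utilization < rule.threshold:
            target = max(current - rule.step, min_instances)
        else:
            continue  # rule condition not met
        if target == current:
            continue  # already at a capacity bound; nothing to change
        action = "scale" if rule.automatic else "notify"
        return Decision(rule.name, action, target)
    return None


# Assumed example rules: scale out automatically, scale in only on approval.
RULES = [
    ScalingRule("scale-out-on-high-load", "out", 0.80, 2, automatic=True),
    ScalingRule("scale-in-on-low-load", "in", 0.30, 1, automatic=False),
]
```

In this sketch, a rule marked automatic yields a "scale" decision that the system could act on directly, whereas a non-automatic rule yields a "notify" decision, leaving the recommended operation to be performed in a separate step by the operator.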