Many web applications, such as those that provide data services, use large amounts of provisioned network resources. These applications and data services may be run on cloud computing resources to service client requests. For example, Amazon® Elastic Compute Cloud (Amazon EC2®) is a cloud-based service that supports enterprise data services by providing variable computing capacity for a fee. It is also feasible to provision computing resources within an enterprise's own network hardware and re-provision them as needed, and/or to supplement enterprise network hardware with cloud computing resources when enterprise network capacity is exceeded.
The provisioning and deprovisioning of computing resources can yield significant cost savings, because the load handled by a data service can be highly variable, such as peaking at particular times of the month when payroll is processed, during a peak shopping season, and so on. Thus, provisioning and deprovisioning technology attempts to match provisioned computing resources to current needs. For example, Amazon® has a concept of an Auto Scaling Group (ASG) in its cloud system, which automatically provisions (scales up) additional EC2® resource instances after detecting increases in certain traffic/load-related metrics, such as CPU or memory utilization. Deprovisioning is similarly automatic as load decreases.
However, contemporary provisioning technology, including EC2®, reacts to events as they transpire, including events that indicate increased traffic and load on software services. When new resource instances need to be provisioned, it takes a few minutes for these additional instances to “spin up” to handle the increase in load. More particularly, the total time taken for provisioning new resources (from the time of the increased traffic, to the availability of metrics showing the increase, to the system's decision to scale up, to the in-service availability of the additional resources) is on the order of minutes. During this scaling-up time period, the service or system is often unable to handle the full load.
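The effect of this reaction delay can be illustrated with a minimal simulation sketch. It is not the behavior of EC2® or any ASG implementation; the function name, the 80 percent utilization threshold, the per-instance capacity of 100 load units, and the three-tick spin-up delay are all hypothetical values chosen only to show how load goes unserved between the increase in traffic and the in-service availability of new instances.

```python
def simulate_reactive_scaling(load, capacity_per_instance=100.0,
                              scale_up_threshold=0.8, spin_up_delay=3,
                              initial_instances=1):
    """Return the load left unserved at each tick under purely reactive scaling.

    All parameters are illustrative assumptions, not values from a real service.
    """
    instances = initial_instances
    pending = []   # ticks at which requested instances enter service
    unserved = []
    for tick, demand in enumerate(load):
        # Instances whose spin-up delay has elapsed join the in-service pool.
        instances += sum(1 for ready in pending if ready <= tick)
        pending = [ready for ready in pending if ready > tick]

        capacity = instances * capacity_per_instance
        unserved.append(max(0.0, demand - capacity))

        # React only after the metric for this tick shows the increase,
        # counting instances already requested to avoid over-requesting.
        planned_capacity = (instances + len(pending)) * capacity_per_instance
        if demand / planned_capacity > scale_up_threshold:
            pending.append(tick + spin_up_delay)
    return unserved


if __name__ == "__main__":
    # Hypothetical load trace: traffic quadruples at tick 2.
    load = [50, 50, 200, 200, 200, 200, 200, 200]
    print(simulate_reactive_scaling(load))
```

With this sample trace, the scaler requests capacity at the tick the increase is observed, yet 100 load units remain unserved for three consecutive ticks until the first additional instance enters service.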
An alternative approach is to provision as many computing resources as needed to handle peak load, and leave these resources in place (over-provision) during periods of low traffic. For example, in Amazon's DynamoDB®, there is no automatic scaling built into the system. The DynamoDB® technology instead relies on the client to provision sufficient read and write capacity to handle peak load, and generally to leave this peak provisioning in place during periods of low traffic (although some client-controlled reduction in capacity is available to a limited extent). Over-provisioning wastes resources and costs money, whether an enterprise has to buy additional internal network capacity to over-provision resources, or pay for dedicated external cloud computing resources that often go unused.
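The cost of leaving peak provisioning in place can be made concrete with a short arithmetic sketch. The function name, the hourly load figures, and the per-instance capacity below are hypothetical and serve only to compare the instance-hours consumed by peak provisioning against provisioning that is matched to each hour's load.

```python
import math


def instance_hours(hourly_load, capacity_per_instance, peak_provisioned):
    """Total instance-hours needed to serve an hourly load trace.

    peak_provisioned=True keeps enough instances for the highest hour
    running the whole time; False sizes the pool to each hour's load.
    """
    if peak_provisioned:
        needed = math.ceil(max(hourly_load) / capacity_per_instance)
        return needed * len(hourly_load)
    return sum(math.ceil(load / capacity_per_instance) for load in hourly_load)


if __name__ == "__main__":
    # Hypothetical trace: one peak hour among five quiet hours.
    load = [100, 100, 100, 900, 100, 100]
    peak = instance_hours(load, 100, peak_provisioned=True)
    matched = instance_hours(load, 100, peak_provisioned=False)
    print(peak, matched, peak - matched)
```

For this sample trace, peak provisioning consumes 54 instance-hours while matched provisioning consumes 14, so 40 instance-hours are paid for but go unused.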