Infrastructure as a service (IaaS) delivers computer infrastructure as a service in a virtualized computing environment. Rather than purchasing servers, software, data center space or network equipment, service subscribers and clients buy those resources as a fully outsourced service. The service is typically billed on a utility computing basis, so that the amount of resources consumed, and therefore the cost, depends on the level of subscriber activity.
In general, a server layer is configured to include computer hardware or computer software products that are specifically designed for the delivery of services to a particular subscriber. The services may include access to multi-core processors, cloud-specific operating systems and additional computing services. Some IaaS provisioning platforms provide resources for information technology communication (ITC) where availability of resources is guaranteed according to service level agreements (SLAs).
One embodiment of the IaaS paradigm is an infrastructure computing cloud (ICC). In an ICC, the subscribers purchase ITC resources in the form of virtual machines (VMs), virtual storage, or virtual networks. The subscribers are charged according to a pay-as-you-go model. The ITC resources purchased from an ICC provider by a single subscriber within the framework of a single SLA are referred to as a virtual resources set (VRS). With ICC, cloud subscribers are offered capacity on demand to match variations in workload. Accordingly, the number of VM instances in a VRS may dynamically change.
To explicitly set contractual obligations on availability of resources, ICC subscribers may specify, for every type of resource, the range of resources that is needed. That is, a subscriber may designate the minimum and maximum number of VM instances of every type of service that it may want to reserve. The values that set the reserved range for each resource are referred to as capacity ranges. The availability SLA for VMs within each range is specified by way of one or more availability service level objective (SLO) clauses.
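By way of illustration only, a capacity range with an attached availability SLO may be sketched as a small data structure. The class name CapacityRange, its field names, and the representation of the SLO as a single fraction are assumptions made for this sketch and do not come from any particular ICC platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityRange:
    """Reserved range for one resource type (hypothetical structure)."""
    resource_type: str       # e.g. a VM type name chosen by the provider
    min_instances: int       # lower bound of the reserved range
    max_instances: int       # upper bound of the reserved range
    availability_slo: float  # assumed form: fraction of in-range requests granted

    def __post_init__(self):
        # Reject ranges that cannot be reserved or SLOs outside (0, 1].
        if not 0 <= self.min_instances <= self.max_instances:
            raise ValueError("expected 0 <= min_instances <= max_instances")
        if not 0.0 < self.availability_slo <= 1.0:
            raise ValueError("availability_slo must be in (0, 1]")

    def covers(self, requested: int) -> bool:
        """True if the requested instance count lies inside the reserved range."""
        return self.min_instances <= requested <= self.max_instances


# A VRS could then carry one range per resource type.
vrs_ranges = [
    CapacityRange("small-vm", 2, 10, 0.99),
    CapacityRange("large-vm", 0, 4, 0.95),
]
```

In this sketch, a request for 5 instances of "small-vm" falls within the reserved range, whereas a request for 11 instances does not and would not be covered by the availability SLO.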
ICC providers strive to maintain a minimal capacity that is sufficient to guarantee the VRSs' SLA commitments subject to the acceptable risk level of non-compliance, as controlled by the ICC provider's business policy. This minimal capacity is referred to as the equivalent capacity. It is defined for each type of resource as the number of instances of the resource needed to support the aggregate resource demand of the services at a level of congestion that is a function of the acceptable risk level.
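One simple way to make the notion of equivalent capacity concrete is an empirical quantile over observed aggregate demand. The following is a minimal sketch under that assumption and is not necessarily the calculation an ICC provider uses. The function names, the use of historical demand samples, and the interpretation of the risk level as the tolerated fraction of samples exceeding capacity are all hypothetical.

```python
import math


def equivalent_capacity(demand_samples, risk):
    """Smallest observed capacity c such that the fraction of demand
    samples exceeding c is at most `risk` (assumed risk definition)."""
    if not demand_samples:
        raise ValueError("need at least one demand sample")
    if not 0.0 <= risk < 1.0:
        raise ValueError("risk must be in [0, 1)")
    ordered = sorted(demand_samples)
    n = len(ordered)
    # At most floor(risk * n) samples may lie above the chosen capacity.
    allowed_exceed = math.floor(risk * n)
    return ordered[n - allowed_exceed - 1]


def equivalent_capacity_vector(samples_by_type, risk):
    """Equivalent capacity per resource type, as a mapping type -> instances."""
    return {
        rtype: equivalent_capacity(samples, risk)
        for rtype, samples in samples_by_type.items()
    }


# Hypothetical aggregate demand observations (instances in use) per type.
history = {
    "small-vm": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "large-vm": [0, 1, 1, 2, 2, 2, 3, 3, 4, 6],
}
capacity = equivalent_capacity_vector(history, risk=0.1)
```

With a risk level of 0.1 and ten samples per type, one sample per type may exceed the chosen capacity, so the sketch selects the second-largest observation for each type. Setting the risk level to zero selects the largest observation.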
The equivalent capacity may be presented as a vector of resources, where for each type of resource, it is indicated how many instances of a resource are to be dynamically provisioned or deprovisioned to satisfy the calculated risk level. When system load increases, more VMs may be provisioned to prevent over-utilization. When the load subsides, some VMs may be deprovisioned to prevent under-utilization.
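The provisioning and deprovisioning behavior described above may be sketched as a threshold rule. This is an illustrative assumption only: the function name scaling_delta, the utilization thresholds, and the target utilization are hypothetical values chosen for the sketch, and load is assumed to be expressed in units of one VM's capacity.

```python
import math


def scaling_delta(current_vms, load, high=0.8, low=0.3, target=0.5):
    """Return how many VMs to provision (positive) or deprovision
    (negative) so that utilization moves toward `target`.

    `load` is the total demand measured in units of one VM's capacity.
    The thresholds are illustrative defaults, not recommended values.
    """
    if current_vms < 1:
        raise ValueError("current_vms must be at least 1")
    utilization = load / current_vms
    if low <= utilization <= high:
        return 0  # within the acceptable band, no change
    # Size the pool so that load / desired is close to the target.
    desired = max(1, math.ceil(load / target))
    return desired - current_vms


# Load rises: 10 VMs at 90% utilization are over-utilized.
grow = scaling_delta(current_vms=10, load=9.0)
# Load subsides: 10 VMs at 20% utilization are under-utilized.
shrink = scaling_delta(current_vms=10, load=2.0)
```

Using two thresholds with a target between them, rather than a single threshold, leaves a band in which no action is taken, which avoids provisioning and deprovisioning in rapid alternation when the load hovers near one value.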
As an example, VMs may be dynamically provisioned and deprovisioned in a high performance cluster (HPC) service or a virtualized data center. Since the maximal demands, in terms of number of VMs, of different services or users in the above-noted virtualized computing environments usually do not peak together, over-committing the physical capacity available to the VMs is possible.
Over-committing increases resource efficiency so that more VMs can be hosted on the same physical infrastructure than the total physical capacity would otherwise allow. Over-commitment of resources is desirable, but may be tolerated only to the extent that it does not result in excessive risk of resource congestion and non-compliance with SLAs.
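The trade-off between over-commitment and congestion risk may be illustrated with a small numeric sketch. The function names, the demand traces, and the definition of congestion risk as the fraction of time steps in which aggregate demand exceeds physical capacity are assumptions made for illustration.

```python
def overcommit_ratio(max_reserved, physical_capacity):
    """Sum of the per-service maximum reservations divided by the
    physical capacity; a value above 1 means capacity is over-committed."""
    return sum(max_reserved) / physical_capacity


def congestion_risk(demand_traces, physical_capacity):
    """Fraction of time steps in which the aggregate demand of all
    services exceeds the physical capacity (assumed risk measure)."""
    totals = [sum(step) for step in zip(*demand_traces)]
    congested = sum(1 for total in totals if total > physical_capacity)
    return congested / len(totals)


# Hypothetical VM demand of two services over four time steps.
# Their peaks (8 and 9) occur at different steps.
service_a = [8, 2, 3, 2]
service_b = [2, 9, 2, 3]
peaks = [max(service_a), max(service_b)]

ratio = overcommit_ratio(peaks, physical_capacity=12)
risk_at_12 = congestion_risk([service_a, service_b], physical_capacity=12)
risk_at_10 = congestion_risk([service_a, service_b], physical_capacity=10)
```

In this example the sum of the peaks is 17 VMs, yet the aggregate demand never exceeds 11, so a physical capacity of 12 is over-committed without any congested time step. Reducing the capacity to 10 raises the over-commitment further and leaves one of the four time steps congested.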