The present invention relates generally to cloud computing and relates more specifically to the provisioning of virtual machines in the cloud.
A virtual machine is a software implementation of a machine (e.g., a computer) that executes programs like a physical machine. When a new virtual machine instance is to be provisioned in a cloud containing multiple hypervisor host machines, one must determine which of the host machines is best suited to host the new instance.
Typical placement algorithms identify the best suited host machine based on resource availability at the host machine (e.g., central processing unit, disk, bandwidth, and/or memory availability). For instance, a placement algorithm may divide each host machine into a fixed number of “slots” (i.e., each slot comprising a certain number of cores and a certain amount of memory) and allocate virtual machine instances to free slots (e.g., based on round robin, lowest slot number first, or other allocation schemes).
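By way of illustration only, the slot-based allocation described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation of any particular placement algorithm: the names Host, place_lowest_slot_first, and RoundRobinPlacer are hypothetical, each slot is assumed to represent an identical bundle of cores and memory, and only the two allocation schemes named above are shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Host:
    """Hypothetical hypervisor host divided into a fixed number of equal slots."""
    name: str
    num_slots: int
    # slots[i] holds the id of the instance occupying slot i, or None if free
    slots: List[Optional[str]] = field(init=False)

    def __post_init__(self) -> None:
        self.slots = [None] * self.num_slots

    def first_free_slot(self) -> Optional[int]:
        """Return the lowest-numbered free slot, or None if the host is full."""
        for index, occupant in enumerate(self.slots):
            if occupant is None:
                return index
        return None


def place_lowest_slot_first(hosts: List[Host], vm_id: str) -> Optional[Tuple[str, int]]:
    """Scan hosts in list order and fill the lowest-numbered free slot found."""
    for host in hosts:
        slot = host.first_free_slot()
        if slot is not None:
            host.slots[slot] = vm_id
            return host.name, slot
    return None  # no free slot on any host


class RoundRobinPlacer:
    """Rotate the starting host after each placement, skipping full hosts."""

    def __init__(self, hosts: List[Host]) -> None:
        self.hosts = hosts
        self.next_index = 0  # host to try first on the next placement

    def place(self, vm_id: str) -> Optional[Tuple[str, int]]:
        count = len(self.hosts)
        for offset in range(count):
            index = (self.next_index + offset) % count
            host = self.hosts[index]
            slot = host.first_free_slot()
            if slot is not None:
                host.slots[slot] = vm_id
                self.next_index = (index + 1) % count
                return host.name, slot
        return None  # no free slot on any host
```

Both schemes in this sketch consider only whether a slot is free. Neither accounts for the cost of moving the virtual machine image to the selected host machine.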
Once a target host machine is selected, the virtual machine instance is provisioned by first copying the virtual machine image from a storage server to the target host machine. This process consumes network and storage server bandwidth and adds latency to the provisioning process. Notably, virtual machine provisioning time is a key metric of cloud elasticity (i.e., the ability to handle sudden, unanticipated, and extraordinary loads), and cost minimization is closely tied to resource usage.