In the case of a (physical) machine, the software that defines the functionality of the machine can be stored on non-transitory mass-storage media, e.g., a hard disk. A hard disk is typically formatted into sectors, and an operating system typically stores data in clusters, which are contiguous groups of sectors. The operating system also typically aligns files with cluster boundaries, e.g., most files begin at a respective cluster boundary. The data physically encoded on the hard disk forms, in effect, a two-dimensional arrangement of representations of bits. This two-dimensional arrangement is often referred to as a disk image. The functionality of a computer can be transferred to another computer with identical hardware by transferring the disk image.
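For illustration only, the cluster alignment described above can be sketched in Python. The sector size (512 bytes), the cluster size (eight sectors), and the names clusters_needed and place_files are assumptions chosen for this example; actual values depend on the disk and the file system in use. The sketch shows that each file begins at a cluster boundary and that space is allocated in whole clusters.

```python
# Assumed geometry for illustration; real disks and file systems vary.
SECTOR_SIZE = 512                                  # bytes per sector (assumption)
SECTORS_PER_CLUSTER = 8                            # sectors per cluster (assumption)
CLUSTER_SIZE = SECTOR_SIZE * SECTORS_PER_CLUSTER   # 4096 bytes


def clusters_needed(file_size: int) -> int:
    """Return the number of whole clusters required to hold file_size bytes."""
    return -(-file_size // CLUSTER_SIZE)  # ceiling division


def place_files(file_sizes):
    """Lay files out back to back, each starting at a cluster boundary.

    Returns a list of (start_offset, allocated_bytes) tuples, in bytes.
    """
    offset = 0
    layout = []
    for size in file_sizes:
        allocated = clusters_needed(size) * CLUSTER_SIZE
        layout.append((offset, allocated))
        offset += allocated  # the next file starts on the next boundary
    return layout
```

Under these assumptions, place_files([1, 4096, 5000]) places the three files at byte offsets 0, 4096, and 8192, and the 5000-byte file occupies two clusters.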
Herein, “machine” refers to the hardware of a computer. A typical machine is managed by an operating system, which typically hosts a computer application. A “virtual machine” is not a machine, but is software that appears as if it were a machine to a “guest” operating system that it hosts.
As with a physical machine, the functionality of a virtual machine can be physically encoded onto a hard disk, in this case to form a virtual-machine image. Unlike the disk image of a physical machine, however, a virtual-machine image can include the virtual machine itself in addition to a guest operating system and application software. Because the virtual machine travels with its image, the functionality of a virtual machine can be transferred between machines with dissimilar hardware, as long as the machines are running compatible hypervisors (i.e., virtualizing operating systems).
The fact that virtual machines can be packaged as virtual-machine images has many advantages. For example, if a virtual machine is running up against hardware limitations of its physical machine host, its image can be cloned and the clone can be transferred to other hardware so that two instances of the virtual machine can be operated in parallel to increase throughput. Of course, this “scaling out” can be expanded to larger numbers of parallel instances of a virtual machine.
As a result of this versatility, virtual-machine images can proliferate, consuming storage capacity where they reside and bandwidth as they are transferred. Their relatively large sizes, e.g., tens of gigabytes, can tax storage and communications resources. Compression of virtual-machine images can save storage capacity and bandwidth, but the processing power that compression requires can sometimes result in a poor tradeoff between cost and benefit.
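The tradeoff between cost and benefit can be illustrated with a minimal Python sketch that measures both sides of it: the bytes saved and the processor time spent. The name measure_compression and the use of the standard zlib module are assumptions made for this example; the text above does not imply any particular compression scheme.

```python
import time
import zlib


def measure_compression(data: bytes, level: int = 6) -> dict:
    """Compress data with zlib and report the sizes and the time spent.

    Illustrative helper only; level is the zlib compression level (0 to 9).
    """
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return {
        "original_size": len(data),
        "compressed_size": len(compressed),
        "elapsed_seconds": elapsed,
    }
```

Applied to highly repetitive data, such as a block of zero bytes, the compressed size is a small fraction of the original. Applied to data with little redundancy, such as random bytes, the processing time is still spent but little or no space is saved, which is the poor tradeoff noted above.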