Virtualization is a desirable technology in many contemporary datacenter and cloud computing infrastructures. In general, virtualization provides higher server utilization by running multiple applications in isolated containers (Virtual Machines or VMs) over a thin Virtual Machine Monitor (VMM) or Hypervisor (Hyper-V) layer. The Hyper-V virtualizes the resources of the machine so as to give each VM container the illusion that it is the only operating system running on the server. In actuality, each container may run applications over its own operating system, which may differ from container to container.
Virtualization involves multiplexing CPU, memory, and storage resources across VMs, and much of the design work in the virtualization area considers how to perform such multiplexing in a performant and resource-efficient way. CPU is virtualized using scheduling within and across cores. Memory is often allocated per VM and shared using dynamic memory management techniques. Disk storage is more difficult to virtualize because of interference between VMs that results from disk head seeks on the same spindle. Moreover, when storage is accessed across the network (as in a separate storage cluster or across the WAN in a public cloud), access to storage also incurs network latency and is subject to network bandwidth constraints, because the datacenter network is shared across many applications and is often over-subscribed.
A virtual hard disk (VHD) file comprises file content that appears to each virtual machine as if it is the virtual machine's own hard drive. One attempt to make storage more efficient in size and access is to use read-only VHDs that use “gold” master images as their underlying content, and then track individual chains of differences/deltas as the hard drive content changes over time. This is undesirable for various reasons.
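The gold-image-plus-deltas arrangement described above can be illustrated with a minimal sketch. It is not the VHD on-disk format; it only models the idea of a read-only base layer shared by several virtual machines, with each machine's writes captured in its own difference layer. The names `DiskLayer`, `make_gold_image`, and `BLOCK_SIZE` are hypothetical and chosen for illustration only.

```python
from typing import Dict, Optional, Tuple

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this sketch


class DiskLayer:
    """One layer in a chain: a sparse map of block number -> block contents."""

    def __init__(self, parent: Optional["DiskLayer"] = None) -> None:
        self.parent = parent          # next layer toward the gold image, if any
        self.read_only = False
        self.blocks: Dict[int, bytes] = {}

    def write_block(self, number: int, data: bytes) -> None:
        # Writes always land in this layer; parent layers are never modified.
        if self.read_only:
            raise PermissionError("layer is read-only")
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.blocks[number] = data

    def read_block(self, number: int) -> Tuple[bytes, int]:
        # Walk from the newest layer toward the gold image and return the
        # first copy found, together with the number of layers searched.
        searched = 0
        layer: Optional[DiskLayer] = self
        while layer is not None:
            searched += 1
            if number in layer.blocks:
                return layer.blocks[number], searched
            layer = layer.parent
        # A block that was never written anywhere in the chain reads as zeros.
        return bytes(BLOCK_SIZE), searched


def make_gold_image(blocks: Dict[int, bytes]) -> DiskLayer:
    """Build a base layer from the given blocks, then freeze it."""
    gold = DiskLayer()
    for number, data in blocks.items():
        gold.write_block(number, data)
    gold.read_only = True
    return gold
```

In this sketch, two virtual machines created as `DiskLayer(parent=gold)` share every unmodified block of the gold image, and a write by one is invisible to the other. A read may have to visit every layer in the chain before it finds the block, so the lookup work grows with the length of the chain.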
One example scenario is Virtual Desktop Infrastructure (VDI), where virtual desktops boot and run off VHDs. Even with gold images and deltas, the often poor virtualized storage performance for VDI can cause slow boot/login performance and reduced application performance, which degrades the user experience. For example, when many users reboot or log in at around the same time, such as just after lunch, it can take considerable time (on the order of several minutes) for a user to have a reasonably functioning machine.