Computer virtualization is a technique that involves encapsulating a physical computing machine platform into a virtual machine that is executed under the control of virtualization software on a hardware computing platform, or “host.” A virtual machine has both virtual system hardware and guest operating system software. Virtual system hardware typically includes at least one “virtual disk,” which is represented as a single file or a set of files in the host's file system and appears as a typical storage drive to the guest operating system. The virtual disk may be stored on the host platform's local storage device (if any) or on a remote storage device. Typically, a virtual machine uses the virtual disk in the same manner that a physical storage drive is used: to store the guest operating system, application programs, and application data.
A snapshot of the virtual disk can be taken at a given point in time to preserve the content within the virtual disk at that point in time, referred to herein as a “point in time (PIT) copy of the virtual disk.” Once a snapshot of a virtual disk is created, subsequent writes received from the guest operating system to the virtual disk are captured in a “delta disk” so that the preserved content, i.e., the base PIT copy, is not modified. The delta disk is an additional file associated with the virtual disk. At any given time, the delta disk represents the difference between the current state of the virtual disk and the state at the time of the previous snapshot. Thus, the base PIT copy remains intact and can be reverted to or used as a base template to create writable virtual disk clones. Multiple PIT copies of the virtual disk can be created at various points in time by creating snapshots of snapshots. Each snapshot corresponds to a separate delta disk that is overlaid on a previous delta disk.
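The redirection of writes into a delta disk described above can be illustrated with a minimal sketch. It is a toy in-memory model rather than any particular virtualization product's implementation: the class name `SnapshotDisk`, its methods, and the use of dictionaries keyed by block number are all assumptions made for illustration.

```python
class SnapshotDisk:
    """Toy virtual disk: a base layer plus zero or more delta layers.

    Hypothetical model for illustration only. Each layer is a dict that
    maps a block number to the data written to that block.
    """

    def __init__(self):
        # layers[0] is the base disk; every later entry is a delta disk.
        self.layers = [{}]

    def snapshot(self):
        """Preserve the current content and return an id for the PIT copy."""
        # Stack a new, empty delta on top; the layers below become read-only.
        self.layers.append({})
        return len(self.layers) - 2  # index of the topmost preserved layer

    def write(self, block, data):
        # Writes always land in the topmost layer, so PIT copies stay intact.
        self.layers[-1][block] = data

    def read(self, block):
        """Return the newest data for a block, or None if never written."""
        for layer in reversed(self.layers):
            if block in layer:
                return layer[block]
        return None

    def read_at(self, snapshot_id, block):
        """Return the block as it was when the given snapshot was taken."""
        for layer in reversed(self.layers[: snapshot_id + 1]):
            if block in layer:
                return layer[block]
        return None


disk = SnapshotDisk()
disk.write(0, "v1")
pit = disk.snapshot()      # base PIT copy now holds block 0 = "v1"
disk.write(0, "v2")        # captured in the delta disk, not the base
current = disk.read(0)
preserved = disk.read_at(pit, 0)
```

In this sketch, `current` reflects the write made after the snapshot, while `preserved` still returns the content of the base PIT copy.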
Creating multiple snapshots of a virtual disk results in a long chain of delta disks, each corresponding to a snapshot of the virtual disk. Every read IO operation to the virtual disk may have to traverse each delta disk associated with the virtual disk to locate the latest copy of the requested data. Therefore, an increased number of delta disks negatively impacts the performance of read IO operations to the virtual disk. Performance of such IO operations may be improved when redundant delta disks are consolidated to reduce the number of delta disks in a given chain. Redundant delta disks are associated with PIT copies of the virtual disk that are no longer needed. For example, a PIT copy of the virtual disk may be created for backup or testing purposes and becomes redundant upon backup completion or when the testing is successful.
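The effect of chain length on read IO operations can be illustrated with a short sketch. It assumes, purely for illustration, that each disk in the chain is a dictionary mapping block numbers to data and that the chain is ordered from the base disk to the newest delta disk; the function name `read_block` is hypothetical.

```python
def read_block(chain, block):
    """Look up a block in a chain of disks ordered oldest (base) first.

    Returns a tuple (data, layers_checked). The search starts at the newest
    delta disk and walks back toward the base until the block is found.
    """
    layers_checked = 0
    for layer in reversed(chain):
        layers_checked += 1
        if block in layer:
            return layer[block], layers_checked
    return None, layers_checked  # block was never written


base = {0: "a", 1: "b"}

# No snapshots: a read is satisfied by the base disk directly.
short_result = read_block([base], 0)

# Four snapshots that never touched block 0: the same read now has to
# inspect every delta disk before it reaches the base disk.
long_chain = [base, {}, {1: "B"}, {}, {}]
long_result = read_block(long_chain, 0)
recent_result = read_block(long_chain, 1)
```

Under these assumptions, the read of block 0 inspects one layer in the short chain and five layers in the long chain, whereas block 1, which was rewritten in a later delta disk, is found after three layers.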
Delta disks are consolidated by merging PIT copies such that a particular delta disk can be deleted. Merging the PIT copies typically involves copying data out of the delta disk to be deleted (the “source delta disk”) into the primary disk or an adjacent delta disk (either, referred to generally as the “destination delta disk”). Copying data in such a manner from the source delta disk to the destination delta disk involves data movement operations that consume a significant amount of IO and CPU resources. As the size of the data in the source delta disk increases, the data movement operations that are necessary to consolidate two delta disks become very IO intensive. Thus, the IO performance for the virtual disk as a whole degrades drastically while a delta disk consolidation operation is in progress.
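The copy-based consolidation described above can be sketched as follows. The sketch assumes a toy model in which each disk is a dictionary mapping block numbers to data and the chain is ordered from the base disk to the newest delta disk; the function name `consolidate` and the returned copy count are hypothetical and serve only to make the data movement cost visible.

```python
def consolidate(chain, index):
    """Merge the delta disk at chain[index] into the adjacent older disk.

    chain is ordered oldest (base) first. The source delta disk is removed
    from the chain once its blocks have been copied. Returns the number of
    blocks copied, which stands in for the IO cost of the operation.
    """
    if not 1 <= index < len(chain):
        raise ValueError("index must refer to a delta disk, not the base")

    source = chain[index]
    destination = chain[index - 1]

    blocks_copied = 0
    for block, data in source.items():
        # The source is newer, so its data replaces any older copy.
        destination[block] = data
        blocks_copied += 1

    del chain[index]  # the source delta disk can now be deleted
    return blocks_copied


chain = [{0: "a", 1: "b"}, {1: "B", 2: "c"}, {3: "d"}]
cost = consolidate(chain, 1)
```

After the call, the chain holds two disks instead of three and the content visible to the guest is unchanged, but every block held by the source delta disk had to be copied. In this model the cost grows linearly with the amount of data in the source delta disk, which is the overhead the paragraph above attributes to conventional consolidation.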
As the foregoing illustrates, what is needed in the art is a mechanism for consolidating delta disks with minimal impact on the performance of IO operations to the virtual disk and with minimal data transfer overhead.