Virtualization technologies such as VMware and Microsoft Virtual Server are becoming prevalent in the marketplace. These technologies provide a virtual hardware abstraction to guest operating systems, allowing them to run unmodified as applications (virtual machines) in a functionally isolated environment on a host computer. Virtualization allows multiple virtual machines to run simultaneously on a single physical server (host computer), providing functional and performance isolation for processor, memory, storage, etc. among the multiple instances of virtual machines. It is common to duplicate a base virtual machine, sometimes making many copies.
Like physical machines, virtual machines have storage media such as hard disks (virtual hard disks, in the case of virtual machines), along with other peripheral devices. Typically, a virtual machine's virtual hard disk is used to store the base operating system, application programs and application data.
Typically, when a virtual machine hard disk is created, one of two methods is used. Under the pre-allocated disk method, all of the space required for the virtual hard disk is allocated up front. Under the sparse disk method, the initial hard disk contains only metadata and no actual data, and the hard disk grows as data is written to it. Once an empty virtual hard disk has been created, an operating system and application programs can be installed, and the hard disk can be put into a state ready for duplication.
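The difference between the two creation methods can be illustrated with a minimal Python sketch. The class names, the `allocated_bytes` method and the 512-byte block size are assumptions chosen for illustration; they do not correspond to any particular virtualization product.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed value for illustration


def _pad(payload):
    """Pad a payload to exactly one block; reject payloads that do not fit."""
    if len(payload) > BLOCK_SIZE:
        raise ValueError("payload larger than one block")
    return payload.ljust(BLOCK_SIZE, b"\0")


class PreallocatedDisk:
    """Hypothetical disk that reserves storage for every block at creation."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.data = bytearray(num_blocks * BLOCK_SIZE)  # all space up front

    def write(self, block, payload):
        start = block * BLOCK_SIZE
        self.data[start:start + BLOCK_SIZE] = _pad(payload)

    def read(self, block):
        start = block * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def allocated_bytes(self):
        return len(self.data)


class SparseDisk:
    """Hypothetical disk that holds only metadata until blocks are written."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = {}  # block number -> data; empty at creation

    def write(self, block, payload):
        self.blocks[block] = _pad(payload)

    def read(self, block):
        # Blocks that were never written read back as zeros.
        return self.blocks.get(block, bytes(BLOCK_SIZE))

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK_SIZE
```

In this sketch, a newly created `SparseDisk` reports zero allocated bytes and grows by one block per distinct block written, whereas a `PreallocatedDisk` reports its full size from the start.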
Operating systems are quite large. For example, a base installation of Windows 2000 requires 600 megabytes, Windows Vista requires up to 15 gigabytes, and Red Hat Linux 4 requires 200 megabytes to 4 gigabytes. Thus, common virtual machine disk sizes range from tens to hundreds of gigabytes. Due to their large size, virtual hard disks make virtual machines difficult and time-consuming to manage, duplicate, replicate, back up, move and deploy.
For example, suppose we have a virtual machine A with hard disk 1, and we want to create an identical copy of machine A to produce machine B with hard disk 2. The conventional method of duplicating the hard disk involves copying the existing hard disk bit by bit into a second virtual hard disk. This is time-consuming, and requires at least as much disk space as the original hard disk occupies. FIG. 1 illustrates the duplication of a virtual machine according to this conventional method. As illustrated, Machine A and 100 gigabyte Hard Disk 1 are copied to Machine B and 100 gigabyte Hard Disk 2.
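The conventional full copy described above can be sketched in Python as follows. The function name `duplicate_disk`, the chunk size and the in-memory stand-ins for the two hard disks are assumptions for illustration only; a real virtual hard disk would be a file or device many orders of magnitude larger.

```python
import io

CHUNK_SIZE = 1024 * 1024  # bytes read per step; an assumed value


def duplicate_disk(src, dst, chunk_size=CHUNK_SIZE):
    """Copy every byte of src to dst; return the number of bytes copied."""
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # end of the source disk
            break
        dst.write(chunk)
        copied += len(chunk)
    return copied


# In-memory stand-ins for Hard Disk 1 and Hard Disk 2 (4 KiB, not 100 GB).
hard_disk_1 = io.BytesIO(bytes(range(256)) * 16)
hard_disk_2 = io.BytesIO()
copied = duplicate_disk(hard_disk_1, hard_disk_2, chunk_size=1024)
```

Both the running time and the space consumed by the copy scale with the full size of the source disk, regardless of how much of that disk the two machines could in principle share.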
Both VMware and Microsoft virtualization technology support Redo logs for virtual hard disks. As illustrated in FIG. 2, Redo logs capture the differences between a specific base state of a hard disk and subsequent modifications made to that hard disk. The behavior of a Redo log is that write operations to a disk block are routed to the Redo log. Read operations on a disk block read the block from the Redo log if the block exists in the Redo log. Otherwise, the read operation attempts to read from the parent disk. However, when copying (or otherwise manipulating) virtual hard disks with Redo logs, the base virtual hard disk and all associated Redo logs have to be copied (or otherwise processed).
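The read and write routing of a Redo log described above can be sketched in Python as follows. The `BaseDisk` and `RedoLog` classes are hypothetical and greatly simplified; they do not reflect the on-disk format or API of VMware or Microsoft products.

```python
class BaseDisk:
    """Hypothetical parent disk holding the base state, keyed by block number."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})

    def read(self, block):
        # Returns None for a block that was never written.
        return self.blocks.get(block)


class RedoLog:
    """Hypothetical Redo log layered on a parent disk (or another Redo log)."""

    def __init__(self, parent):
        self.parent = parent
        self.blocks = {}  # only blocks modified since the base state

    def write(self, block, data):
        # Writes are routed to the Redo log; the parent is never modified.
        self.blocks[block] = data

    def read(self, block):
        # Serve the block from the Redo log if present, else from the parent.
        if block in self.blocks:
            return self.blocks[block]
        return self.parent.read(block)


base = BaseDisk({0: b"boot", 1: b"os"})
log = RedoLog(base)
log.write(1, b"patched")
```

Because `log` stores only the modified block and defers to `base` for everything else, the current disk state cannot be reconstructed from the Redo log alone, which is why the base disk and every associated Redo log must be copied together.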
What is needed are methods, computer readable media and computer systems for more efficiently copying and otherwise processing virtual hard disks.