1. Technical Field
The present invention relates generally to managing virtual hard disk snapshots, and in particular, to a computer implemented method for creating and managing a hypervisor agnostic virtual hard disk.
2. Description of Related Art
Hypervisors may be utilized to provide a virtual environment suitable for allowing multiple operating systems to run on a host system such as a server. A hypervisor, also referred to as a virtual machine manager, provides a virtual machine for each guest operating system, thereby allowing each guest operating system to operate independently of other guest operating systems. Each virtual machine may be allocated an apparent processor, memory, virtual hard disk, and other resources by the hypervisor. Although each guest operating system appears to have the host system processor, memory, hard disk and resources to itself, the hypervisor is actually controlling the host processor(s), memory, hard disk and other resources and allocating what is needed for each guest operating system to operate independently without disrupting each other.
A hypervisor may be a native or bare-metal hypervisor, also known as Type 1, or a hosted hypervisor, also known as Type 2. A bare-metal hypervisor does not require installation of a host operating system. With bare-metal virtualization, the hypervisor has direct access to hardware resources, which typically results in better performance, scalability and stability. However, with bare-metal virtualization, hardware support is typically more limited because the hypervisor may have a limited set of hardware drivers. A hosted hypervisor requires installation of a host operating system. This allows multiple hypervisors to run concurrently on the host operating system. This also generally provides better hardware compatibility than a bare-metal hypervisor because the host operating system is responsible for the hardware drivers. However, because a hosted hypervisor does not have direct access to hardware and must go through the host operating system, there is greater overhead which can degrade virtual machine performance. In addition, there may be resource allocation issues due to other hypervisors, services or applications running on the host operating system, further degrading virtual machine performance.
For data protection, a virtual machine may be allocated a snapshot of a virtual hard disk by a hypervisor. A virtual machine snapshot is a file-based image of the state, hard disk data, and configuration of a virtual machine at a specific point in time. The use of a snapshot provides greater data protection and rapid restoration of the virtual hard disk if needed. A snapshot of a virtual machine hard disk can be taken at any time, even while the virtual machine is running. The virtual machine can then be reverted to a previous state by applying a snapshot of that previous state to the virtual machine.
FIG. 1 is a block diagram of multiple snapshots organized as a virtual hard disk 100 in accordance with the prior art. Multiple snapshots may be organized in a tree structure. A virtual machine may be reverted to the state of processing that existed at the time any one of the snapshots was generated.
In this example, a base snapshot 110 was generated at time T0. At time T1, a subsequent snapshot 120 is generated. Initially, snapshot 120 may be just a shell if copy-on-write (CoW) snapshot technology is utilized. Other types of snapshot technology may utilize other types of data storage systems. The hypervisor tracks that T1 is a derivative of T0. The manner of this tracking depends on the type of hypervisor. At time T2, another snapshot 125 is generated. The hypervisor tracks that T2 is a derivative of T1 which was a derivative of T0. Subsequent to time T2, a user or other controlling entity reverts the virtual machine to snapshot 120 and starts processing again. Then at time T3, another snapshot 130 is generated. The hypervisor tracks that T3 is a derivative of T1 which was a derivative of T0. Subsequently at time T4, another snapshot 135 is generated. The hypervisor then tracks that T4 is a derivative of T3 which was a derivative of T1 which was a derivative of T0.
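The hierarchy tracking described for FIG. 1 can be illustrated with a minimal sketch. The source states that the manner of tracking depends on the type of hypervisor, so the parent-pointer structure and the names used here (SnapshotNode, SnapshotTree, take_snapshot, revert_to, lineage) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SnapshotNode:
    """One snapshot in the hierarchy (hypothetical structure)."""
    snapshot_id: int
    created_at: str
    parent: Optional[int] = None  # None marks the base snapshot


class SnapshotTree:
    """Tracks which snapshot each new snapshot is derived from."""

    def __init__(self) -> None:
        self.nodes: Dict[int, SnapshotNode] = {}
        self.current: Optional[int] = None  # snapshot currently in use

    def take_snapshot(self, snapshot_id: int, created_at: str) -> None:
        # A new snapshot is a derivative of the snapshot currently in use.
        self.nodes[snapshot_id] = SnapshotNode(snapshot_id, created_at, self.current)
        self.current = snapshot_id

    def revert_to(self, snapshot_id: int) -> None:
        if snapshot_id not in self.nodes:
            raise KeyError(f"unknown snapshot {snapshot_id}")
        self.current = snapshot_id

    def lineage(self, snapshot_id: int) -> List[int]:
        """Return the chain of snapshot ids from the base to snapshot_id."""
        chain: List[int] = []
        node_id: Optional[int] = snapshot_id
        while node_id is not None:
            chain.append(node_id)
            node_id = self.nodes[node_id].parent
        chain.reverse()
        return chain


# Replay the sequence of events described for FIG. 1.
tree = SnapshotTree()
tree.take_snapshot(110, "T0")
tree.take_snapshot(120, "T1")
tree.take_snapshot(125, "T2")
tree.revert_to(120)            # return to snapshot 120 after time T2
tree.take_snapshot(130, "T3")
tree.take_snapshot(135, "T4")

lineage_t2 = tree.lineage(125)  # T2 <- T1 <- T0
lineage_t4 = tree.lineage(135)  # T4 <- T3 <- T1 <- T0
```

In this sketch, reverting only moves the pointer to the snapshot in use, so snapshot 125 remains in the tree as a separate branch under snapshot 120.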
In addition to the tracking of the hierarchy of the snapshots, various pointers may be utilized by a snapshot technology (ST) such as copy-on-write (CoW) snapshot technology. For example, in CoW snapshot technology, pointers are utilized by a CoW layer to manage the location of data among snapshots. When data in T1 snapshot 120 is being updated (prior to time T2 or time T3), the underlying data may actually be stored back in the base snapshot. As a result, when that underlying data is updated, the original data may be copied to T1 snapshot 120 before the updated data is written to the base snapshot. All of this is managed by an ST layer (CoW in this example) without requiring the intervention or management of the hypervisor. However, the hypervisor does keep track of the snapshot hierarchy and which snapshot is being utilized at any given time, and that information is provided to the ST layer as needed.
If a user or other controlling entity wants to revert back to snapshot 130, then the hypervisor will generate a configuration 140 based on snapshot 130, snapshot 120 and base snapshot 110. In a first step 150, the hypervisor assembles the snapshots needed (snapshots 110, 120 and 130 in this case) from the tracked snapshot hierarchy and builds a virtual disk based on those snapshots. In a second step 155, the hypervisor boots the virtual machine guest operating system (guest OS). Thirdly, in step 160, the booted guest OS utilizes the virtual disk from step 150 and processing continues.
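Step 150 can be illustrated with a short sketch. It assumes a simplified differencing model in which each snapshot is a layer holding only the blocks that differ from its parent, which is not necessarily how any particular hypervisor or snapshot technology stores data. The function name assemble_virtual_disk and the block contents are hypothetical.

```python
from typing import Dict, List

# A layer maps a block number to that block's contents (assumed model).
Layer = Dict[int, bytes]


def assemble_virtual_disk(chain: List[Layer]) -> Dict[int, bytes]:
    """Overlay layers from the base snapshot to the target snapshot.

    Later layers take precedence, so a block present in a derived
    snapshot hides the same block in its ancestors.
    """
    disk: Dict[int, bytes] = {}
    for layer in chain:  # base snapshot first, target snapshot last
        disk.update(layer)
    return disk


# Hypothetical contents for snapshots 110, 120 and 130 of FIG. 1.
base_110: Layer = {0: b"boot", 1: b"data-v0"}
snap_120: Layer = {1: b"data-v1"}
snap_130: Layer = {2: b"log-v3"}

# Configuration 140: a virtual disk built from snapshots 110, 120 and 130.
virtual_disk = assemble_virtual_disk([base_110, snap_120, snap_130])
```

Snapshots 125 and 135 are not on the path from the base snapshot to snapshot 130, so they play no part in this configuration.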
There are several implementations of snapshot technology (ST) including copy-on-write (CoW), redirect-on-write, split mirror, etc.
With copy-on-write, a snapshot is created using only meta-data (i.e., pointers) about where the original data is stored in blocks of virtual disk memory. As a result, no physical copy of the underlying original data is made at snapshot creation. An ST layer invoked by the hypervisor utilizes a set of pointers to track any writes intended for the original data blocks indicated by the meta-data pointers. To prevent overwrites of original data, the data at those original block locations is copied to the appropriate snapshot just before any writes of new data to those block locations are performed. As a result, an original data block is preserved by copy to the appropriate snapshot just prior to new data being written to that original data block, hence the term copy-on-write. A copy-on-write snapshot can be created nearly instantly, but the first write to each original data block location after the snapshot requires two writes, one of the original data to the snapshot and one of the new data to the original data block.
Redirect-on-write is similar to copy-on-write except the new data to be written is redirected to be written on the snapshot. That is, the new data is stored in the snapshot while the old data continues to be stored in the original location. As a result, only one write is needed when new data is being written. However, the original copy contains the point in time data of the snapshot and the snapshot contains the new data. As a result, the two need to be reconciled, particularly when the snapshot is being deleted. In addition, as multiple snapshots are created, access to the data becomes more complicated, particularly at reconciliation.
Split mirror creates a clone copy of the original data in a separate location. While the creation of the split mirror takes time and more space is required, the clone copy is highly available for separate processing.
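The difference in write cost between copy-on-write and redirect-on-write can be sketched as follows. This is a simplified in-memory model under stated assumptions: a single snapshot, a fixed set of blocks, and a counter that stands in for physical writes. The class and method names are hypothetical and do not correspond to any particular ST layer.

```python
from typing import Dict, Optional


class CopyOnWriteVolume:
    """Preserves the original block in the snapshot before overwriting it."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.original = dict(blocks)
        self.snapshot: Optional[Dict[int, bytes]] = None
        self.write_count = 0  # stands in for physical writes

    def take_snapshot(self) -> None:
        self.snapshot = {}  # meta-data only; no data is copied here

    def write(self, block: int, data: bytes) -> None:
        if block not in self.original:
            raise KeyError(f"unknown block {block}")
        if self.snapshot is not None and block not in self.snapshot:
            self.snapshot[block] = self.original[block]  # first write: copy old data
            self.write_count += 1
        self.original[block] = data  # then write the new data in place
        self.write_count += 1

    def read_snapshot(self, block: int) -> bytes:
        if self.snapshot is None:
            raise RuntimeError("no snapshot has been taken")
        return self.snapshot.get(block, self.original[block])


class RedirectOnWriteVolume:
    """Leaves old data in place and stores new data in the snapshot."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.original = dict(blocks)
        self.snapshot: Optional[Dict[int, bytes]] = None
        self.write_count = 0

    def take_snapshot(self) -> None:
        self.snapshot = {}

    def write(self, block: int, data: bytes) -> None:
        if block not in self.original:
            raise KeyError(f"unknown block {block}")
        if self.snapshot is None:
            self.original[block] = data
        else:
            self.snapshot[block] = data  # new data is redirected
        self.write_count += 1

    def read_current(self, block: int) -> bytes:
        if self.snapshot is not None and block in self.snapshot:
            return self.snapshot[block]
        return self.original[block]

    def read_snapshot(self, block: int) -> bytes:
        return self.original[block]  # point-in-time data stays in place

    def delete_snapshot(self) -> None:
        # Reconciliation: redirected blocks are merged back into the original.
        if self.snapshot is not None:
            self.original.update(self.snapshot)
            self.write_count += len(self.snapshot)
            self.snapshot = None


cow = CopyOnWriteVolume({0: b"A", 1: b"B"})
cow.take_snapshot()
cow.write(0, b"A2")  # two writes: copy b"A" to the snapshot, then write b"A2"

row = RedirectOnWriteVolume({0: b"A", 1: b"B"})
row.take_snapshot()
row.write(0, b"A2")  # one write: b"A2" goes to the snapshot
```

In this model, the copy-on-write volume pays the extra write only on the first update of a block after the snapshot, while the redirect-on-write volume defers its extra work to delete_snapshot, where the redirected blocks are reconciled with the original data.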
There are other types of snapshots available including different variations of each snapshot technology described above, each with various advantages and disadvantages depending on the application.
It may be desirable to move a virtual machine from one virtual environment (e.g., a hypervisor) to another. For example, an enterprise may wish to move a virtual machine from one cloud implementation to a different cloud implementation. However, each virtual environment may utilize a different hypervisor, a different snapshot technology, and a different configuration system for implementing a virtual hard disk. As a result, any snapshot of a virtual hard disk under one hypervisor must go through a conversion process, including mapping the differences between the virtual environments, before that virtual disk can be used by the virtual machine under a different hypervisor.