Virtualization is being rapidly adopted across the information technology industry. Virtualization generally allows any number of virtual machines to run on a single physical machine, with each virtual machine sharing the resources of that one physical machine. Different virtual machines can run different operating systems and multiple applications on the same physical computer. Virtualization may be implemented by inserting a layer of software directly on the computer hardware in order to provide a virtual machine monitor or “hypervisor” that allocates hardware resources of the physical computer dynamically and transparently. The hypervisor allows multiple operating systems to run concurrently on a single physical computer and to share its hardware resources with each other.
Commercially available virtualization software such as VMware® vSphere™ may be used to build complex IT infrastructure distributed across hundreds of interconnected physical computers and storage devices. Such arrangements advantageously avoid the need to assign servers, storage devices or network bandwidth permanently to each application. Instead, the available hardware resources are dynamically allocated when and where they are needed. High-priority applications can therefore be allocated the necessary resources without the expense of dedicated hardware used only at peak times.
As IT infrastructure becomes more complex and more widely distributed over larger numbers of physical and virtual machines, coordinating the operation of multiple architectural components becomes increasingly important. A significant deficiency of conventional practice in this area is the inability to perform state capture and revert functions at a consistent point in time, in a coordinated manner, for multiple related components at various layers of distributed infrastructure. A variety of well-known conventional techniques allow state capture and reversion for particular types of components, such as individual virtual machines, individual storage volumes, individual processes running inside an operating system, distributed processes interacting across multiple operating systems, or a set of virtual machines with local storage visible to the hypervisor. However, these techniques are not capable of persisting the point-in-time state of a complex asset in an unobtrusive and dynamic manner. For example, such techniques cannot provide accurate state capture and reversion for a complex asset that includes any number of virtual machines as well as one or more associated external storage volumes that are not visible to or controllable by the hypervisor.
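The coverage gap described above can be illustrated with a minimal sketch. It is offered only as an illustration under stated assumptions and does not represent any particular product or prior-art system: the names Component, ComplexAsset, hypervisor_only_capture and uncaptured are hypothetical, and the model reduces each component to a single flag indicating whether the hypervisor can see it.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Component:
    """One element of a complex asset (hypothetical model)."""
    name: str
    kind: str                 # e.g. "vm" or "external_volume"
    hypervisor_visible: bool  # can the hypervisor see and control it?


@dataclass
class ComplexAsset:
    """A group of related components whose state should be captured together."""
    components: List[Component] = field(default_factory=list)


def hypervisor_only_capture(asset: ComplexAsset) -> Set[str]:
    """Names of components that a hypervisor-only capture could cover."""
    return {c.name for c in asset.components if c.hypervisor_visible}


def uncaptured(asset: ComplexAsset) -> Set[str]:
    """Names of components left out of a hypervisor-only capture."""
    all_names = {c.name for c in asset.components}
    return all_names - hypervisor_only_capture(asset)


# Illustrative asset: two virtual machines plus one external storage
# volume hosted on a dedicated storage platform (names are made up).
example_asset = ComplexAsset([
    Component("vm-1", "vm", True),
    Component("vm-2", "vm", True),
    Component("ext-vol-1", "external_volume", False),
])
```

In this sketch, calling uncaptured(example_asset) returns only the external volume, which is the component a hypervisor-only technique would leave out of the captured state.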
An example of a conventional arrangement of the type noted above is the VIOLIN system described in X. Jiang and D. Xu, “VIOLIN: Virtual Internetworking on OverLay INfrastructure,” Purdue University, Laboratory for Research in Emerging Network and Distributed Systems, Jul. 2003. See also A. Kangarlou et al., “Taking snapshots of virtual networked environments,” Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing, Reno, Nev., 2007, and A. Kangarlou et al., “VNsnap: Taking snapshots of virtual networked environments with minimal downtime,” IEEE/IFIP International Conference on Dependable Systems & Networks, 2009 (DSN '09), pp. 524-533. This particular capture and revert arrangement has a number of important drawbacks. For example, it addresses capturing and reverting state only for virtual machines, and therefore does not capture and revert state for external storage that is mounted by virtual machines but not visible to or controllable by the hypervisor. Because it relies solely on the hypervisor capability for virtual machine state capture and revert, the VIOLIN system is unable to create a consistent point-in-time state for all virtual machines and any associated external storage volumes hosted on dedicated storage platforms. Also, the VIOLIN system requires the use of custom virtual switches, and is therefore not applicable to generic distributed infrastructures.