Virtualization allows multiple virtual machines (VMs) to be run on a single physical machine, with each virtual machine sharing the resources of that one physical computer across multiple environments. Different virtual machines can run different operating systems and multiple applications on the same physical computer. The ability of virtual machines to run on top of different operating systems and related components introduces the possibility that one physical computer might host multiple heterogeneous operating systems and related components. Such heterogeneity is sometimes found across multiple clusters. In some cases disaster recovery regimes are implemented using a first cluster running a first set of operating systems (and related components) and a second cluster running a second set of operating systems (and related components).
Techniques abound to send storage data from one cluster to another cluster (e.g., to implement an offsite storage facility), and techniques exist to send storage data from a first cluster running a first set of operating systems and related components to a second cluster running a second set of operating systems and related components.
Unfortunately, heterogeneous combinations of system components at a source site and system components at a target site are not always compatible. Components (e.g., storage facilities) found at a source site might differ from components found at a target site, which introduces the need to facilitate disaster recovery (DR) and rollback between heterogeneous sites. Indeed, the rapid adoption of virtual machines and corresponding systems has brought about the need to provide DR between such heterogeneous distributed storage systems.
The nature of heterogeneous distributed storage systems introduces new problems to be solved, and legacy techniques for implementing DR capabilities (e.g., failover and failback) in homogeneous distributed storage systems fail to provide the desired set of functions in heterogeneous distributed storage system scenarios. Specifically, for example, legacy DR techniques are often unable to perform in environments where a source site operates using a first vendor's storage system, and the target site operates using a second vendor's storage system. In many cases, legacy DR implementations rely on access to the storage I/O (input/output or IO) path at both the source site and the target site to transport data snapshots and/or “deltas” (e.g., changes between snapshots) between the sites to support certain DR operations and/or scenarios.
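As an illustration only, the notion of a “delta” as the set of changes between two snapshots can be sketched as follows. The sketch assumes a snapshot can be modeled as a mapping from block number to block contents; that model, and the names `Delta`, `compute_delta`, and `apply_delta`, are hypothetical and do not correspond to any particular vendor's snapshot format or API.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical model: a snapshot maps a block number to that block's contents.
Snapshot = Dict[int, bytes]


@dataclass
class Delta:
    """Changes needed to turn an older snapshot into a newer one."""
    changed: Dict[int, bytes] = field(default_factory=dict)  # new or modified blocks
    removed: Set[int] = field(default_factory=set)           # blocks no longer present


def compute_delta(old: Snapshot, new: Snapshot) -> Delta:
    """Collect only the blocks that differ between the two snapshots."""
    changed = {blk: data for blk, data in new.items() if old.get(blk) != data}
    removed = set(old) - set(new)
    return Delta(changed=changed, removed=removed)


def apply_delta(base: Snapshot, delta: Delta) -> Snapshot:
    """Rebuild the newer snapshot from the older one plus the delta."""
    result = {blk: data for blk, data in base.items() if blk not in delta.removed}
    result.update(delta.changed)
    return result


# Example: only the modified, added, and removed blocks travel between sites.
snap1 = {0: b"aaaa", 1: b"bbbb", 2: b"cccc"}
snap2 = {0: b"aaaa", 1: b"BBBB", 3: b"dddd"}
delta = compute_delta(snap1, snap2)
restored = apply_delta(snap1, delta)
```

In this simplified model the receiving site must be able to interpret the delta in the same block format as the sender, which is the kind of assumption that tends not to hold between heterogeneous storage systems.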
In heterogeneous situations, the snapshots and/or deltas might conform to different, possibly incompatible formats. Some legacy techniques implement failover from a given source system to a different target system by using a one-way delta-based data migration, followed by a conversion from the source data format to the target data format. However, after operating on the target system for a certain time period, the disaster recovery (e.g., restoration or failback) from the target system to the source system might require a full data restoration back to the source system, resulting in an inefficient use of computing and/or storage resources. Some legacy approaches require specialized drivers for the operating systems of the VMs on the storage systems (e.g., to manage delta storage), which can be difficult to manage across the thousands of VMs that might comprise a given site. Further, certain legacy approaches might mandate that the same hypervisor modules (e.g., same vendor, same version, same configuration, etc.) be present on both the source system and the target system. Acceding to such a mandate increases the cost of deploying and maintaining the disaster recovery capability, and also decreases the flexibility of using the distributed storage systems in various use models.
What is needed is a technique or techniques to improve over legacy and/or over other considered approaches. Some of the approaches described in this background section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.