Shared storage is local storage at a single site that is shared among the compute servers at that site. Distributed storage is local storage at multiple sites that is assembled by storage virtualization hardware and software so that it can be shared among compute servers at multiple sites. Distributed storage differs from replicated storage in that all copies are active and can be written to simultaneously. Replicated storage, by contrast, is local storage at multiple sites (typically exactly two) configured as an active/standby pair in which only the active member can be modified, and in which replication hardware and/or software keeps the standby member in sync with the active member.
Distributed storage is a key feature of large-scale multi-site storage systems. It allows multiple data centers to access network storage media as if it were local storage, without the need to duplicate files on their individual computers. Such a storage device typically has multiple ports, or a means to identify and track multiple sessions on a single port. Cloud computing networks use virtual data centers comprising large numbers of virtual machines (VMs) that utilize virtual storage through server virtualization products, such as VMware's vSphere. This allows the system to store virtual machine disk images, including snapshots, and allows multiple servers to read and write the same file system simultaneously while individual virtual machine files are locked.
In present replicated storage systems, the migration of virtual machines between two virtual data centers typically requires that workloads on the virtual machines be inaccessible during the migration process. This represents an active/passive data storage system in which the data may be accessible to both data centers through the shared storage. However, data written by one data center is not accessible to the other data center until a certain period of time after the data is written (asynchronous replication), and until the passive copy is declared ‘active’, the active copy is declared ‘passive’, and the direction of replication between the two sites is reversed.
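The active/passive behavior described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the implementation of any particular replication product: the names `Site`, `ReplicatedPair`, `replicate`, and `failover` are hypothetical, and asynchronous replication is modeled simply as a queue of writes that have not yet reached the passive copy.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for the storage at one data center."""
    name: str
    role: str  # "active" or "passive"
    data: dict = field(default_factory=dict)


class ReplicatedPair:
    """Toy active/passive pair with asynchronous replication (illustrative only)."""

    def __init__(self, active: Site, passive: Site):
        self.active = active
        self.passive = passive
        self.pending = []  # writes accepted by the active copy, not yet replicated

    def write(self, site: Site, key: str, value: str) -> None:
        # Only the active member may be modified.
        if site is not self.active:
            raise PermissionError(f"{site.name} is passive and cannot be modified")
        site.data[key] = value
        self.pending.append((key, value))

    def replicate(self) -> None:
        # Drain queued writes to the passive copy; until this runs,
        # the passive site does not see them.
        for key, value in self.pending:
            self.passive.data[key] = value
        self.pending.clear()

    def failover(self) -> None:
        # Bring the passive copy up to date, then swap roles so that
        # replication now flows in the opposite direction.
        self.replicate()
        self.active, self.passive = self.passive, self.active
        self.active.role, self.passive.role = "active", "passive"
```

In this sketch, a write accepted at the active site is invisible at the passive site until `replicate()` runs, and the passive site can accept writes only after `failover()` has reversed the roles.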
Continuous-availability (CA) architectures improve on the high-availability model by allowing the data to be available at all times. Thus, a CA system can migrate active, running workloads between data centers without any downtime. This has enabled the advent of active/active storage presentation, in which a single active LUN (logical unit number) in a read/write state exists in two separate locations. In an active/active system, storage is shared by on-line and running servers at two locations on separate physical storage arrays. The VPLEX Metro cluster provided by EMC Corporation represents an active/active architecture that stretches simultaneous read/write access to storage across sites. This system presents a distributed storage volume to servers at multiple sites, effectively making the operating system (OS) or hypervisor layer believe that the active/active volumes distributed across the sites are local volumes. These distributed volumes can then be used to assemble compute servers from multiple sites into a single “stretched” cluster, such that the compute servers at both sites are members of a single virtualized compute cluster without any restrictions with regard to physical site boundaries. Virtualization technology thus allows the server OS to treat all cluster members as if they exist at the same site. VPLEX and other similar local and distributed storage federation platforms allow data stored on storage arrays to be accessed and shared locally, or within, between, and across data centers over local and metro distances. This creates a CA solution in which applications and data remain available through data and application movements, disasters, and data migrations.
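For contrast with the active/passive model, the externally visible behavior of an active/active distributed volume can be sketched as follows. This is a toy Python model only: the class `DistributedVolume` and its methods are hypothetical, and the sketch assumes that each write is applied to every site's copy before it returns. It does not describe how VPLEX or any other product actually keeps the copies coherent.

```python
class DistributedVolume:
    """Toy active/active volume: one logical volume writable at every site."""

    def __init__(self, site_names):
        # One copy of the volume per site, keyed by a hypothetical site name.
        self.copies = {name: {} for name in site_names}

    def write(self, site, block, value):
        if site not in self.copies:
            raise KeyError(f"unknown site: {site}")
        # Assumption: the write is mirrored to all copies before returning,
        # so every site observes it immediately.
        for copy in self.copies.values():
            copy[block] = value

    def read(self, site, block):
        # Each site reads from its own copy, as if the volume were local.
        return self.copies[site][block]
```

Unlike the active/passive case, a server at either site can write to the volume at any time, and a server at the other site reads the new data without any role reversal.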
Present cluster sizes supported by active/active systems are generally limited to metropolitan-scale distances, as with the EMC VPLEX Metro platform. In this case, the distances supported by the stretched cluster architecture are limited to on the order of 100 kilometers. This is sufficient for many intra-regional data center distributions, but is not adequate for more typical inter-regional distributions, such as when data centers are physically located in different states or different countries, or for any two data centers that are more than 100 kilometers apart. In such cases, data migration between the two distant data centers must revert to the active/passive architecture.
What is needed, therefore, is a way to implement active/active data migration for virtual data centers over very long distances, such as in excess of 100 kilometers.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions. EMC VPLEX and VPLEX Metro are trademarks of EMC Corporation; vSphere Metro Storage Cluster and vMotion are trademarks of VMware Corporation.