The invention is generally related to data centers, and more particularly to operation of federated data centers with distributed clusters and volumes.
Data centers can be utilized by enterprises to provide a wide variety of services over a network. A data center typically includes clusters of host devices and data storage subsystems. Each data storage subsystem includes at least one storage array having multiple physical storage devices which can be organized as logical volumes. The host devices, which are typically servers, may support clustered applications which provide services to clients by utilizing the data storage capabilities of the storage array. Further, one or more of the host devices may each support multiple virtualized servers (a.k.a. virtual machines or VMs) which run applications.
Various technologies may be implemented to facilitate data center disaster recovery. For example, RAID can be implemented locally within the data center in order to avoid service interruptions in the case of failure of a physical storage device, and remote site mirroring may be implemented so that data is backed up at a different data center at a different geographical location, in order to avoid data loss in the case of natural disasters. Further, virtual machines or applications associated with a failed host can fail over (restart) on another host in the cluster.
While it is well known for multiple data centers to coordinate in support of disaster recovery operations, until recently the assets of different data centers were not integrated in a manner which supported normal operations. EMC VPLEX differs from such typical prior art systems because it enables federation of information across multiple data centers such that hosts and volumes located in different data centers function as if located in the same data center, at least from the perspective of a client application. Further, such federation is practical even where the distance between the data centers is so great that synchronous write IOs would result in unacceptable delay, because in at least some configurations the system has active/active asynchronous capability. For example, a volume can be shared by two VMs or physical nodes located at different data centers at distances typically associated with asynchronous topologies.
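The relationship between inter-site distance and synchronous write delay can be illustrated with a minimal sketch. The sketch is not part of any described system. It assumes that light propagates through optical fiber at roughly 200,000 km/s (about 5 microseconds per kilometer in one direction) and that a synchronous write is not acknowledged to the host until the remote site confirms it. The function name, the constant, and the `round_trips` parameter are hypothetical and chosen for illustration only; switching, queuing, and storage service times are ignored.

```python
# Illustrative sketch only; all names here are hypothetical.

# Assumption: roughly 5 microseconds of one-way propagation delay per
# kilometer of optical fiber (light at about 200,000 km/s in glass).
FIBER_DELAY_US_PER_KM = 5.0


def sync_write_delay_ms(distance_km: float, round_trips: int = 1) -> float:
    """Return the minimum propagation delay (ms) added to one synchronous write.

    A synchronous write waits for the remote site to confirm the data, so
    each write pays at least one full round trip between the data centers.
    An asynchronous write is acknowledged locally and does not wait on
    this delay.
    """
    if distance_km < 0 or round_trips < 1:
        raise ValueError("distance_km must be >= 0 and round_trips >= 1")
    one_way_us = distance_km * FIBER_DELAY_US_PER_KM
    # Two one-way trips per round trip; convert microseconds to milliseconds.
    return 2 * one_way_us * round_trips / 1000.0


if __name__ == "__main__":
    for km in (100, 2000):
        print(f"{km} km: at least {sync_write_delay_ms(km)} ms per synchronous write")
```

Under these assumptions, data centers 100 km apart add at least 1 ms to each synchronous write, whereas data centers 2,000 km apart add at least 20 ms, which suggests why asynchronous operation is typically preferred at greater distances.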