In current storage networks, and particularly storage networks including geographically distributed directors (or nodes) and storage resources, conserving bandwidth between resources and directors while providing optimized data availability and access is highly desirable. Data access may be localized, in part, to improve access speed to pages requested by host devices. Caching pages at directors provides localization; however, it is desirable that the cached data be kept coherent with respect to modifications at other directors that may be caching the same data. An example of a system for providing distributed cache coherence is described in U.S. Pat. No. 7,975,018 to Unrau et al., entitled “Systems and Methods for Providing Distributed Cache Coherency,” which is incorporated herein by reference. Other systems and techniques for managing and sharing storage array functions among multiple storage groups in a storage network are described, for example, in U.S. Pat. No. 7,266,706 to Brown et al. entitled “Methods and Systems for Implementing Shared Disk Array Management Functions,” which is incorporated herein by reference.
Data transfer among storage devices, including transfers for data replication or mirroring functions, may involve various data synchronization processing and techniques to provide reliable protection copies of data among a source site and a destination site. In synchronous transfers, data may be transmitted to a remote site and an acknowledgement of a successful write is transmitted synchronously with the completion thereof. In asynchronous transfers, a data transfer process may be initiated and a data write may be acknowledged before the data is actually transferred to directors at the remote site. Asynchronous transfers may occur in connection with sites located geographically distant from each other. Asynchronous distances may be distances in which asynchronous transfers are used because synchronous transfers would take more time than is preferable or desired. Particularly for asynchronous transfers, it is desirable to maintain a proper ordering of writes such that any errors or failures that occur during data transfer may be properly identified and addressed such that, for example, incomplete data writes may be reversed or rolled back to a consistent data state as necessary.
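The difference between the two acknowledgement models can be illustrated with a minimal sketch. The class and method names below (RemoteSite, Replicator, write_sync, write_async, drain) are hypothetical and are not taken from any referenced system; the sketch assumes a single writer and an in-memory queue, and uses a first-in, first-out queue as one simple way of preserving write order.

```python
from collections import deque


class RemoteSite:
    """Hypothetical stand-in for the directors at a destination site."""

    def __init__(self):
        self.blocks = {}

    def apply(self, address, data):
        self.blocks[address] = data


class Replicator:
    """Illustrative source-site replicator (assumed design, single writer)."""

    def __init__(self, remote):
        self.remote = remote
        self.local = {}
        self.pending = deque()  # FIFO queue keeps writes in acknowledgement order

    def write_sync(self, address, data):
        # Synchronous: the remote copy is updated before the write is acknowledged.
        self.local[address] = data
        self.remote.apply(address, data)
        return "ack"

    def write_async(self, address, data):
        # Asynchronous: acknowledge immediately; the transfer happens later.
        self.local[address] = data
        self.pending.append((address, data))
        return "ack"

    def drain(self):
        # Transfer queued writes in the same order in which they were acknowledged.
        while self.pending:
            address, data = self.pending.popleft()
            self.remote.apply(address, data)


remote = RemoteSite()
rep = Replicator(remote)
rep.write_sync(0, "a")
rep.write_async(1, "b")
rep.write_async(1, "c")
before_drain = dict(remote.blocks)  # remote has only the synchronous write
rep.drain()
after_drain = dict(remote.blocks)   # remote now reflects the last write to address 1
```

In this sketch, a failure before drain() completes would leave the remote site without the acknowledged asynchronous writes, which is why ordering and roll-back to a consistent state matter for asynchronous transfers.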
Reference is made, for example, to U.S. Pat. No. 7,475,207 to Bromling et al. entitled “Maintaining Write Order Fidelity on a Multi-Writer System,” which is incorporated herein by reference and which discusses features for maintaining write order fidelity (WOF) in an active/active system in which a plurality of directors (i.e., controllers and/or access nodes) at geographically separate sites can concurrently read and/or write data in a distributed data system. Discussions of data ordering techniques for synchronous and asynchronous data replication processing for other types of systems, including types of remote data facility (RDF) systems produced by EMC Corporation of Hopkinton, Mass., may be found, for example, in U.S. Pat. No. 7,613,890 to Meiri, entitled “Consistent Replication Across Multiple Storage Devices,” U.S. Pat. No. 7,054,883 to Meiri et al., entitled “Virtual Ordered Writes for Multiple Storage Devices,” and U.S. Pat. No. 8,335,899 to Meiri et al., entitled “Active/Active Remote Synchronous Mirroring,” which are all incorporated herein by reference.
In some instances, it is desirable to provide a point-in-time image of a logical volume. An example of a logical point-in-time image of the volume may be a data storage snapshot copy that may be obtained relatively quickly and without significant overhead by creating a data structure initially containing pointers that point to sections of the logical volume. A data storage snapshot does not replicate a full copy of the data set (referred to as a production data set). Rather, the data storage snapshot only stores differences between a current version of the production data set and the version of the data set at the point in time when the snapshot was taken. There are many different specific mechanisms for providing snapshot copies; see, for example, U.S. Pat. No. 7,340,489 to Vishlitzky, et al., entitled “Virtual Storage Devices,” and U.S. Pat. No. 6,792,518 to Armangau et al., entitled “Data Storage System Having Meta Bit Maps for Indicating Whether Data Blocks are Invalid in Snapshot Copies,” which are both incorporated by reference herein. It is noted that although the term “snapshot” is principally used herein, the system described herein applies to any appropriate point-in-time image.
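One common way to realize a snapshot that stores only differences is copy-on-first-write, sketched below. This is an illustrative assumption and not necessarily the mechanism of any of the referenced patents; the Volume and Snapshot classes and their methods are hypothetical names. The snapshot starts empty, reads through to the production volume, and saves the original contents of a block only the first time that block is overwritten.

```python
class Snapshot:
    """Point-in-time view that stores only blocks changed since it was taken."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> original data, filled on first overwrite

    def preserve(self, index):
        # Save the point-in-time contents only once per block.
        if index not in self.saved:
            self.saved[index] = self.volume.blocks[index]

    def read(self, index):
        # Unchanged blocks are read through to the production volume.
        if index in self.saved:
            return self.saved[index]
        return self.volume.blocks[index]


class Volume:
    """Hypothetical production volume modeled as a list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # Creating a snapshot copies no data, so it is fast.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Preserve the old contents for every snapshot before overwriting.
        for snap in self.snapshots:
            snap.preserve(index)
        self.blocks[index] = data


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
vol.write(1, "BB")
snap_view = [snap.read(i) for i in range(3)]  # the volume as of snapshot time
```

After the two writes, the snapshot holds a single saved block while the production volume holds the latest data, so the storage cost of the snapshot grows with the number of changed blocks rather than with the size of the volume.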
In a virtualized environment, a centralized management infrastructure, which may be referred to as a virtual center, may provide a central point of control for managing, monitoring, provisioning and migrating virtual machines. A virtual machine (VM) is a software implementation of a machine that executes programs like a physical machine. Virtualization software allows multiple VMs with separate operating systems to run in isolation on the same physical server. Each VM may have its own set of virtual hardware (e.g., RAM, CPU, NIC, etc.) upon which an operating system and applications are loaded. The operating system may see a consistent, normalized set of hardware regardless of the actual physical hardware components. The virtual center may operate to control virtual machines in data centers and, for example, in connection with cloud computing. The virtual center may further include a virtual data center that provides logical control and management of data storage in a data center, and provides for sub-dividing contents of virtual components into compute resources, network resources and storage resources.
Configuring and deploying VMs is known in the field of computer science. For example, U.S. Pat. No. 7,577,722 to Khandekar, et al., entitled “Provisioning of Computer Systems Using Virtual Machines,” which is incorporated herein by reference, discloses techniques for configuring and deploying a VM according to user specifications. VMs may be provisioned with respect to any appropriate resource, including, for example, storage resources, CPU processing resources and/or memory. Operations of VMs may include using virtual machine images. A VM image may be a point-in-time image or snapshot of the state of the virtual machine as it resides in the host's memory. The VM image may be obtained for an operating VM and transferred to another location where the VM continues execution from the state defined by the virtual machine image. In this way, the VM image may be a snapshot (a VM snapshot) of an execution state of a program by a VM that may be moved between different locations and processing thereafter continued without interruption. Reference is made to U.S. Pat. No. 8,667,490 B1 to van der Goot, entitled “Active/Active Storage and Virtual Machine Mobility Over Asynchronous Distances,” which is incorporated herein by reference.
Continuous snapshotting (CS) refers to a process of taking snapshots of any content change in a storage system. In connection with the content being user data, the process may be referred to as continuous data protection (CDP). In a CS/CDP implementation, individual writes to storage are duplicated and stored in a log of activity in one or more journal devices. By replaying these writes in reverse, storage may be “rolled back” (a roll-back) to any past state which was covered by the logs. This may be done on production storage, or in a duplicate copy of the storage to avoid disruption to users of the production storage. In the latter case, when access to historic data is no longer required, the log may be replayed again in forward order (a roll-forward) to restore the duplicate to the production state and possibly including logged writes that occurred since roll-back. An example of a product that provides continuous data protection with multiple recovery points to restore applications instantly to a specific point in time is RecoverPoint by EMC Corporation of Hopkinton, Mass.
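The roll-back and roll-forward operations described above can be sketched with a simple journal. The sketch assumes that each journal entry records the prior contents of the written location along with the new contents, so that the write can be undone; the JournaledStore class and its methods are hypothetical, and the sketch further assumes that stored data is never None, since None is used to mark a location that did not previously exist.

```python
class JournaledStore:
    """Illustrative store whose every write is logged for later replay."""

    def __init__(self):
        self.blocks = {}
        self.journal = []   # entries: (address, old_value, new_value)
        self.position = 0   # number of journal entries currently applied

    def write(self, address, data):
        # New writes are accepted only when the store is at the head of the log.
        if self.position != len(self.journal):
            raise RuntimeError("roll forward before accepting new writes")
        self.journal.append((address, self.blocks.get(address), data))
        self.blocks[address] = data
        self.position += 1

    def roll_back(self, target):
        # Replay logged writes in reverse, restoring each prior value.
        while self.position > target:
            self.position -= 1
            address, old, _new = self.journal[self.position]
            if old is None:
                self.blocks.pop(address, None)
            else:
                self.blocks[address] = old

    def roll_forward(self, target=None):
        # Replay logged writes in forward order up to the target position.
        if target is None:
            target = len(self.journal)
        while self.position < target:
            address, _old, new = self.journal[self.position]
            self.blocks[address] = new
            self.position += 1


store = JournaledStore()
store.write("x", 1)
store.write("y", 2)
store.write("x", 3)
store.roll_back(1)            # state after the first write only
past = dict(store.blocks)
store.roll_forward()          # return to the latest logged state
present = dict(store.blocks)
```

Here the journal position plays the role of a recovery point: any position between zero and the length of the journal identifies a past state that can be reached by replay.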
Content protected by point-in-time images, such as snapshots, e.g., in connection with continuous snapshotting techniques, may be extended to include not only user data but also configuration metadata and/or other appropriate configuration information of the state of an associated VM (VM snapshots). The VM snapshots may include information used for configuring volumes, storage devices, consistency groups and/or other appropriate storage management system elements, as further discussed elsewhere herein. A user may want to roll-back a storage management system to a past state of a VM due to performance or stability issues attributed to configuration changes.
For further discussion of techniques for providing continuous data protection, reference is made, for example, to U.S. Pat. No. 8,046,545 to Meiri et al., entitled “Continuous Backup,” which discloses a system for providing continuous backup of a storage device and restoring the storage device to prior states; U.S. Pat. No. 7,558,926 to Oliveira et al., entitled “Continuous Data Backup Using Distributed Journaling,” which discloses techniques for providing continuous data backups of primary storage using distributed journals; and U.S. Pat. No. 7,840,595 to Blitzer et al., entitled “Techniques for Determining An Implemented Data Protection Policy,” which discloses features of determining a data protection method in accordance with a facility and replication type associated with each of one or more selected recovery points of one or more storage objects. The above-noted references are incorporated herein by reference.
Users of storage management systems may make use of snapshot products, and/or other point-in-time data copy products, to establish a line of “history” for all the user data that flows through the system and/or for VM snapshots for past states of one or more associated VMs. Among multiple storage sites, it may be advantageous to have consistent snapshots across the multiple sites. Techniques may be provided for consistent snapshots that include creating a snapshot only on one site of the cluster and/or suspending I/Os on all the sites of the cluster and creating snapshots simultaneously on all the storage arrays. These techniques, however, may suffer certain disadvantages in terms of timeliness of snapshot consistency and/or in delays of system operations.
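The second of the techniques noted above, suspending I/Os on all sites and then creating snapshots on every site, can be sketched as follows. The Site class and the cluster_snapshot function are hypothetical names, the sketch is single-threaded, and a snapshot is modeled simply as a copy of each site's blocks; a real system would coordinate the suspension across a network.

```python
class Site:
    """Hypothetical storage site that can hold writes while suspended."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.suspended = False
        self.held = []  # writes delayed while I/O is suspended

    def write(self, address, data):
        if self.suspended:
            self.held.append((address, data))
        else:
            self.blocks[address] = data

    def suspend(self):
        self.suspended = True

    def resume(self):
        # Apply the delayed writes in arrival order once I/O resumes.
        self.suspended = False
        held, self.held = self.held, []
        for address, data in held:
            self.write(address, data)


def cluster_snapshot(sites):
    # Suspend I/O everywhere, capture every site, then resume I/O.
    for site in sites:
        site.suspend()
    try:
        return {site.name: dict(site.blocks) for site in sites}
    finally:
        for site in sites:
            site.resume()


a, b = Site("A"), Site("B")
a.write(0, "x")
b.write(0, "x")
images = cluster_snapshot([a, b])

# A write arriving during a suspension is delayed until the site resumes.
a.suspend()
a.write(1, "y")
held_count = len(a.held)
a.resume()
```

The held write in the last few lines illustrates the cost of this approach: every host write that arrives while the cluster is suspended waits until all sites have been captured.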
Accordingly, it would be desirable to provide a system that addresses the above-noted problems and efficiently and effectively provides for creating consistent cluster-wide snapshots in a virtualized environment.