Host processor systems may store and retrieve data using storage devices containing a plurality of host interface units (host adapters), disk drives, and disk interface units (disk adapters). Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek, which are incorporated herein by reference. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels of the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical volumes. Different sections of the logical volumes may or may not correspond to the actual disk drives.
Information Lifecycle Management (ILM) concerns the management of data throughout the data's lifecycle. The value of data may change over time and, accordingly, the needs for the storage and accessibility of the data may change during the lifecycle of the data. For example, data that is initially accessed often may, over time, become less valuable, and the need to access that data may become less frequent. It may not be efficient for such infrequently accessed data to be stored on a fast and expensive storage device. On the other hand, older data may suddenly become more valuable and, where once accessed infrequently, become more frequently accessed. In this case, it may not be efficient for such data to be stored on a slower storage system when data access frequency increases.
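By way of illustration only, the ILM placement decision described above may be sketched as follows. The tier names ("fast", "slow"), the access-rate threshold, and the function names are hypothetical and are not drawn from any particular product; the sketch assumes that a per-item access rate is already available.

```python
# Illustrative sketch only: tier names and the threshold are assumptions.

def choose_tier(accesses_per_day, hot_threshold=100):
    """Return the tier suited to data with the given access rate."""
    return "fast" if accesses_per_day >= hot_threshold else "slow"


def plan_moves(placement, access_rates, hot_threshold=100):
    """Return {item: (current_tier, target_tier)} for items on the wrong tier.

    placement    -- maps an item name to the tier it currently occupies
    access_rates -- maps an item name to its recent accesses per day
    """
    moves = {}
    for item, current in placement.items():
        # Items with no recorded accesses are treated as never accessed.
        target = choose_tier(access_rates.get(item, 0), hot_threshold)
        if target != current:
            moves[item] = (current, target)
    return moves
```

In such a sketch, data whose access rate has dropped would be listed for demotion to the slower tier, and older data that has become frequently accessed again would be listed for promotion to the faster tier.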
In some instances, it may be desirable to copy data from one storage device to another. For example, if a host writes data to a first storage device, it may be desirable to copy that data to a second storage device provided in a different location so that if a disaster occurs that renders the first storage device inoperable, the host (or another host) may resume operation using the data of the second storage device. Such a capability is provided, for example, by a Remote Data Facility (RDF) product provided by EMC Corporation of Hopkinton, Mass., e.g., Symmetrix Remote Data Facility (SRDF). With RDF, a first storage device, denoted the “primary storage device” (or “R1”), is coupled to the host. One or more other storage devices, called “secondary storage devices” (or “R2”), receive copies of the data that is written to the primary storage device by the host. The host interacts directly with the primary storage device, but any data changes made to the primary storage device are automatically provided to the one or more secondary storage devices using RDF. The primary and secondary storage devices may be connected by a data link, such as an ESCON link, a Fibre Channel link, and/or a Gigabit Ethernet link. The RDF functionality may be facilitated with an RDF adapter (RA) provided at each of the storage devices.
There may be a number of different types of RDF transmission. Synchronous RDF mode allows synchronous data transfer where, after an initial data write from a host to a primary storage device, the data is transferred from the primary storage device to a secondary storage device using RDF. Receipt of the data is acknowledged by the secondary storage device to the primary storage device, which then provides a write acknowledge back to the host for the initial data write. Another possibility for RDF transmission is to have the host write data to the primary storage device and have the primary storage device copy data asynchronously to the secondary storage device in the background. One product using asynchronous replication techniques is provided by EMC Corporation and known as SRDF/A, in which data sets are transferred to the secondary array at defined intervals. Using SRDF/A, data is copied from one storage array to another in chunks that are assigned sequence numbers based on when the data was written by the host. For further discussion of SRDF/A systems and techniques, see U.S. Pat. Nos. 7,000,086 to Meiri, et al., entitled “Virtual Ordered Writes,” and 7,054,883 to Meiri, et al., entitled “Virtual Ordered Writes for Multiple Storage Devices,” which are both incorporated herein by reference.
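The difference between the two transmission modes described above may be illustrated by the following simplified sketch. It is a toy model and not the SRDF or SRDF/A implementation: the class names, method names, and the use of an in-memory dictionary for each device are assumptions made purely for illustration. In the synchronous path, the host's write is acknowledged only after the secondary device acknowledges receipt. In the asynchronous path, the write is acknowledged immediately and collected into a chunk tagged with the current sequence number; closed chunks are later sent to the secondary device in sequence order.

```python
from dataclasses import dataclass, field


@dataclass
class Secondary:
    """Toy R2 device: stores blocks and acknowledges receipt."""
    blocks: dict = field(default_factory=dict)

    def receive(self, writes):
        self.blocks.update(writes)
        return True  # acknowledgement back to the primary


@dataclass
class Primary:
    """Toy R1 device supporting both transfer modes (hypothetical API)."""
    secondary: Secondary
    blocks: dict = field(default_factory=dict)
    cycle: int = 0  # sequence number of the chunk now collecting writes
    pending: dict = field(default_factory=dict)  # sequence number -> chunk

    def write_sync(self, addr, data):
        # Synchronous mode: the host is acknowledged only after R2 acknowledges.
        self.blocks[addr] = data
        return self.secondary.receive({addr: data})

    def write_async(self, addr, data):
        # Asynchronous mode: acknowledge at once, queue the write in the
        # chunk for the current sequence number.
        self.blocks[addr] = data
        self.pending.setdefault(self.cycle, {})[addr] = data
        return True

    def switch_cycle(self):
        # Close the current chunk; later writes get the next sequence number.
        self.cycle += 1

    def drain(self):
        # Send every closed chunk to R2 in ascending sequence order.
        for seq in sorted(s for s in self.pending if s < self.cycle):
            self.secondary.receive(self.pending.pop(seq))
```

In this sketch, a write made with `write_async` does not appear on the secondary device until `switch_cycle` has closed its chunk and `drain` has transmitted it, which corresponds to the transfer of data sets at defined intervals.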
In connection with data replication using RDF systems, one issue that may occur is discrepancies in data storage management between R1 and R2 devices when ILM techniques are used. For example, data that is accessed frequently on an R1 device may be stored and managed at a location on the R1 device that is suitable for the need for frequent access of that data. However, when replicated to the R2 device, that same data, existing as a data backup copy, may not be accessed as frequently. Accordingly, the data on the R2 device, although being a copy of the R1 data, may be stored and managed differently on the R2 device than on the R1 device. In situations of failover to the R2 device, or other uses for the R2 device, the R2 device may not immediately be able to support the workload as the new primary device because the data copy stored thereon may not be stored as efficiently or effectively as on the R1 device. Transferring all information between the R1 and R2 devices during normal operation to maintain the same ILM storage management on each of the devices may not be a practical solution due to the amount of information transfer that this would require, among other reasons.
Accordingly, it would be desirable to provide a system that allows for the efficient management of data in a storage system using data replication techniques among multiple storage devices.