1. Technical Field
This application relates to computer storage devices, and more particularly to the field of transferring data between storage devices.
2. Description of Related Art
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units (host adapters), disk drives, and disk interface units (disk adapters). Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access the single storage device unit allows the host systems to share data stored therein.
In some instances, it may be desirable to copy data from one storage device to another. For example, if a host writes data to a first storage device, it may be desirable to copy that data to a second storage device provided in a different location so that if a disaster occurs that renders the first storage device inoperable, the host (or another host) may resume operation using the data of the second storage device. Such a capability is provided, for example, by the Remote Data Facility (RDF) product provided by EMC Corporation of Hopkinton, Mass. With RDF, a first storage device, denoted the “primary storage device” (or “R1” or “local storage device”), is coupled to the host. One or more other storage devices, called “secondary storage devices” (or “R2” or “remote storage device”), receive copies of the data that is written to the primary storage device by the host. The host interacts directly with the primary storage device, but any data changes made to the primary storage device are automatically provided to the one or more secondary storage devices using RDF. The primary and secondary storage devices may be connected by a data link, such as an ESCON link, a Fibre Channel link, and/or a Gigabit Ethernet link. The RDF functionality may be facilitated with an RDF adapter (RA) provided at each of the storage devices.
RDF allows synchronous data transfer where, after data written from a host to a primary storage device is transferred from the primary storage device to a secondary storage device using RDF, receipt is acknowledged by the secondary storage device to the primary storage device which then provides a write acknowledge back to the host. Thus, in synchronous mode, the host does not receive a write acknowledge from the primary storage device until the RDF transfer to the secondary storage device has been completed and acknowledged by the secondary storage device.
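The ordering of acknowledgements in synchronous mode can be illustrated with a minimal sketch. The class and method names (`PrimaryDevice`, `SecondaryDevice`, `host_write`, `receive`) and the `events` log are hypothetical and are not part of RDF; the sketch only assumes the sequence described above, in which the host's write acknowledge is withheld until the secondary storage device has acknowledged the transfer.

```python
events = []  # Hypothetical log used only to show the order of operations.


class SecondaryDevice:
    def __init__(self):
        self.data = {}

    def receive(self, address, value):
        # Store the transferred data, then acknowledge receipt to the primary.
        self.data[address] = value
        events.append("secondary acknowledges transfer")
        return True


class PrimaryDevice:
    def __init__(self, secondary):
        self.data = {}
        self.secondary = secondary

    def host_write(self, address, value):
        # Store the host's data on the primary storage device.
        self.data[address] = value
        events.append("primary stores write")
        # Transfer to the secondary and wait for its acknowledgement.
        acked = self.secondary.receive(address, value)
        # Only now is the write acknowledge returned to the host.
        if acked:
            events.append("primary acknowledges host")
        return acked


secondary = SecondaryDevice()
primary = PrimaryDevice(secondary)
host_ack = primary.host_write("track-7", b"payload")
```

In this sketch the call to `host_write` does not return until `receive` has returned, which is the source of the added write latency in synchronous mode.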
A drawback to the synchronous RDF system is that the latency of each of the write operations is increased by waiting for the acknowledgement of the RDF transfer. This problem is worse when there is a long distance between the primary storage device and the secondary storage device; because of transmission delays, the time delay required for making the RDF transfer and then waiting for an acknowledgement back after the transfer is complete may be unacceptable.
It is also possible to have the host write data to the primary storage device and have the primary storage device copy data to the secondary storage device in an asynchronous background process. The background process cycles through each of the tracks of the primary storage device sequentially and, when it is determined that a particular block has been modified since the last time that block was copied, the block is transferred from the primary storage device to the secondary storage device. Although this mechanism may attenuate the latency problem associated with synchronous data transfer, a difficulty still exists because data consistency between the primary and secondary storage devices cannot be guaranteed. If there are problems, such as a failure of the primary system, the secondary system may end up with out-of-order changes that make the data unusable.
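The consistency problem can be seen in a small sketch. The function `background_copy`, the `fail_after` parameter, and the four-track layout are assumptions made for illustration only. The host performs two writes in order, the second depending on the first, but the sequential scan reaches the later write first and the primary fails before the earlier write is copied.

```python
def background_copy(primary, secondary, modified, fail_after=None):
    """Scan tracks in order and copy the modified ones.

    fail_after is a hypothetical knob: the primary "fails" once that many
    tracks have been copied, ending the scan early.
    """
    copied = 0
    for track in range(len(primary)):
        if fail_after is not None and copied >= fail_after:
            break  # Simulated failure of the primary system.
        if track in modified:
            secondary[track] = primary[track]
            modified.discard(track)
            copied += 1


primary = [None] * 4
secondary = [None] * 4
modified = set()

# The host writes to track 3 first, then to track 0.
primary[3] = "write-1"
modified.add(3)
primary[0] = "write-2 (depends on write-1)"
modified.add(0)

# The scan visits track 0 before track 3, then the primary fails.
background_copy(primary, secondary, modified, fail_after=1)
```

After the simulated failure, the secondary holds the second write without the first, which is the out-of-order state described above.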
A proposed solution to this problem is the Symmetrix Automated Replication (SAR) process, which is described in U.S. Pat. No. 7,117,386 titled “SAR RESTART AND GOING HOME PROCEDURES” to LeCrone, et al. and in U.S. Pat. No. 7,024,528 titled “STORAGE AUTOMATED REPLICATION PROCESSING” to LeCrone, et al., both of which are incorporated by reference herein. The SAR uses devices (Business Continuance Volumes, or BCV's) that can mirror standard logical devices. A BCV device can also be split from its standard logical device after being mirrored and can be resynced (i.e., reestablished as a mirror) to the standard logical device after being split. In addition, a BCV can be remotely mirrored using RDF, in which case the BCV may propagate data changes made thereto (while the BCV is acting as a mirror) to the BCV remote mirror when the BCV is split from the corresponding standard logical device.
However, using the SAR process requires the significant overhead of continuously splitting and resyncing the BCV's. The SAR process also uses host control and management, which relies on the controlling host being operational. In addition, the cycle time for a practical implementation of a SAR process is on the order of twenty to thirty minutes, and thus the amount of data that may be lost when an RDF link and/or primary device fails could be twenty to thirty minutes' worth of data.
Another solution is to use SRDF/A data transfer (also sometimes referred to as “virtual ordered writes”), where data is accumulated in order-independent chunks at the primary storage device which are transferred, a chunk at a time, to the secondary storage device. Each chunk is provided with a sequence number that is associated with data as the data is transferred from the primary storage device to the secondary storage device. Data that is received by the secondary storage device is not committed (stored) by the secondary storage device unless and until data for the entire chunk has been received from the primary storage device, as indicated by a message provided by the primary storage device to the secondary storage device. The SRDF/A mechanism is described, for example, in U.S. Pat. No. 7,054,883 titled “VIRTUAL ORDERED WRITES FOR MULTIPLE STORAGE DEVICES” to Meiri, et al., which is incorporated by reference herein.
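The commit rule for chunks can be sketched as follows. The `ChunkReceiver` class and its `receive` and `commit` methods are hypothetical names, and the sketch is not the SRDF/A implementation. It only assumes what is stated above: data arrives tagged with a chunk sequence number and is held uncommitted until the primary indicates that the entire chunk has been sent.

```python
class ChunkReceiver:
    """Hypothetical model of the secondary storage device's chunk handling."""

    def __init__(self):
        self.committed = {}  # Data that has been committed (stored).
        self.pending = {}    # Sequence number -> {address: value}, uncommitted.

    def receive(self, seq, address, value):
        # Hold incoming data under its chunk's sequence number.
        self.pending.setdefault(seq, {})[address] = value

    def commit(self, seq):
        # Called when the primary's message says chunk `seq` is complete.
        self.committed.update(self.pending.pop(seq, {}))


r2 = ChunkReceiver()
r2.receive(1, "track-3", "a")
r2.receive(1, "track-9", "b")
before_commit = dict(r2.committed)  # Nothing is committed yet.
r2.commit(1)
after_commit = dict(r2.committed)
```

Because a chunk is committed only as a whole, the order in which writes arrive within a chunk does not affect the committed state.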
A drawback to SRDF/A is that, if communication between the primary storage device and the secondary storage device is broken, then data for any chunk that was partially sent needs to be discarded by the secondary storage device. After communication is reestablished, data for the entire chunk will need to be resent, including data from the partial chunk that had been previously sent. In some cases, the overhead associated with determining what needs to be sent is significant. Accordingly, it is desirable to provide an SRDF/A system that simplifies the determination of what needs to be resent following a loss and subsequent reestablishment of communication between the primary and secondary storage devices.
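The cost of a broken link can be illustrated with a short sketch. The function `send_chunk` and its `link_fails_after` parameter are hypothetical and model only the behavior described above: a partially received chunk is discarded by the secondary storage device, and the entire chunk is sent again once communication is reestablished.

```python
def send_chunk(chunk, pending, link_fails_after=None):
    """Send the writes of one chunk into the secondary's pending buffer.

    Returns (number of writes transmitted, whether the chunk completed).
    link_fails_after is a hypothetical knob that breaks the link after
    that many writes have been transmitted.
    """
    sent = 0
    for address, value in chunk.items():
        if link_fails_after is not None and sent >= link_fails_after:
            pending.clear()  # The secondary discards the partial chunk.
            return sent, False
        pending[address] = value
        sent += 1
    return sent, True


chunk = {"track-1": "a", "track-2": "b", "track-3": "c"}
pending, committed = {}, {}

# First attempt: the link breaks after two of the three writes.
sent_first, ok_first = send_chunk(chunk, pending, link_fails_after=2)

# After communication is reestablished, the whole chunk is resent.
sent_second, ok_second = send_chunk(chunk, pending)
if ok_second:
    committed.update(pending)
    pending.clear()

total_transmitted = sent_first + sent_second
```

In this example five writes cross the link in order to commit a three-write chunk. The sketch does not model the separate overhead of determining what needs to be resent.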