Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units (host adapters), disk drives, and disk interface units (disk adapters). Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek, which are incorporated herein by reference. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access the single storage device unit allows the host systems to share data stored therein.
In some instances, it may be desirable to copy data from one storage device to another. For example, if a host writes data to a first storage device, it may be desirable to copy that data to a second storage device provided in a different location so that if a disaster occurs that renders the first storage device inoperable, the host (or another host) may resume operation using the data of the second storage device. Such a capability is provided, for example, by the Remote Data Facility (RDF) product provided by EMC Corporation of Hopkinton, Mass. With RDF, a first storage device, denoted the “primary storage device” (or “R1”) is coupled to the host. One or more other storage devices, called “secondary storage devices” (or “R2”) receive copies of the data that is written to the primary storage device by the host. The host interacts directly with the primary storage device, but any data changes made to the primary storage device are automatically provided to the one or more secondary storage devices using RDF. The primary and secondary storage devices may be connected by a data link, such as an ESCON link, a Fibre Channel link, and/or a Gigabit Ethernet link. The RDF functionality may be facilitated with an RDF adapter (RA) provided at each of the storage devices.
RDF allows synchronous data transfer where, after data written from a host to a primary storage device is transferred from the primary storage device to a secondary storage device using RDF, receipt is acknowledged by the secondary storage device to the primary storage device which then provides a write acknowledge back to the host. Thus, in synchronous mode, the host does not receive a write acknowledge from the primary storage device until the RDF transfer to the secondary storage device has been completed and acknowledged by the secondary storage device. A drawback to the synchronous RDF system is that the latency of each of the write operations is increased by waiting for the acknowledgement of the RDF transfer. This problem is worse when there is a long distance between the primary storage device and the secondary storage device; because of transmission delays, the time delay required for making the RDF transfer and then waiting for an acknowledgement back after the transfer is complete may be unacceptable.
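The effect of distance on synchronous latency may be illustrated with a simplified model. The following sketch is not part of RDF. The function names, the service times, and the figure of roughly 200 km of optical fiber traversed per millisecond are assumptions chosen only for illustration.

```python
def one_way_delay_ms(distance_km, km_per_ms=200.0):
    """Propagation delay over the link (assumes about 200 km per ms in fiber)."""
    return distance_km / km_per_ms


def sync_write_latency_ms(local_ms, link_one_way_ms, remote_ms):
    """Host-visible latency of one synchronous write in this simplified model.

    The data is saved at the primary, sent to the secondary, saved there,
    and the acknowledgement then travels back before the host is answered.
    """
    return local_ms + 2 * link_one_way_ms + remote_ms


# Hypothetical service times of 1 ms at each storage device.
nearby = sync_write_latency_ms(1.0, one_way_delay_ms(10), 1.0)
distant = sync_write_latency_ms(1.0, one_way_delay_ms(1000), 1.0)
print(nearby, distant)
```

Under these assumptions, the link round trip dominates the latency of each write once the storage devices are far apart, which is the drawback described above.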
One way to address the latency issue is to have the host write data to the primary storage device in asynchronous mode and have the primary storage device copy data to the secondary storage device in the background, for example, as provided by a Symmetrix Remote Data Facility/Asynchronous (SRDF/A) product by EMC Corporation. The background copy involves cycling through each of the tracks of the primary storage device sequentially and, when it is determined that a particular block has been modified since the last time that block was copied, the block is transferred from the primary storage device to the secondary storage device. Although this mechanism may attenuate the latency problem associated with synchronous and semi-synchronous data transfer modes, a difficulty still exists because there cannot be a guarantee of data consistency between the primary and secondary storage devices. If there are problems, such as a failure of the primary system, the secondary system may end up with out-of-order changes that make the data unusable.
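The consistency difficulty may be illustrated with a minimal sketch of a sequential background copy. The data layout, the names `host_write` and `background_copy_pass`, and the `fail_after` parameter used to simulate a failure of the primary system are hypothetical and are not taken from any product.

```python
def background_copy_pass(primary, dirty, secondary, fail_after=None):
    """Sweep the tracks in order, copying each one modified since its last copy.

    fail_after simulates the primary failing after that many transfers.
    Returns the list of track numbers that were copied.
    """
    copied = []
    for track in range(len(primary)):
        if fail_after is not None and len(copied) >= fail_after:
            break  # simulated failure of the primary mid-sweep
        if dirty[track]:
            secondary[track] = primary[track]
            dirty[track] = False
            copied.append(track)
    return copied


primary = ["old"] * 8
secondary = list(primary)
dirty = [False] * 8


def host_write(track, data):
    """Host write: update the primary and mark the track as modified."""
    primary[track] = data
    dirty[track] = True


host_write(7, "first write")                    # written first
host_write(2, "second write, depends on first")  # written second
copied = background_copy_pass(primary, dirty, secondary, fail_after=1)
print(copied, secondary[2], secondary[7])
```

Because the sweep follows track order rather than write order, the secondary in this example receives the second write without the first write on which it depends.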
A solution to this is disclosed in U.S. Pat. No. 7,054,883 to Meiri et al., entitled “Virtual Ordered Writes for Multiple Storage Devices” (the '883 patent), which is incorporated herein by reference. The '883 patent describes an asynchronous data replication technique where data is initially accumulated in chunks at the source. A write by a device (host) is acknowledged when the written data is placed in the chunk. The chunks are stored in the cache of the source storage device(s) prior to being transmitted to the destination storage device(s). The system described in the '883 patent has the advantage of maintaining appropriate ordering of write operations (by placing data in chunks) while still providing the host with a relatively quick acknowledgement of writes.
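The general idea of accumulating writes in chunks may be sketched as follows. This is a simplified illustration only and is not the mechanism of the '883 patent. The class names, the sequence numbering, and the `cycle_switch` method are assumptions made for the example.

```python
class Destination:
    """Applies each chunk as a whole, and only in sequence order."""

    def __init__(self):
        self.tracks = {}
        self.last_applied = -1

    def apply(self, seq, chunk):
        if seq != self.last_applied + 1:
            raise ValueError("chunk received out of order")
        self.tracks.update(chunk)
        self.last_applied = seq


class ChunkedSource:
    """Accumulates host writes in a chunk held in cache at the source."""

    def __init__(self):
        self.seq = 0
        self.active = {}   # track -> data for the chunk being accumulated
        self.sealed = []   # (seq, chunk) pairs awaiting transmission

    def write(self, track, data):
        self.active[track] = data
        return "ack"       # acknowledged once the data is placed in the chunk

    def cycle_switch(self):
        """Close the current chunk and begin accumulating a new one."""
        self.sealed.append((self.seq, self.active))
        self.seq += 1
        self.active = {}

    def transmit(self, destination):
        """Send every closed chunk, oldest first."""
        while self.sealed:
            seq, chunk = self.sealed.pop(0)
            destination.apply(seq, chunk)


src, dst = ChunkedSource(), Destination()
src.write(7, "a")
src.write(2, "b")
src.cycle_switch()
src.write(7, "c")    # belongs to the next chunk, not yet transmitted
src.transmit(dst)
print(dst.tracks)
```

In this sketch the host is acknowledged without waiting for the link, while the destination only ever holds the result of complete chunks applied in order.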
In instances where the host(s) occasionally write data faster than the data can be transferred from the R1 device to the R2 device and/or faster than the data can be saved at the R2 device, data may accumulate in the cache of the R1 device and/or the R2 device. When this occurs, the excess data in the cache may be handled using a spillover mechanism, such as that disclosed in U.S. Pat. No. 7,624,229 to Longinov, entitled “Spillover Slot” (the '229 patent), which is incorporated by reference herein and which is assigned to EMC Corporation, the assignee of the present application. However, if the cache becomes full or close to full, then, regardless of whether a spillover mechanism is present, the asynchronous transfer operation may be terminated. This may occur even though the overall rate at which data is written may be much less than the rate at which data is transferred/saved. That is, a burst of data writes may cause the cache to overflow even though the system, on average, is capable of handling the rate at which data is written.
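The burst condition may be illustrated numerically. The following sketch assumes a hypothetical discrete-time model in which a fixed amount of data is transferred per time step. The function name, the workloads, and the capacity are invented for the example.

```python
def first_overflow_tick(writes_per_tick, transfer_per_tick, cache_capacity):
    """Return the first time step at which the backlog exceeds the cache.

    Returns None if the cache never overflows.
    """
    backlog = 0
    for tick, written in enumerate(writes_per_tick):
        # Data written this step joins the backlog; the link drains a fixed amount.
        backlog = max(0, backlog + written - transfer_per_tick)
        if backlog > cache_capacity:
            return tick
    return None


steady = [5] * 10                          # 5 units written per step
bursty = [0, 0, 0, 0, 50, 0, 0, 0, 0, 0]   # same average of 5 units per step

print(first_overflow_tick(steady, transfer_per_tick=10, cache_capacity=30))
print(first_overflow_tick(bursty, transfer_per_tick=10, cache_capacity=30))
```

Both workloads write, on average, half of what the link can transfer, yet only the bursty workload exceeds the cache capacity in this model.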
Write pacing techniques may be used to control the pacing of host data writes to a storage system in response to changing system conditions, including burst conditions. For example, in a storage system having multiple storage devices, e.g., R1 and R2 volumes in an EMC Symmetrix and/or other RDF product, host data writes may be paced according to various system conditions, such as a backlog of data at the R1 or R2 volumes and/or an amount of lag time at one or more of the volumes between receiving the write request and servicing the write request. Write pacing techniques are further discussed elsewhere herein and reference is made to, for example, U.S. Pat. No. 7,702,871 to Arnon et al., entitled “Write pacing,” which is incorporated herein by reference.
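One possible form of pacing based on backlog may be sketched as follows. The linear ramp, the threshold, and the maximum delay are assumptions made for illustration and do not describe the technique of the '871 patent or of any product.

```python
def pacing_delay_ms(backlog, threshold, capacity, max_delay_ms=50.0):
    """Delay to add to each host write, given the current backlog.

    No delay is added below the threshold. Above it, the delay grows
    linearly and reaches max_delay_ms when the backlog equals the capacity.
    """
    if backlog <= threshold:
        return 0.0
    fraction = min(1.0, (backlog - threshold) / (capacity - threshold))
    return max_delay_ms * fraction


for backlog in (10, 60, 200):
    print(backlog, pacing_delay_ms(backlog, threshold=20, capacity=100))
```

A response of this kind slows the host gradually as the backlog grows, rather than terminating the asynchronous transfer once the cache is full.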
Copy operations of a storage system may include generating point-in-time or snapshot copies that provide stored copies of data at various points in time. For example, the remote volume (R2) of an RDF product may store snapshot copies that are accessible to a user. Products for generating and storing point-in-time snapshot copies of data are produced by EMC Corporation, such as EMC SNAP, and reference is made, for example, to U.S. Pat. No. 7,113,945 to Moreshet et al., entitled “Virtual storage device that uses volatile memory,” and U.S. Pat. No. 7,340,489 to Vishlitzky et al., entitled “Virtual storage devices,” both of which are incorporated herein by reference.
In some circumstances, it may be desirable to minimize and/or otherwise optimize occurrences of write pacing for a host. In such situations, it may be advantageous to know the effect of events and/or operations, such as snapshot copy operations, on the storage system in connection with assessing an impact of write pacing that may occur as a result of the event and/or operation. Accordingly, it would be desirable to provide a tool and/or other system that simulates write pacing in connection with a requested operation for a storage system, such as a snap copy operation, to allow an analysis of the effect of an event on the storage system and any resulting write pacing operations.