1. Field
Embodiments of the invention relate to producing tertiary instant virtual copies without volume suspension.
2. Description of the Related Art
Computing systems often include one or more host computers (“hosts”) for processing data and running application programs, Direct Access Storage Devices (DASDs) for storing data, and a storage controller for controlling the transfer of data between the hosts and the DASDs. Storage controllers may also be referred to as control units or storage directors.
Disaster recovery systems typically address two types of failures: a sudden catastrophic failure at a single point in time, or data loss over a period of time. In the second type of failure, a gradual disaster, updates to volumes (e.g., Logical Unit Numbers, Logical Devices, etc.) that store data may be lost. To assist in recovery of data updates, a copy of data may be provided at a remote location. Such dual or shadow copies (also referred to as secondary copies) are typically made as the application system is writing new data to a primary storage device. In some systems, a third copy (i.e., a tertiary copy) is maintained (e.g., in case the shadow copy is corrupted). International Business Machines Corporation (IBM), the assignee of the subject patent application, provides backup systems for maintaining remote copies of data at a secondary storage device, including extended remote copy (XRC®) and peer-to-peer remote copy (PPRC®).
These backup systems provide a technique for recovering data updates between a last, safe backup and a system failure. Such data shadowing systems can also provide an additional remote copy for non-recovery purposes, such as local access at a remote site.
In such backup systems, data is maintained in volume pairs. A volume pair comprises a volume in a primary storage device and a corresponding volume in a secondary storage device that includes a consistent copy of the data maintained in the primary volume. Typically, the primary volume of the pair is maintained in a primary direct access storage device (DASD) and the secondary volume of the pair is maintained in a secondary DASD shadowing the data on the primary DASD. A primary storage controller may be provided to control access to the primary DASD and a secondary storage controller may be provided to control access to the secondary DASD.
In the IBM XRC® backup system, the application system writing data to the primary volumes includes a sysplex timer which provides a time-of-day (TOD) value as a time stamp to data writes. The application system time stamps data sets when writing such data sets to volumes in the primary DASD. The integrity of data updates is related to ensuring that updates are done at the secondary volumes in the volume pair in the same order as they were done on the primary volume. In the XRC® backup system, the time stamp provided by the application program determines the logical sequence of data updates. In many application programs, such as database systems, certain writes cannot occur unless a previous write occurred; otherwise the data integrity would be jeopardized. Such a data write whose integrity is dependent on the occurrence of previous data writes is known as a dependent write. For instance, if a customer opens an account, deposits $400, and then withdraws $300, the withdrawal update to the system is dependent on the occurrence of the other writes, the opening of the account and the deposit. When such dependent transactions are copied from the primary volumes to secondary volumes, the transaction order must be preserved to maintain the integrity of the dependent write operation.
Volumes in the primary and secondary DASDs are consistent when all writes have been transferred in their logical order, i.e., each write is transferred before any write that depends on it. In the banking example, this means that the deposit is written to the secondary volume before the withdrawal. A consistency group is a collection of related volumes that need to be kept in a consistent state. A consistency transaction set is a collection of updates to the primary volumes such that dependent writes are secured in a consistent manner. For instance, in the banking example, in order to maintain consistency, the withdrawal transaction needs to be in the same consistency transaction set as the deposit or in a later consistency transaction set; the withdrawal cannot be in an earlier consistency transaction set. Consistency groups maintain data consistency across volumes. For instance, if a failure occurs, the deposit will be written to the secondary volume before the withdrawal. Thus, when data is recovered from the secondary volumes, the recovered data will be consistent.
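By way of illustration only, the ordering requirement of the banking example may be sketched as follows. The names `Update` and `apply_in_logical_order`, the integer time stamps, and the volume names are hypothetical and are not part of any XRC® interface; the sketch merely shows updates being applied to a secondary in time stamp order so that a dependent write never precedes the writes on which it depends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    timestamp: int      # hypothetical stand-in for the time-of-day time stamp
    volume: str         # hypothetical volume name
    description: str


def apply_in_logical_order(updates, secondary):
    """Append updates to the secondary log in time stamp order."""
    for update in sorted(updates, key=lambda u: u.timestamp):
        secondary.append(update)
    return secondary


# Updates may arrive out of order and may span several volumes.
updates = [
    Update(3, "VOL1", "withdraw $300"),
    Update(1, "VOL1", "open account"),
    Update(2, "VOL2", "deposit $400"),
]
secondary_log = apply_in_logical_order(updates, [])
```

In this sketch the withdrawal is applied last even though it was received first, which mirrors the requirement that the deposit reach the secondary volume before the withdrawal.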
A consistency time is a time the system derives from the time stamps that the application system applies to the data sets. A consistency group has a consistency time for all data writes in a consistency group having a time stamp equal to or earlier than the consistency time stamp. In the IBM XRC® backup system, the consistency time is the latest time to which the system guarantees that updates to the secondary volumes are consistent. As long as the application program is writing data to the primary volume, the consistency time increases. However, if update activity ceases, then the consistency time does not change as there are no data sets with time stamps to provide a time reference for further consistency groups. If all the records in the consistency group are written to secondary volumes, then the reported consistency time reflects the latest time stamp of all records in the consistency group. Techniques for maintaining the sequential consistency of data writes and forming consistency groups to maintain sequential consistency in the transfer of data between a primary DASD and secondary DASD are described in U.S. Pat. Nos. 5,615,329 and 5,504,861, which are assigned to International Business Machines Corporation, the assignee of the subject patent application, and which are incorporated herein by reference in their entirety.
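The relationship between a consistency group and its consistency time may be sketched, under simplifying assumptions, as follows. The functions `form_consistency_group` and `consistency_time`, the cutoff value, and the integer time stamps are hypothetical; the sketch does not reproduce the techniques of the referenced patents, and only illustrates that the reported time is the latest time stamp of the applied records and does not advance when no updates arrive.

```python
def form_consistency_group(pending, cutoff):
    """Split pending (time_stamp, record) updates at a hypothetical cutoff.

    Updates stamped at or before the cutoff form the group, in time stamp
    order; later updates remain pending for a subsequent group.
    """
    group = sorted((u for u in pending if u[0] <= cutoff), key=lambda u: u[0])
    remaining = [u for u in pending if u[0] > cutoff]
    return group, remaining


def consistency_time(previous_time, applied_group):
    """Return the latest time stamp applied, or the previous time if none."""
    if not applied_group:
        return previous_time  # no update activity: the time does not change
    return max(ts for ts, _ in applied_group)


pending = [(5, "c"), (2, "a"), (9, "d"), (3, "b")]
group, remaining = form_consistency_group(pending, cutoff=5)
reported = consistency_time(0, group)
idle = consistency_time(reported, [])
```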
A number of DASD subsystems are capable of performing “instant virtual copy” operations, also referred to as “fast replicate functions.” Instant virtual copy operations work by modifying metadata such as relationship tables or pointers to treat a primary data object as both the original and copy. In response to a host's copy request, the storage subsystem immediately reports creation of the copy without having made any physical copy of the data. Only a “virtual” copy has been created, and the absence of an additional physical copy is completely unknown to the host.
Later, when the storage system receives updates to the original or copy, the updates are stored separately and cross-referenced to the updated data object only. At this point, the original and copy data objects begin to diverge. The initial benefit is that the instant virtual copy occurs almost instantaneously, completing much faster than a normal physical copy operation. This frees the host and storage subsystem to perform other tasks. The host or storage subsystem may even proceed to create an actual, physical copy of the original data object during background processing, or at another time.
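The behavior described above may be sketched as follows, assuming a simplified model in which tracks are dictionary entries. The class `InstantCopyPair` and its methods are hypothetical and do not represent any storage subsystem's interface; the sketch only shows that creating the copy moves no data, and that later updates are stored separately and cross-referenced to the updated object only.

```python
class InstantCopyPair:
    """Hypothetical model of an original and its instant virtual copy."""

    def __init__(self, tracks):
        # Both objects initially refer to the same underlying data;
        # no physical copy is made when the pair is created.
        self._shared = dict(tracks)
        self._updates = {"original": {}, "copy": {}}

    def write(self, which, track, data):
        # The update is recorded against the updated object only.
        self._updates[which][track] = data

    def read(self, which, track):
        # Prefer the object's own update, else fall back to the shared data.
        return self._updates[which].get(track, self._shared[track])


pair = InstantCopyPair({0: "A", 1: "B"})
pair.write("original", 0, "A2")   # original diverges on track 0
pair.write("copy", 1, "B2")       # copy diverges on track 1
```

After the two writes, each object sees its own update while the other object continues to see the point-in-time data.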
One such instant virtual copy operation is known as a FlashCopy® operation. A FlashCopy® operation involves establishing a logical point-in-time relationship between primary and secondary volumes on the same or different devices. The FlashCopy® operation guarantees that until a track in a FlashCopy® relationship has been hardened to its location on the secondary disk, the track resides on the primary disk. A relationship table is used to maintain information on all existing FlashCopy® relationships in the subsystem. During the establish phase of a FlashCopy® relationship, one entry is recorded in the primary and secondary relationship tables for the primary and secondary that participate in the FlashCopy® being established. Each added entry maintains all the required information concerning the FlashCopy® relationship. Both entries for the relationship are removed from the relationship tables when all FlashCopy® tracks from the primary extent have been physically copied to the secondary extents or when a withdraw command is received. In certain cases, even though all tracks have been copied from the primary extent to the secondary extent, the relationship persists.
The secondary relationship table further includes a bitmap that identifies which tracks involved in the FlashCopy® relationship have not yet been copied over and are thus protected tracks. Each track in the secondary device is represented by one bit in the bitmap. The secondary bit is set when the corresponding track is established as a secondary track of a FlashCopy® relationship. The secondary bit is reset when the corresponding track has been copied from the primary location and destaged to the secondary device due to writes on the primary or the secondary device, or a background copy task.
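The bitmap described above may be sketched as follows. The class `SecondaryBitmap` and its method names are hypothetical and chosen for illustration; the sketch assumes one bit per secondary track, set when the track joins a relationship and reset once the track has been copied and destaged.

```python
class SecondaryBitmap:
    """Hypothetical per-track bitmap for a secondary device."""

    def __init__(self, num_tracks):
        self.bits = [0] * num_tracks

    def establish(self, tracks):
        # Set the bit for each track established as a secondary track.
        for track in tracks:
            self.bits[track] = 1

    def mark_copied(self, track):
        # Reset the bit once the track has been copied and destaged.
        self.bits[track] = 0

    def protected_tracks(self):
        # Tracks whose bits are still set have not yet been copied over.
        return [t for t, bit in enumerate(self.bits) if bit]


bitmap = SecondaryBitmap(4)
bitmap.establish(range(1, 4))
before = bitmap.protected_tracks()
bitmap.mark_copied(2)
after = bitmap.protected_tracks()
```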
Further details of the FlashCopy® operations are described in commonly assigned U.S. Pat. No. 6,611,901, which is incorporated herein by reference in its entirety.
Once the logical relationship is established, hosts may then have immediate access to data on the primary and secondary volumes, and the data may be copied as part of a background operation. A read to a track that is a secondary in a FlashCopy® relationship and not in cache triggers a stage intercept, which causes the primary track corresponding to the requested secondary track to be staged to secondary cache at the secondary storage device when the primary track has not yet been copied over and before access is provided to the track from the secondary cache. This ensures that the secondary has the copy from the primary that existed at the point-in-time of the FlashCopy® operation. Further, any destage to a track on the primary device that has not been copied over triggers a destage intercept, which causes the track on the primary device to be copied to the secondary device.
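The stage and destage intercepts may be sketched as follows, under the simplifying assumption that cache and disk are not modeled separately. The class `PointInTimePair` and its methods are hypothetical and are not FlashCopy® interfaces; the sketch only illustrates that a secondary read of an uncopied track is served from the primary, and that a primary write to an uncopied track first preserves the point-in-time data on the secondary.

```python
class PointInTimePair:
    """Hypothetical primary/secondary pair in a point-in-time relationship."""

    def __init__(self, primary_tracks):
        self.primary = list(primary_tracks)
        self.secondary = [None] * len(self.primary)
        self.not_copied = [True] * len(self.primary)  # protected tracks

    def read_secondary(self, track):
        if self.not_copied[track]:
            # Stage intercept: the data is obtained from the primary track.
            return self.primary[track]
        return self.secondary[track]

    def write_primary(self, track, data):
        if self.not_copied[track]:
            # Destage intercept: copy the old data to the secondary first.
            self.secondary[track] = self.primary[track]
            self.not_copied[track] = False
        self.primary[track] = data


pair = PointInTimePair(["a", "b"])
pair.write_primary(0, "a2")   # triggers the destage intercept for track 0
```

In this sketch the secondary continues to return the data that existed when the relationship was established, regardless of later writes to the primary.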
Instant virtual copy techniques have been developed, at least in part, to quickly create a duplicate copy of data without interrupting or slowing foreground processes. Instant virtual copy techniques, such as a FlashCopy® operation, provide a point-in-time copy tool.
In disaster recovery scenarios, the data restored from the secondary storage device needs to be consistent to provide value to a customer. Currently, users of IBM XRC® backup systems suspend mirroring of volumes (i.e., stop applying updates from the primary copy to the secondary (shadow) copy) in order to create a consistent tertiary copy of data for disaster recovery tests and other purposes. The process of suspending the mirroring, performing a FlashCopy® operation from the secondary copy to the tertiary copy, and resynchronizing the secondary copy with the primary copy may take a long time to complete. During this period, the secondary copies do not contain a usable copy of the data, and, should a disaster occur, recovery would have to be done with older data in the tertiary copies. The age of the tertiary data will likely exceed the user's recovery point objective (i.e., the amount of acceptable data loss after disaster recovery).
Thus, there is a need for producing tertiary instant virtual copies without volume suspension.