1. Field of the Invention
The invention relates to data replication means and methods. More particularly, the invention relates to an apparatus, system and method for replicating a secondary volume of a mirrored volume pair to a backup volume.
2. Description of the Related Art
It is well known that during operation a CPU may update one or more data storage volumes in an attached storage subsystem. It is further known that replication of data storage volumes is a frequently used strategy for maintaining continuously available information systems in the presence of system level faults or failures. Among several replication techniques, mirroring is often favored over point-in-time copying in that a data mirror is continuously updated and may be quickly substituted for an unavailable primary volume.
Data mirroring involves maintaining identical copies of data on a primary volume and a secondary volume. Volume-to-volume mirroring from a primary volume to a secondary volume may be accomplished either synchronously (in real time) or asynchronously (at selected occasions or intervals). In either case, the primary volume is typically available for use by a host processor and the secondary volume is offline.
Referring to FIG. 1, a prior art peer-to-peer remote copy (PPRC) system 100 is illustrated. The PPRC system 100 is one example of a synchronously mirrored system and includes a primary storage system 110 and a secondary storage system 120. A host 130 is connected to the primary storage system 110. The host 130 stores data by sending write requests to the primary storage system 110.
Data written to the primary storage system 110 is copied to the secondary storage system 120, creating a mirror image of the data residing on the primary storage system 110 on the secondary storage system 120. In the PPRC system 100, a write made by the host 130 is considered complete only after the data written to the primary storage system 110 is also written to the secondary storage system 120. The host 130 may take various forms, such as a server on a network, a Web server on the Internet, or a mainframe computer. In the depicted example, the primary storage system 110 and secondary storage system 120 are disk systems.
A communication path 140 connects the host 130 to the primary storage system 110. A communication path 150 connects the primary storage system 110 with the secondary storage system 120. The communication paths 140/150 may comprise various links, such as fiber optic lines, packet switched communication links, enterprise systems connection (ESCON) fibers, small computer system interface (SCSI) cables, and wireless communication links.
The primary storage system 110 includes at least one storage volume 160 typically referred to as a primary volume and other well-known components such as a controller, cache, and non-volatile storage. The secondary storage system 120 includes at least one storage volume 170, typically referred to as a secondary volume. The primary volume 160 and secondary volume 170 are set up in PPRC pairs. PPRC pairs are synchronous mirror sets in which a storage volume in the primary storage system 110 has a corresponding storage volume in the secondary storage system 120 with data that is identical. This pair is referred to as an established PPRC pair or synchronous mirror set.
In operation, each time a write request is sent to the primary volume 160 by the host 130, the primary storage system 110 stores the data on the primary volume 160 and also sends the data over the communication path 150 to the secondary storage system 120. The secondary storage system 120 then copies the data to the secondary volume 170 to form a mirror of the primary volume 160.
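The synchronous write path described above may be sketched, by way of illustration only, as follows. The class and method names are hypothetical and do not correspond to any actual PPRC interface; volumes are modeled as simple in-memory maps from track number to data.

```python
class Volume:
    """Toy storage volume: maps a track number to the data stored there."""

    def __init__(self):
        self.tracks = {}

    def write(self, track, data):
        self.tracks[track] = data


class SecondaryStorageSystem:
    """Hypothetical stand-in for the secondary storage system 120."""

    def __init__(self):
        self.secondary_volume = Volume()

    def receive(self, track, data):
        # Copy the data to the secondary volume, then acknowledge.
        self.secondary_volume.write(track, data)
        return True


class PrimaryStorageSystem:
    """Hypothetical stand-in for the primary storage system 110."""

    def __init__(self, secondary):
        self.primary_volume = Volume()
        self.secondary = secondary

    def write(self, track, data):
        # Store locally, then forward over the (simulated) communication path.
        self.primary_volume.write(track, data)
        acknowledged = self.secondary.receive(track, data)
        # The host's write is complete only once the secondary has the data.
        return acknowledged


secondary_system = SecondaryStorageSystem()
primary_system = PrimaryStorageSystem(secondary_system)
completed = primary_system.write(track=7, data=b"payload")
```

In this sketch the call to `write` does not return until the secondary copy has been made, which mirrors the completion rule of the synchronous arrangement.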
FIG. 2 depicts a prior art asynchronously mirrored data system 200 including a host 210, one or more application programs 220, and a data mover 230. A primary storage system 240 is connected to the host 210 by one or more channels, for example, fiber optic channels. At least one primary volume 250 is contained within or connected to the primary storage system 240.
A secondary storage system 260 is connected to the host 210 by one or more channels or alternatively by a communication link. Contained within or connected to the secondary storage system 260 is at least one secondary volume 270. In some systems, a direct communication link may be established between the primary storage system 240 and the secondary storage system 260. In such systems, the data mover 230 may reside within the primary storage system 240.
The asynchronously mirrored data system 200 collects data from the primary storage system 240 so that all write requests from the host 210 to the primary volume 250 are preserved and applied to the secondary volume 270 without significantly impacting access rates for the host 210. The data and control information transmitted to the secondary storage system 260 are sufficient such that the presence of the primary storage system 240 is no longer required to preserve data integrity.
The application programs 220 generate write requests, which update data on the primary volume 250. The locations of the data updates are tracked by the primary storage system 240. Often, updates to the primary volume 250 are tracked on a track-by-track basis. A two-dimensional array of bits (a bit map), often referred to as an active track array or changed track array, is typically used to keep a real-time record of tracks on the primary volume that have been changed since the last synchronization. The changed track array is maintained in the primary storage system 240. The primary storage system 240 may group the updates and conduct a synchronization session to provide the updates to the data mover 230. The updates are transmitted from the data mover 230 to the secondary storage system 260, which writes the updates to the secondary volume 270.
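A changed track array and a synchronization session of the kind described above may be sketched as follows. This is a minimal illustration under stated assumptions: the bit map is indexed by cylinder and head (an assumed layout, since the indexing is not specified here), the volumes are plain dictionaries, and the names `ChangedTrackArray`, `host_write`, and `synchronize` are hypothetical.

```python
class ChangedTrackArray:
    """Two-dimensional bit map; a set bit marks a track changed since the last sync."""

    def __init__(self, cylinders, heads):
        # Assumed layout: one row per cylinder, one bit per head.
        self.bits = [[False] * heads for _ in range(cylinders)]

    def mark(self, cylinder, head):
        self.bits[cylinder][head] = True

    def drain(self):
        """Return the changed (cylinder, head) pairs and clear their bits."""
        changed = [
            (c, h)
            for c, row in enumerate(self.bits)
            for h, bit in enumerate(row)
            if bit
        ]
        for c, h in changed:
            self.bits[c][h] = False
        return changed


def synchronize(primary, secondary, changed_tracks):
    """Copy only the tracks flagged as changed; return how many were copied."""
    updates = changed_tracks.drain()
    for track in updates:
        secondary[track] = primary[track]
    return len(updates)


primary = {}
secondary = {}
changed = ChangedTrackArray(cylinders=4, heads=2)


def host_write(cylinder, head, data):
    # The write completes against the primary only; the bit map records it.
    primary[(cylinder, head)] = data
    changed.mark(cylinder, head)


host_write(0, 1, b"a")
host_write(3, 0, b"b")
stale = secondary != primary  # the secondary lags until a session runs
copied = synchronize(primary, secondary, changed)
```

Because only flagged tracks are copied, the cost of a session in this sketch scales with the number of tracks changed rather than with the size of the volume.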
Asynchronous mirroring has minimal impact on the access rate between the host 210 and the primary storage system 240 because a subsequent I/O operation may start directly after receiving acknowledgement that data has been written to the primary volume 250. While write requests may occur as demanded by the application programs 220, synchronization of the secondary volume 270 is an independent, asynchronous event. For example, synchronization sessions may be scheduled periodically throughout the day as directed by settings managed by a system administrator, typically several times per hour. Thus, the asynchronous secondary volume 270 may be only rarely identical to the primary volume 250, since additional write requests to the primary volume 250 may occur during the copy operation necessary to synchronize the secondary volume 270.
In some systems, both synchronous and asynchronous data mirror pairs are maintained. This configuration permits rapid promotion of a synchronous mirror system to become a replacement primary storage system in the event that the original primary storage system becomes unavailable. The configuration also provides for the maintenance of a nearly real-time remote copy of the primary storage system data for use if the primary site becomes unavailable. In this configuration, the storage volumes on the primary storage system may act as the primary volumes for both the synchronously mirrored volumes and asynchronously mirrored volumes.
In disk mirroring environments, system administrators may desire to create a point-in-time archive or backup copy. In order to minimize the effect on system performance, it is desirable to use the secondary volume as the data source for the copy while allowing the host to access the primary volume in a normal fashion. However, since the secondary volume is an exact copy of the primary volume, the volume identifier is the same on both the primary volume and the secondary volume. The secondary volume cannot be brought online to perform the copy since doing so would introduce duplicate volume identifiers on the system.
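The duplicate identifier conflict described above may be illustrated with the following minimal sketch. It assumes a host that keeps a registry of online volumes keyed by volume identifier; the `Host` class, the `DuplicateVolumeIdError` exception, and the identifier "VOL001" are hypothetical and are not drawn from any particular operating system.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    volume_id: str
    role: str


class DuplicateVolumeIdError(Exception):
    """Raised when a volume with an identifier already online is brought online."""


class Host:
    """Hypothetical host that allows one online volume per identifier."""

    def __init__(self):
        self.online = {}

    def bring_online(self, volume):
        if volume.volume_id in self.online:
            raise DuplicateVolumeIdError(volume.volume_id)
        self.online[volume.volume_id] = volume


host = Host()
primary = Volume("VOL001", "primary")
# The mirror is an exact copy, so it carries the same identifier.
secondary = Volume("VOL001", "secondary")

host.bring_online(primary)
try:
    host.bring_online(secondary)
    rejected = False
except DuplicateVolumeIdError:
    rejected = True
```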
In order to back up a mirrored volume pair, the user may bring the secondary volume online to a different system and perform the backup operation on that system. This method eliminates the problem of duplicate volume identifiers. Nevertheless, since multiple systems are required to perform the backup, the solution typically necessitates the purchase of another system.
Alternatively, the user may change the volume identifier of the secondary volume, then bring the secondary volume online to the same system as the primary volume and use the renamed secondary volume as the data source for the copy. A disadvantage of this solution is that the backup or archive volume does not have the original secondary volume identifier. During a restore operation, the user is required to remember the original volume identifier of the secondary volume and manually rename the restored volume with the original volume identifier after the restore operation. This procedure is error-prone and often results in system downtime.
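The drawback of the rename approach may be illustrated with the following sketch. It assumes that a backup copies a volume together with whatever identifier the volume carries at the time of the copy; the functions `backup` and `restore` and the identifiers "VOL001" and "TMP001" are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Volume:
    volume_id: str
    data: bytes


def backup(volume):
    """Copy the volume, including its current identifier."""
    return replace(volume)


def restore(archive_volume):
    """Recreate a volume from the archive exactly as it was archived."""
    return replace(archive_volume)


secondary = Volume("VOL001", b"mirror data")

# Workaround: rename the secondary so it can come online beside the primary.
renamed = replace(secondary, volume_id="TMP001")
archive = backup(renamed)

# The archive, and anything restored from it, carries the temporary identifier.
restored = restore(archive)
needs_manual_rename = restored.volume_id != secondary.volume_id

# The user must remember "VOL001" and reapply it by hand after the restore.
fixed = replace(restored, volume_id="VOL001")
```

In this sketch nothing in the archive records the original identifier, so the correctness of the restore depends entirely on the manual rename step.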
Given the aforementioned alternatives, a need exists for an apparatus, method, and system to replicate a secondary volume of a mirrored volume pair including the volume identifier on a backup storage volume. Beneficially, such an apparatus, method, and system would simplify the creation of a point-in-time backup on a mirrored system and decrease the probability of error in restoring the backup.