1. Field of the Invention
The present invention relates to a method, system, and program for migrating data to a replacement storage using a multi-storage volume swap.
2. Description of Related Art
Data backup systems can provide continuous availability of production data in the event of a sudden catastrophic failure at a single point in time or data loss over a period of time. In one such disaster recovery system, production data is replicated from a local site to a remote site which may be separated geographically by several miles from the local site. Such dual, mirror or shadow copies are typically made in a secondary storage device at the remote site, as the application system is writing new data to a primary storage device usually located at the local site. Different data replication technologies may be used for maintaining remote copies of data at a secondary site, such as International Business Machines Corporation's (“IBM”) Metro Mirror Peer to Peer Remote Copy (PPRC), Extended Remote Copy (XRC), Coupled XRC (CXRC), Global Copy, and Global Mirror Copy.
In data mirroring systems, data is typically maintained in volume pairs, comprising a primary volume in a primary storage device and a corresponding secondary volume in a secondary storage device that includes an identical copy of the data maintained in the primary volume. The primary and secondary volumes are identified by a copy relationship in which the data of the primary volume, also referred to as the source volume, is copied to the secondary volume, also referred to as the target volume. Primary and secondary storage controllers may be used to control access to the primary and secondary storage devices. A source may have multiple targets in a multi-target configuration.
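By way of illustration only, the volume pair and copy relationship described above may be sketched as a small data model. The sketch below is a simplified assumption and not any product's interface: the names `Volume`, `CopyRelationship`, and `mirrored_write` are hypothetical, and a volume is modeled as nothing more than a mapping from track number to data.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical volume model: a name plus a track-number -> data mapping."""
    name: str
    tracks: dict = field(default_factory=dict)


@dataclass
class CopyRelationship:
    """Identifies a source (primary) volume and its target (secondary) volume."""
    source: Volume
    target: Volume

    def is_identical(self) -> bool:
        # True when the target holds an identical copy of the source data.
        return self.source.tracks == self.target.tracks


def mirrored_write(relationships, track, data):
    """Apply one host write to the source and target of every relationship.

    Relationships may share a source, which models a multi-target configuration.
    """
    for rel in relationships:
        rel.source.tracks[track] = data
        rel.target.tracks[track] = data


primary = Volume("primary-0001")
relationships = [
    CopyRelationship(primary, Volume("secondary-0001")),
    CopyRelationship(primary, Volume("secondary-0002")),  # second target
]
mirrored_write(relationships, track=7, data=b"payload")
print(all(rel.is_identical() for rel in relationships))
```

In this sketch the two relationships share one source volume, corresponding to the multi-target configuration mentioned above.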
Tivoli Productivity Center for Replication is an example of an application that customers may use to manage planned and unplanned outages. The Tivoli Productivity Center for Replication application can detect failures at the primary storage system which may be at a local site, for example. Such failures may include a problem writing or accessing primary storage volumes at the local site. When the Tivoli Productivity Center for Replication recovery application detects that a failure has occurred, it can invoke a multi-storage volume swapping function, an example of which is the IBM HyperSwap® function. This function may be used to automatically swap processing for all volumes in the mirrored configuration from the local site to the remote site. As a consequence of the swap, the storage volumes at the remote site which were originally configured as the secondary volumes of the original copy relationship, are reconfigured as the primary volumes of a new copy relationship. Similarly, the storage volumes at the local site which were originally configured as the primary volumes of the original copy relationship, may be reconfigured as the secondary volumes of the new copy relationship, once the volumes at the local site are operational again.
In connection with the swapping function, a failover function may be invoked. In the Tivoli Productivity Center for Replication recovery application, the failover function can, in some instances, obviate performing a full copy when re-establishing data replication in the opposite direction, that is, from the remote site back to the local site. More specifically, the failover processing resets or reconfigures the remote storage devices (which were originally configured as the secondary storage devices) to be the primary storage devices, which are placed in a “suspended” status pending resumption of the mirroring operation but in the opposite direction. In the meantime, the failover processing starts change recording for any subsequent data updates made by the host to the remote site.
Once the local site is operational, failback processing may be invoked to reset the storage devices at the local site (which were originally configured as the primary storage devices) to be the secondary storage devices. Mirroring may then be resumed (but in the opposite direction, that is remote to local rather than local to remote) to resynchronize the secondary storage devices (originally the primary storage devices) at the local site to the data updates being stored at the primary storage devices (originally the secondary storage devices) at the remote site. Once data synchronization is complete, the HyperSwap® return operation can reestablish paths from the storage systems at the local site which were the original primary storage systems to the storage systems at the remote site which were the original secondary storage systems, and finish the failback operation to restore the storage devices at the local site as the primary storage devices of the pair. Mirroring may then be resumed (but in the original direction, that is local to remote rather than remote to local) to synchronize the secondary storage devices (that is, the original secondary storage devices) at the remote site to the data updates being stored at the primary storage devices (that is, the original primary storage devices) at the local site. Again, a full recopy of storage devices may be avoided.
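The reason a full recopy may be avoided can be illustrated with a minimal sketch of change recording. The model below is an assumption for illustration only and does not reflect the actual HyperSwap® or Tivoli Productivity Center for Replication implementation: the class `MirroredPair` and its methods are hypothetical, and the sketch assumes both sites held identical data at the moment of failover.

```python
class MirroredPair:
    """Hypothetical model of one volume pair spanning a local and a remote site."""

    def __init__(self, tracks):
        # Both sites start with identical data (the pair is synchronized).
        self.site = {"local": dict(tracks), "remote": dict(tracks)}
        self.primary = "local"
        self.suspended = False
        self.changed = set()  # change recording: tracks updated while suspended

    @property
    def secondary(self):
        return "remote" if self.primary == "local" else "local"

    def host_write(self, track, data):
        self.site[self.primary][track] = data
        if self.suspended:
            self.changed.add(track)  # remember the update for a later resync
        else:
            self.site[self.secondary][track] = data  # normal mirroring

    def failover(self):
        # The former secondary becomes a suspended primary; recording starts.
        self.primary = self.secondary
        self.suspended = True
        self.changed.clear()

    def failback(self):
        # Resynchronize by copying only the recorded tracks, not the whole volume.
        src, dst = self.site[self.primary], self.site[self.secondary]
        copied = len(self.changed)
        for track in self.changed:
            dst[track] = src[track]
        self.changed.clear()
        self.suspended = False
        return copied

    def swap_back(self):
        # Restore the original direction once both sites hold the same data.
        if self.suspended or self.site["local"] != self.site["remote"]:
            raise RuntimeError("pair must be synchronized before swapping back")
        self.primary = self.secondary


pair = MirroredPair({t: b"old" for t in range(1000)})
pair.failover()                      # remote site takes over as primary
for t in (3, 42, 99):
    pair.host_write(t, b"new")       # updates recorded at the remote site
copied = pair.failback()             # only the recorded tracks are copied
pair.swap_back()                     # local site is primary again
```

In this example only three of the one thousand tracks are copied during failback, which is the sense in which a full recopy is avoided.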
In various situations, it may be appropriate to switch one or more volumes of the primary or source storage to corresponding volumes of a different source storage without significantly impacting the users' production work. For example, the user may wish to migrate the source storage to a new storage system, or to a different storage system, in order to improve overall performance or for reconfiguration purposes. FIG. 1 shows an example of a typical data replication session in which data on volumes 10 of a first storage control unit 20a is being replicated on corresponding volumes 10 of a second data storage control unit 20b in an ongoing data replication process represented by arrows 30. In addition, the data on the volumes 10 of the first storage control unit 20a is being migrated to corresponding volumes 10 of a third data storage control unit 20c in an ongoing data migration process represented by arrows 40.
Various products are available for migrating data from an existing storage system to a new storage system with little or no disruption to ongoing input/output (I/O) operations or to a disaster recovery capability which may be provided in case of a failure over the course of the data migration. Examples of such data migration products include TDMF (Transparent Data Migration Facility) by IBM Corporation and FDRPAS by Innovation Data Processing. However, if the volumes 10 of the storage control units 20a and 20b are part of an existing storage replication session such as the data replication process 30, for example, volumes 10 of a fourth control unit, such as storage control unit 20d, have typically been provided in order to assure that failover capability is maintained.
Thus, in a typical migration process in which data stored on volumes 10 of storage control unit 20a are migrated from storage control unit 20a to corresponding volumes 10 of the storage control unit 20c, a fourth storage control unit 20d is typically provided, and a data replication process as indicated by the arrows 50 is started before the migration process 40, to replicate the data which may initially be stored on the storage control unit 20c, to the storage control unit 20d. The initial portion of the data replication process 50 includes configuring the volumes 10 of the storage control unit 20d to correspond to the volumes 10 of the storage control unit 20c which in turn have been configured to correspond to the volumes 10 of the storage control unit 20a, the source of the data to be migrated.
Thus, the overall migration process typically includes a wait for the two new storage control units 20c and 20d to reach full duplex status, that is, a wait until the configurations of the storage volumes 10 of the storage control unit 20c in the copy relationships of the storage control units 20c and 20d have been replicated in the volumes 10 of the storage control unit 20d, and the data on those configured volumes 10 is identical to the initial data stored on the storage control unit 20c. At this point of the overall process, the migration process 40 is typically started using a migration product such as TDMF or FDRPAS, for example. The migration product starts copying data from storage control unit 20a to storage control unit 20c. Once the data migration product has copied most of the data from storage control unit 20a to storage control unit 20c, it quiesces I/O to storage control unit 20a, copies the remaining changes (data writes made to storage control unit 20a) from storage control unit 20a to storage control unit 20c, and then swaps I/O requests to go to storage control unit 20c.
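The copy, quiesce, and swap sequence performed by such a migration product may be sketched as follows. This is a simplified illustration under stated assumptions and is not the TDMF or FDRPAS implementation: the class `Migration` and its methods are hypothetical, and each storage control unit is reduced to a dictionary mapping track numbers to data.

```python
class Migration:
    """Hypothetical model of migrating one volume from a source to a target."""

    def __init__(self, source, target):
        self.source = source      # e.g. a volume on storage control unit 20a
        self.target = target      # e.g. a volume on storage control unit 20c
        self.active = source      # the volume currently receiving host I/O
        self.quiesced = False
        self.dirty = set()        # source tracks written since the bulk copy

    def host_write(self, track, data):
        if self.quiesced:
            raise RuntimeError("I/O is quiesced")
        self.active[track] = data
        if self.active is self.source:
            self.dirty.add(track)  # must be recopied before the swap

    def bulk_copy(self):
        # Copy the bulk of the data while host I/O continues to the source.
        self.target.update(self.source)
        self.dirty.clear()

    def cut_over(self):
        # Quiesce I/O, copy the remaining changes, then swap I/O to the target.
        self.quiesced = True
        remaining = len(self.dirty)
        for track in self.dirty:
            self.target[track] = self.source[track]
        self.dirty.clear()
        self.active = self.target
        self.quiesced = False
        return remaining


migration = Migration(source={0: b"a", 1: b"b"}, target={})
migration.bulk_copy()
migration.host_write(1, b"B")       # write arriving after the bulk copy
migration.host_write(2, b"c")
remaining = migration.cut_over()    # copies the two changed tracks, then swaps
migration.host_write(3, b"d")       # now directed to the target volume
```

After the cut-over, host writes reach only the target volume, so the source volume no longer receives updates.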
A data replication process such as the data replication process 30 may frequently involve many copy relationship pairs, often numbering in the thousands. Hence, in a typical data migration, a relatively small number of source volumes of the control unit 20a are selected at a time for migration to the new storage control unit 20c. Accordingly, the copy relationship pairs of those source volumes of the storage control unit 20a for migration are typically first added manually to the existing replication session represented by the process 50. The replication session process 50 is started (or restarted) and a wait is incurred until the added copy relationship pairs reach full duplex status in the replication process 50. Once full duplex status has been achieved for the added copy relationship pairs, the migration process 40 is started (or restarted) for the selected source volumes of the control unit 20a and another wait is typically incurred for the migration product to swap the selected volumes 10 from storage control unit 20a to the storage control unit 20c. Once that swap is complete, the selected copy relationship pairs for the volumes 10 in the storage control unit 20a and the storage control unit 20b are removed from the replication process 30 and their relationships terminated. This process is repeated until all source volumes of the storage control unit 20a to be migrated have been selected and processed as described above.
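The repeated batch sequence described above may be summarized in the following sketch, which tracks only which copy relationship pairs belong to which replication session. It is an illustrative assumption rather than an actual procedure of any product: the function `migrate_in_batches` and the pair naming are hypothetical, the new pairs are modeled as running from storage control unit 20c to storage control unit 20d, and the waits are merely recorded as steps instead of being performed.

```python
def migrate_in_batches(volume_ids, batch_size):
    """Walk the batch-by-batch sequence and return the final session contents.

    Sessions are modeled as sets of (source, target) pairs; waits and the
    migration product's work are only recorded as steps, not performed.
    """
    session_30 = {(f"20a:{v}", f"20b:{v}") for v in volume_ids}  # old pairs
    session_50 = set()                                           # new pairs
    steps = []
    for start in range(0, len(volume_ids), batch_size):
        batch = volume_ids[start:start + batch_size]
        # 1. Add the batch's pairs to replication session 50 and (re)start it.
        session_50 |= {(f"20c:{v}", f"20d:{v}") for v in batch}
        steps.append(("wait_full_duplex", tuple(batch)))
        # 2. Run the migration product and wait for its swap from 20a to 20c.
        steps.append(("wait_migration_swap", tuple(batch)))
        # 3. Remove the old 20a/20b pairs from session 30 and terminate them.
        session_30 -= {(f"20a:{v}", f"20b:{v}") for v in batch}
    return session_30, session_50, steps


remaining, established, steps = migrate_in_batches(
    ["0001", "0002", "0003", "0004", "0005"], batch_size=2
)
```

With five volumes and a batch size of two, the sequence runs three times and incurs two waits per batch, which illustrates why the manual process scales poorly to sessions with thousands of pairs.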
Data may also be migrated to a new storage system without using a data migration product such as TDMF or FDRPAS, for example. However, such data migration processes may result in interruptions to ongoing data replication processes or disaster recovery capabilities. One example of such a migration process may include selecting the source volumes 10 of the storage control unit 20a to be migrated to the new storage control unit 20c and first manually removing from the replication process 30 the copy relationship pairs for the selected volumes 10 in storage control unit 20a and the corresponding volumes 10 of the storage control unit 20b, and terminating those relationship pairs. New copy relationship pairs corresponding to the terminated copy relationship pairs may then be manually reestablished between the new source volumes 10 in the storage control unit 20c and the original target volumes 10 in the storage control unit 20b.
In order to maintain the integrity of the consistency groups of the original replication process 30, these operations would typically be done outside of the scope of the replication process 30, and then added in to a new replication process between the storage control unit 20c and the original target storage control unit 20b once the volumes 10 of the new storage control unit 20c and the original target volumes 10 of the original target storage control unit 20b reach full duplex. Consequently, the user may be exposed to a system outage due to a storage failure while waiting for the migration process to complete.