1. Field of the Invention
The present invention relates to a method, system, and program for using virtual copies in a failover and failback environment.
2. Description of the Related Art
In typical disaster recovery solutions, data is housed at a primary site as well as at one or more secondary sites. These secondary sites maintain a synchronized copy of the data such that a minimum of data is lost in the case of a disaster at the primary site. If a disaster occurs, processing is either “failed-over” to one of the secondary sites or the data is copied from the secondary site back to the primary site. In order for disaster recovery to be effective, the secondary sites are typically geographically distant from the primary site, e.g., in different cities or states, so that both sites are not affected by the same disaster.
Disaster recovery systems typically address two types of failures: a sudden catastrophic failure at a single point in time, or data loss over a period of time. In the second, gradual type of disaster, updates to volumes may be lost. For either type of failure, a copy of data may be available at a remote location. Such dual or shadow copies are typically made as the application system is writing new data to a primary storage device at a primary site.
In mirroring backup systems, data is maintained in volume pairs. A volume pair comprises a volume in a primary storage device and a corresponding volume in a secondary storage device that includes a consistent copy of the data maintained in the primary volume. Typically, the primary volume of the pair is maintained in a primary storage control unit, and the secondary volume of the pair is maintained in a secondary storage control unit at a different physical location than the primary storage control unit. A storage control unit is a physical hardware unit that consists of a storage server integrated with one or more storage devices to provide storage capability to a host computer. A storage server is a physical unit that provides an interface between one or more storage devices and a host computer by providing the function of one or more logical subsystems. The storage server may provide functions that are not provided by the storage device. The storage server is composed of one or more clusters of storage devices. A primary storage control unit may be provided to control access to the primary storage and a secondary storage control unit may be provided to control access to the secondary storage.
When two geographically dispersed server farms are used to remotely mirror data for disaster recovery capability, there arises the performance problem of reestablishing mirroring after one of the sites has been down and has now recovered. In such cases, the bulk of the data at the two sites is identical, with a small portion that was changed at one site and not the other while the other site was down. Historically, to reestablish synchronization between the sites (reestablish mirroring), one site is chosen to be considered current and then all the data is copied to the other site. Due to the amount of data to move, this mirroring operation is a very time-consuming process (on the order of weeks).
Further, while maintaining a mirror copy at a secondary site, the consumer may want to make a virtual copy of the secondary mirror copy and then run production off that secondary virtual copy, in order to practice on the virtual copy and test the operations of the secondary site.
In certain mirroring implementations, one may have secondary volumes at a secondary site mirroring data at primary volumes at a primary site. In such case, during failure, the secondary volumes are used for production and operations, and changes made to the secondary volumes during the failover are recorded. During recovery at the primary volumes, a failback is performed to copy to the primary volumes only the changes made to the secondary volumes after the failover. In such an implementation, the user may create a virtual copy of the secondary volumes and practice on the virtual copies of the secondary volumes and still use the secondary volumes for recovery purposes. In such case, after recovery, the updates are still recovered from the main secondary volumes notwithstanding the virtual copy of the secondary volumes. This implementation requires that the recovery site have two configurations, one for practice and one for recovery. Such a configuration adds to complexity and increases the probability of introducing errors in the event that a recovery operation is required.
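By way of illustration only, the failover and failback behavior described above may be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation from any particular storage product: the `Volume` class, its `changed` set standing in for a change recording structure, and the `failover` and `failback` functions are all hypothetical names introduced for this example, and volumes are modeled as in-memory lists of tracks.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical in-memory volume: a list of tracks plus change recording."""
    tracks: list
    changed: set = field(default_factory=set)  # indices of tracks updated while recording
    recording: bool = False

    def write(self, track: int, data: bytes) -> None:
        self.tracks[track] = data
        if self.recording:
            # Remember which track changed so a later failback can copy only it.
            self.changed.add(track)


def failover(secondary: Volume) -> None:
    """Start using the secondary for production and begin recording changes."""
    secondary.changed.clear()
    secondary.recording = True


def failback(primary: Volume, secondary: Volume) -> int:
    """Copy only the tracks changed since failover back to the primary.

    Returns the number of tracks copied.
    """
    copied = 0
    for track in sorted(secondary.changed):
        primary.tracks[track] = secondary.tracks[track]
        copied += 1
    secondary.changed.clear()
    secondary.recording = False
    return copied


# Usage: the pair starts in sync, the primary fails, and production
# continues at the secondary until the primary recovers.
primary = Volume(tracks=[b"a", b"b", b"c", b"d"])
secondary = Volume(tracks=list(primary.tracks))

failover(secondary)
secondary.write(2, b"X")  # update made while the primary is down

tracks_copied = failback(primary, secondary)
```

In this sketch only the single updated track is transferred during failback, rather than the entire volume, which is the distinction the passage above draws between a failback and a full copy.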
In a still further mirroring implementation, to recover from the virtual copy secondary volume, the user may copy over the entire virtual copy of the secondary volume to the primary volume.
There is a need in the art for continued improvements to the failure and recovery process between primary and secondary sites.