1. Field of the Invention
The present invention relates to a method, system, and program for mirroring data between primary and secondary sites.
2. Description of the Related Art
In typical disaster recovery solutions, data is housed at a primary site as well as at one or more secondary sites. These secondary sites maintain a synchronized copy of the data such that no data is lost in the case of a disaster at the primary site. If a disaster occurs, processing is either “failed-over” to one of the secondary sites or the data is copied from the secondary site back to the primary site. In order for disaster recovery to be effective, the secondary sites are typically geographically distant from the primary site, e.g., in different cities or states, so that the primary and secondary sites are not affected by the same disaster.
Disaster recovery systems typically address two types of failures: a sudden catastrophic failure at a single point in time, or data loss over a period of time. In the second, gradual type of disaster, updates to volumes may be lost. For either type of failure, a copy of data may be available at a remote location. Such dual or shadow copies are typically made as the application system is writing new data to a primary storage device at a primary site.
In mirroring backup systems, data is maintained in volume pairs. A volume pair consists of a volume in a primary storage device and a corresponding volume in a secondary storage device that includes a consistent copy of the data maintained in the primary volume. Typically, the primary volume of the pair is maintained in a primary storage control unit, and the secondary volume of the pair is maintained in a secondary storage control unit at a different physical location than the primary storage control unit. A storage control unit is a physical hardware unit that consists of a storage server integrated with one or more storage devices to provide storage capability to a host computer. A storage server is a physical unit that provides an interface between one or more storage devices and a host computer by providing the function of one or more logical subsystems. The storage server may provide functions that are not provided by the storage device. The storage server is composed of one or more clusters of storage devices. A primary storage control unit may be provided to control access to the primary storage and a secondary storage control unit may be provided to control access to the secondary storage.
When two geographically dispersed server farms are used to remotely mirror data for disaster recovery, a performance problem arises in reestablishing mirroring after one of the sites has been down and has since recovered. In such cases, the bulk of the data at the two sites is identical; only a small portion was changed at one site and not the other while one site was down. Historically, to reestablish synchronization between the sites (i.e., to reestablish mirroring), one site is chosen to be considered current and all of its data is copied to the other site. Because of the amount of data to move, this mirroring operation is very time consuming (on the order of weeks). State of the art practice reduces this time by taking advantage of the fact that most of the data at the two sites is the same: a hash value is calculated on corresponding segments of data at each site. If the hash values are identical, the data can be assumed to be the same and does not need to be copied. This method greatly reduces the resynchronization time, from the order of weeks to a day.
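By way of illustration only, the hash-based resynchronization described above may be sketched as follows. The sketch is a minimal example under stated assumptions and is not the implementation of any particular product: the names `segment_hashes` and `resynchronize`, the fixed segment size, the in-memory byte buffers standing in for volumes, and the choice of SHA-256 as the hash function are all hypothetical and do not come from the description above.

```python
import hashlib

# Assumption: a tiny fixed segment size, chosen only to keep the example readable.
SEGMENT_SIZE = 4


def segment_hashes(volume: bytes, segment_size: int = SEGMENT_SIZE) -> list[bytes]:
    """Return one SHA-256 digest per fixed-size segment of the volume."""
    return [
        hashlib.sha256(volume[i:i + segment_size]).digest()
        for i in range(0, len(volume), segment_size)
    ]


def resynchronize(current: bytes, stale: bytearray,
                  segment_size: int = SEGMENT_SIZE) -> list[int]:
    """Copy only the segments whose hashes differ from `current` into `stale`.

    `current` stands in for the volume at the site chosen as current, and
    `stale` for the corresponding volume at the other site. Returns the
    indices of the segments that were copied.
    """
    if len(current) != len(stale):
        raise ValueError("volumes in a pair are assumed to be the same size")

    current_digests = segment_hashes(current, segment_size)
    stale_digests = segment_hashes(bytes(stale), segment_size)

    copied = []
    for index, (cur, old) in enumerate(zip(current_digests, stale_digests)):
        if cur != old:
            # Hashes differ, so the segment contents differ: copy this segment.
            start = index * segment_size
            stale[start:start + segment_size] = current[start:start + segment_size]
            copied.append(index)
        # Matching hashes: the segment is assumed identical and is skipped.
    return copied
```

In an actual deployment each site would compute hashes over its own data and exchange only the hash values, so that matching segments never cross the link between sites; the sketch computes both sets of hashes locally for brevity. For example, calling `resynchronize(b"AAAABBBBCCCCDDDD", bytearray(b"AAAAXXXXCCCCDDDD"))` copies only the second segment.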
There is a need in the art for continued improvements to the resynchronization process.