1. Field of the Invention
This invention relates in general to data storage systems that use redundant data backup, and more particularly to a method, apparatus and program storage device for maintaining data consistency and cache coherency during communications failures between nodes in a remote mirror pair.
2. Description of Related Art
Due to advances in computer technology, there has been an ever-increasing need for data storage in data processing networks. In a typical data processing network, there has been an increase in the number of volumes of data storage and an increase in the number of hosts needing access to the volumes.
Fortunately for computer users, the cost of data storage has continued to decrease at a rate approximating the increase in need for storage. For example, economical and reliable data storage in a data network can be provided by a storage subsystem. However, as people's reliance upon machine readable data increases, they are more vulnerable to damage caused by data loss. Large institutional users of data processing systems that maintain large volumes of data, such as banks, insurance companies, and stock market traders, must and do take tremendous steps to ensure backup data availability in case of a major disaster. These institutions recently have developed a heightened awareness of the importance of data recovery and backup in view of world events. Consequently, data backup systems have never been more important.
Generally, data backup systems copy a designated group of source data, such as a file, volume, storage device, partition, etc. If the source data is lost, applications can use the backup copy instead of the original, source data. The similarity between the backup copy and the source data may vary, depending upon how often the backup copy is updated to match the source data.
Currently, data processing system users often maintain copies of their valuable data on site, either on removable storage media or in a secondary “mirrored” storage device located on or within the same physical confines of the main storage device. If the backup copy is updated in step with the source data, the copy is said to be a “mirror” of the source data, and is always “consistent” with the source data. Should a disaster such as fire, flood, or inaccessibility to a building occur, however, both the primary data and the secondary or backed-up data will be unavailable to the user. Accordingly, more data processing system users are requiring the remote storage of backup data.
Some competing concerns in data backup systems are cost, speed, and data consistency. Systems that guarantee data consistency often cost more, and operate more slowly. On the other hand, many faster backup systems typically cost less while sacrificing absolute consistency. One conventional technique for recovering backup data involves the maintenance of data in “duplex pairs.” In a duplex pair configuration, each time data is written on a disk or some other storage media, a duplicate copy is written on a backup disk as well.
One example of a data backup system is the Extended Remote Copy (“XRC”) system, sold by International Business Machines Corp (“IBM”). In addition to the usual primary and secondary storage devices, the XRC system uses a “data mover” machine coupled between primary and secondary devices. The data mover performs backup operations by copying data from the primary devices to the secondary devices. Storage operations in the XRC system are “asynchronous,” since primary storage operations are committed to primary storage without regard for whether the corresponding data has been stored in secondary storage.
The secondary devices are guaranteed to be consistent with the state of the primary devices at some specific time in the past. This is because the XRC system time stamps data updates stored in the primary devices, enabling the secondary devices to implement the updates in the same order. Time stamping in the XRC system is done with a timer that is shared among all hosts coupled to primary storage. Since the secondary devices are always consistent with a past state of the primary devices, a limited amount of data is lost if the primary devices fail.
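The asynchronous, time-stamped behavior described above can be illustrated with a minimal sketch. This is not the XRC implementation; the class `AsyncMirror`, its method names, and the counter standing in for the shared timer are all hypothetical, chosen only to show why the secondary always matches some past state of the primary.

```python
import itertools


class AsyncMirror:
    """Toy model: the primary commits at once; a data mover copies later."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = []                  # (timestamp, key, value) not yet copied
        self._clock = itertools.count(1)   # hypothetical stand-in for the shared timer

    def write(self, key, value):
        """Commit to primary storage without waiting for secondary storage."""
        ts = next(self._clock)
        self.primary[key] = value
        self.pending.append((ts, key, value))
        return ts

    def move(self, max_updates):
        """Data mover: apply up to max_updates pending updates in timestamp order."""
        self.pending.sort(key=lambda update: update[0])
        batch = self.pending[:max_updates]
        self.pending = self.pending[max_updates:]
        for _, key, value in batch:
            self.secondary[key] = value
        return len(batch)
```

In this sketch, any updates still in `pending` when the primary fails correspond to the limited amount of data that would be lost.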
A different data backup system is IBM's Peer-to-Peer Remote Copy (“PPRC”) system. The PPRC approach does not use a data mover machine. Instead, storage controllers of primary storage devices are coupled to controllers of counterpart secondary devices by suitable communications links, such as fiber optic cables. The primary storage devices send updates to their corresponding secondary controllers. With PPRC, a data storage operation does not succeed until updates to both primary and secondary devices complete. In contrast to the asynchronous XRC system, PPRC performs “synchronous” backups.
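By way of contrast, a synchronous write can be sketched as follows. This is an illustrative model only, not the PPRC protocol; `SyncMirror`, `LinkDown`, and the `link_up` flag are hypothetical names, and the sketch simplifies by refusing the write before touching either copy when the link is unavailable.

```python
class LinkDown(Exception):
    """Raised when the secondary cannot be reached (hypothetical error type)."""


class SyncMirror:
    """Toy model: a write succeeds only after both copies are updated."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.link_up = True   # assumed flag describing the communications link

    def write(self, key, value):
        # Simplification: refuse the write up front if the secondary is unreachable.
        if not self.link_up:
            raise LinkDown("secondary unreachable; write not acknowledged")
        self.primary[key] = value
        self.secondary[key] = value
        return True           # acknowledged only after both updates complete
```

The trade-off suggested by this sketch is that the two copies never diverge, at the cost of every write waiting on the link.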
In many backup systems, recovery involves a common sequence of operations. First, backup data is used to restore user data to a known state, as of a known date and time. Next, “updates” to the primary storage subsystem that have not been transferred to the secondary storage subsystem are copied from the “log” where they are stored at the primary storage subsystem, and applied to the restored data. The logged updates represent data received after the last backup was made to the secondary storage subsystem, and are usually stored in the same chronological order according to when they were received by the primary storage subsystem. After applying the logged updates, the data is considered to be restored, and the user's application program is permitted to access the restored data.
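The recovery sequence described above may be sketched as follows, assuming the backup is a simple key-to-value mapping and each log entry is a hypothetical `(timestamp, key, value)` tuple; neither representation comes from any particular product.

```python
def recover(backup, log):
    """Restore from a backup copy, then replay logged updates in time order.

    backup: dict holding the data as of the last backup.
    log:    iterable of (timestamp, key, value) entries recorded at the
            primary after that backup (an assumed format).
    """
    restored = dict(backup)   # step 1: restore to a known state
    # Step 2: apply logged updates in chronological order.
    for _, key, value in sorted(log, key=lambda entry: entry[0]):
        restored[key] = value
    return restored           # step 3: data is now considered restored
```

For example, `recover({"x": 1}, [(2, "x", 3), (1, "x", 2)])` yields `{"x": 3}`, because the later update is applied last regardless of its position in the log.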
Although many of the foregoing technologies constitute significant advances, and may even enjoy significant commercial success today, engineers are continually seeking to improve the performance and efficiency of today's data backup systems. One area of possible focus concerns remote mirroring. Remote mirroring provides a large amount of additional data protection above and beyond what is available in a standard RAID configuration. This includes remote copies of a user's data that can be used at a later point to recover from certain types of failures, including complete loss of a controller pair.
In a remote mirror system, the ability to read and write data from either side of the mirror pair is necessary. When reads and writes are allowed from either side simultaneously, the ever-present problem of data integrity becomes an issue. In a distributed memory system of a remote mirror system, there are two cases that need to be solved to ensure that only the most recent data is presented to the host: the first is the double write problem and the second is the read after write problem.
The double write problem occurs when both sides of the mirror pair write data to the same location at the same time (or very close to the same time). The problem is determining which data is actually the latest. This problem may be handled by overwriting the write that gets processed first with the second write, but doing so creates a race condition in which neither write is really the correct one to retain. This problem is best solved at the host level.
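The race described above can be shown with a short sketch in which the surviving data depends only on processing order. The function name and the `(side, data)` tuple format are hypothetical and serve only as illustration.

```python
def apply_in_order(writes):
    """Apply writes to a single location; the last one processed wins.

    writes: list of (side, data) tuples in the order they are processed.
    Returns the data left in the location.
    """
    location = None
    for _side, data in writes:
        location = data   # each write overwrites whatever came before
    return location


# Two near-simultaneous writes to the same location from sides "A" and "B".
result_if_a_first = apply_in_order([("A", "from A"), ("B", "from B")])
result_if_b_first = apply_in_order([("B", "from B"), ("A", "from A")])
```

Here `result_if_a_first` is `"from B"` and `result_if_b_first` is `"from A"`: the outcome reflects arrival order rather than any notion of which write is correct.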
The second case is the read after write problem. A distributed memory architecture of a remote mirror system leads to the read after write problem because of latencies between when a write occurs in one memory and when the write is reflected into the other, remote, memory. This problem is compounded in the event of a communications failure between the remote pair. Further, with remote mirroring, the link may disappear and reappear at any time without either of the controller pairs having actually failed. Nevertheless, during a link outage, both controller pairs continue to operate normally. Thus, the problem becomes one of recovery of the locking information and ensuring that the correct data ends up on the two mirror volume sets when the link is recovered. Because both controller pairs may be operating properly, all data may not reside on the same volume set. Thus, the data that exists on two separate physical entities (two physically separate disk sets) must be synchronized.
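The read after write problem, and its aggravation by a link outage, can be illustrated with a minimal model. The class `MirrorPair`, the `in_flight` queue, and the `link_up` flag are hypothetical; the sketch shows only the stale-read window and the divergence during an outage, not any recovery scheme.

```python
class MirrorPair:
    """Toy model of two memories joined by a link with propagation delay."""

    def __init__(self):
        self.memory = {"a": {}, "b": {}}
        self.in_flight = []    # (target_side, key, value) not yet delivered
        self.link_up = True    # assumed flag describing the link state

    def write(self, side, key, value):
        """Write locally and queue the update for the remote side."""
        remote = "b" if side == "a" else "a"
        self.memory[side][key] = value
        self.in_flight.append((remote, key, value))

    def read(self, side, key):
        """Read from the local memory only; may return stale data."""
        return self.memory[side].get(key)

    def deliver(self):
        """Propagate queued updates if the link is up; return the count delivered."""
        if not self.link_up:
            return 0
        delivered = len(self.in_flight)
        for target, key, value in self.in_flight:
            self.memory[target][key] = value
        self.in_flight.clear()
        return delivered
```

Between `write("a", ...)` and the next successful `deliver()`, a read on side `"b"` returns the old contents. While `link_up` is false, that window stays open indefinitely and the two sides continue to diverge.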
It can be seen that there is a need for a method, apparatus and program storage device for maintaining data consistency and cache coherency during communications failures between nodes in a remote mirror pair.