1. Field of the Invention
The present invention relates to backup data storage. More particularly, the invention concerns a digital data storage system that uses a universal timer to perform asynchronous peer-to-peer data mirroring, in which primary and secondary controllers cooperatively perform periodic consistency checks according to the universal timer.
2. Description of the Related Art
In this information age, there is more data than ever to transmit, receive, process, and store. As people's reliance upon machine-readable data increases, they become more vulnerable to damage caused by data loss. Consequently, data backup systems have never been more important.
Generally, data backup systems copy a designated group of source data, such as a file, volume, storage device, or partition. If the source data is lost, applications can use the backup copy instead of the original source data. The similarity between the backup copy and the source data may vary, depending upon how often the backup copy is updated to match the source data. If the backup copy is updated in step with the source data, the copy is said to be a "mirror" of the source data, and is always "consistent" with the source data.
Some competing concerns in data backup systems are cost, speed, and data consistency. Systems that guarantee data consistency often cost more and operate more slowly. On the other hand, faster backup systems typically cost less but sacrifice absolute consistency.
One example of a data backup system is the Extended Remote Copy ("XRC") system, sold by International Business Machines Corp. In addition to the usual primary and backup storage devices, the XRC system uses a "data mover" machine coupled between primary and backup devices. The data mover performs backup operations by copying data from the primary devices to the secondary devices. Storage operations in the XRC system are "asynchronous," since primary storage operations are committed to primary storage without regard for whether the corresponding data has been stored in secondary storage.
The secondary device is guaranteed to be consistent with the state of the primary device at some specific time in the past. This is because the XRC system time stamps data updates stored in the primary devices, enabling the secondary devices to implement the updates in the same order. Time stamping in the XRC system is done with a timer that is shared among the hosts coupled to primary storage. As an example, the common timer may comprise an IBM Sysplex Timer, P/N 9037-002. Since the secondary device is always consistent with a past state of the primary device, a limited amount of data is lost if the primary device fails.
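The role of time stamps in preserving update order can be illustrated with a minimal sketch. It is not the XRC implementation. The `Update` record, the integer timer readings, and the `apply_in_time_order` helper are hypothetical names introduced only for illustration, and the dictionary merely stands in for a secondary device.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Update:
    # Reading of the shared timer when the primary stored the update
    # (hypothetical integer units; only this field is used for ordering).
    timestamp: int
    address: str = field(compare=False)
    data: bytes = field(compare=False)


def apply_in_time_order(secondary: dict, updates: list) -> dict:
    """Apply updates to the secondary copy in time stamp order."""
    for update in sorted(updates):
        secondary[update.address] = update.data
    return secondary


# Updates may arrive out of order; sorting restores the primary's order.
updates = [
    Update(3, "blk0", b"C"),
    Update(1, "blk0", b"A"),
    Update(2, "blk1", b"B"),
]
secondary = apply_in_time_order({}, updates)
```

Because the updates are replayed in time stamp order, the later write to `blk0` overrides the earlier one, just as it did on the primary device.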
A different data backup system is IBM's Peer-to-Peer Remote Copy ("PPRC") system. The PPRC approach does not use a data mover machine. Instead, storage controllers of primary storage devices are coupled to controllers of counterpart secondary devices by suitable communications links, such as fiber optic cables. The primary storage devices send updates to their corresponding secondary controllers. With PPRC, a data storage operation does not succeed until updates to both primary and secondary devices complete. In contrast to the asynchronous XRC system, PPRC performs "synchronous" backups.
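The difference between synchronous and asynchronous backup can be sketched as follows. This is a simplified model under stated assumptions, not the PPRC or XRC design. The `Mirror` class and its methods are hypothetical, and in-memory dictionaries stand in for the primary and secondary devices.

```python
from collections import deque


class Mirror:
    """Toy model of a primary device with one secondary copy."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = deque()  # updates not yet applied to the secondary

    def write_sync(self, key, value):
        # Synchronous: the write succeeds only after both copies are updated.
        self.primary[key] = value
        self.secondary[key] = value
        return True

    def write_async(self, key, value):
        # Asynchronous: commit to the primary and return at once;
        # the secondary is brought up to date later.
        self.primary[key] = value
        self.pending.append((key, value))
        return True

    def drain(self):
        # Apply queued updates to the secondary in arrival order.
        while self.pending:
            key, value = self.pending.popleft()
            self.secondary[key] = value


mirror = Mirror()
mirror.write_async("a", 1)
lagging = dict(mirror.secondary)  # secondary has not yet seen "a"
mirror.drain()
```

In this sketch, the asynchronous path returns before the secondary is updated, so the secondary temporarily lags the primary. The synchronous path never lags, but every write pays the cost of reaching the secondary.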
Although these systems constitute a significant advance and enjoy widespread commercial success today, the assignee of the present application has continually sought to improve the performance and efficiency of these and other backup systems. Some possible drawbacks of the XRC system include the expense of the data mover, and the lack of complete currency between primary and secondary data storage. Furthermore, any failure of the central data mover is particularly problematic, since this single component is the focal point for all backup operations. In contrast, the PPRC system avoids the expense of the data mover, and the primary and secondary storage devices are completely consistent. However, data backup operations are more time consuming with the PPRC system, since they are synchronous. Moreover, backups take even longer when there is more distance between primary and secondary storage, due to propagation delays in the communications link connecting primary and secondary controllers. For example, fiber optic coupling exhibits a propagation delay of about five microseconds per kilometer of fiber. Electrically conductive materials exhibit a propagation delay of about one nanosecond per foot. This propagation delay is doubled for communications in which the primary and backup systems must send and then acknowledge messages.
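As a worked illustration of the propagation delay discussed above, the following sketch computes the delay for one acknowledged message over fiber, using the figure of about five microseconds per kilometer. The function name and the 100 kilometer distance are assumptions chosen for illustration. The calculation ignores controller processing time and any other source of latency.

```python
# Approximate one-way propagation delay in optical fiber, per the text above.
FIBER_DELAY_US_PER_KM = 5.0


def round_trip_delay_us(distance_km: float) -> float:
    """Delay to send a message and receive its acknowledgment, in microseconds."""
    one_way_us = distance_km * FIBER_DELAY_US_PER_KM
    return 2 * one_way_us


# Hypothetical example: primary and secondary sites 100 km apart.
delay_us = round_trip_delay_us(100)
```

Under these assumptions, each acknowledged message between sites 100 kilometers apart incurs about 1,000 microseconds (one millisecond) of propagation delay alone, which a synchronous system adds to every storage operation.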
Consequently, known storage backup systems are not completely adequate for some applications due to certain unsolved problems.