1. Field of the Invention
This invention relates to digital data storage systems. More particularly, the invention concerns the resynchronization of backup storage to primary storage, ensuring that any updates received during resynchronization are applied in the proper order relative to resynchronization data.
2. Description of the Related Art
In this information era, there is more data than ever to transmit, receive, analyze, and process. Another key data management function is data storage. Most applications demand data storage that is fast, reliable, and convenient. Data storage is especially critical in certain data-intensive businesses. Some examples include automated teller networks and other banking applications, telephone directory information services, investment fund management, and the like.
In many of these businesses, the high cost of data loss warrants maintaining a duplicate copy of the data. Thus, if the primary data is lost, corrupted, or otherwise unavailable, business can seamlessly continue by using the backup data instead of the primary data. One technique for performing data backups is "remote copy," a technique that is implemented in various backup storage systems of International Business Machines Corp. (IBM). With remote copy, changes to data on a primary site are copied ("shadowed") to a secondary site. The secondary site therefore mirrors the primary site. Each site, for example, may include a storage controller and one or more storage devices. Normally, remote copy is implemented by a separate processing machine called a "data mover," coupled to both primary and secondary sites.
If the shadowing stops for some reason, the data on the primary and secondary sites is no longer the same. Shadowing may stop for various reasons, such as interruption of primary/secondary communications, errors occurring at the secondary site, etc. After the problem is corrected, shadowing resumes under a "restart" procedure. At this point, primary data that was changed ("updated") during the shadowing interruption must be copied from the primary site to the secondary site, thereby bringing the secondary site up to date. This process is called "resynchronization."
At first glance, resynchronization appears to be a simple procedure. The un-shadowed changes to the primary site are simply copied over to the secondary site. In practice, resynchronization is more complicated because data storage is actually a dynamic process, and further updates to the primary site often occur during resynchronization. Furthermore, this problem is compounded because the updating and resynchronization processes both occur asynchronously. Accordingly, one danger is that old updates are copied to the secondary site, overwriting more recent data copied during resynchronization. Another danger is that resynchronization data is applied to the secondary site, overwriting newer data already copied during the update process.
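The two dangers described above can be illustrated with a small Python sketch. It is only an illustration of the hazard, not a description of any particular resynchronization procedure. The `Write` record, its `written_at` field, the track number, and the helper functions are all hypothetical names introduced for the example. The sketch assumes that writes are applied to the secondary site strictly in the order they arrive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    track: int       # hypothetical track number
    data: str        # contents to store on the track
    written_at: int  # hypothetical time these contents existed at the primary
    source: str      # "resync" or "update"


def apply_in_arrival_order(secondary, arrivals):
    """Apply writes strictly in arrival order, ignoring written_at."""
    result = dict(secondary)
    for w in arrivals:
        result[w.track] = w  # the last arrival wins, whether or not it is newest
    return result


def newest(arrivals, track):
    """Return the write holding the most recent primary contents for a track."""
    return max((w for w in arrivals if w.track == track),
               key=lambda w: w.written_at)


# Hazard 1: a delayed, older update lands after newer resynchronization data.
hazard_1 = [
    Write(7, "v5", 5, "resync"),
    Write(7, "v3", 3, "update"),
]

# Hazard 2: resynchronization data read earlier lands after a newer update.
hazard_2 = [
    Write(7, "v4", 4, "update"),
    Write(7, "v2", 2, "resync"),
]

for name, arrivals in (("hazard 1", hazard_1), ("hazard 2", hazard_2)):
    final = apply_in_arrival_order({}, arrivals)[7]
    expected = newest(arrivals, 7)
    print(name, "secondary holds", final.data, "but primary holds", expected.data)
```

In both cases the secondary site ends up holding older contents than the primary site, because arrival order alone does not reflect the order in which the data was written.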
If resynchronization is performed improperly, the consequences can be severe. Data may be corrupted or lost, resulting in failed read and write operations. In extreme cases, a read operation might even recall the wrong data.
The foregoing conditions are worsened because of the data mover's independence from the host computers writing new data to the primary site. This arrangement is an advantage in one sense, because the hosts can continually write to the primary site in spite of any interruption in data mirroring. Critical storage-related host functions therefore continue without a hitch. However, this makes the data mover's job even more difficult, because data updates to the primary site arrive continually.
Consequently, due to certain unsolved problems such as those discussed above, known resynchronization procedures are not entirely adequate for all purposes.