1. Field of the Invention
This invention relates to digital data storage systems. More particularly, the invention concerns the resynchronization of backup storage to primary storage, ensuring that any updates received during resynchronization are applied in the proper order relative to resynchronization data.
2. Description of the Related Art
In this information era, there is more data than ever to transmit, receive, analyze, and process. Another key data management function is data storage. Most applications demand data storage that is fast, reliable, and convenient. Data storage is especially critical in certain data-intensive businesses. Some examples include automated teller networks and other banking applications, telephone directory information services, investment fund management, and the like.
In many of these businesses, the high cost of data loss warrants maintaining a duplicate copy of the data. Thus, if the primary data is lost, corrupted, or otherwise unavailable, business can seamlessly continue by using the backup data instead of the primary data. One technique for performing data backups is "remote copy," a technique that is implemented in various backup storage systems of International Business Machines Corp. (IBM). With remote copy, changes to data on a primary site are shadowed to a secondary site. The secondary site therefore mirrors or "shadows" the primary site. Each site, for example, may include a storage controller and one or more storage devices. Normally, remote copy is implemented by a separate processing machine called a "data mover," coupled to both primary and secondary sites.
If the shadowing stops for some reason, the data on the primary and secondary sites is no longer the same. Shadowing may stop for various reasons, such as interruption of primary/secondary communications, errors occurring at the secondary site, etc. After the problem is corrected, shadowing resumes under a "restart" procedure. At this point, primary data that was changed ("updated") during the shadowing interruption must be copied from the primary site to the secondary site, thereby bringing the secondary site up to date. This process is called "resynchronization."
At first glance, resynchronization appears to be a simple procedure: the unshadowed changes to the primary site are simply copied over to the secondary site. In practice, resynchronization is more complicated because data storage is a dynamic process, and further updates to the primary site often occur during resynchronization. This problem is compounded because the updating and resynchronization processes both occur asynchronously. Accordingly, one danger is that old updates are copied to the secondary site, overwriting more recent data copied during resynchronization. Another danger is that resynchronization data is applied to the secondary site, overwriting newer data already copied during the update process.
If resynchronization is performed improperly, the consequences can be severe. Data may be corrupted or lost, resulting in failed read and write operations. In extreme cases, a read operation might even recall the wrong data.
The foregoing conditions are worsened because of the data mover's independence from the host computers writing new data to the primary site. This arrangement is an advantage in one sense, because the hosts can continually write to the primary site in spite of any interruption in data mirroring. Critical storage-related host functions therefore continue without a hitch. However, this makes the data mover's job even more difficult, because data updates to the primary site arrive continually.
Consequently, due to certain unsolved problems such as those discussed above, known resynchronization procedures are not entirely adequate for all purposes.
Broadly, the present invention concerns the resynchronization of backup storage to primary storage, ensuring that any updates received during resynchronization are applied in the proper order relative to resynchronization data. The invention is applied in a data storage system having primary and backup storage each coupled to a data mover. Under normal operations, the data mover mirrors data stored on the primary storage upon the backup storage.
In some cases, error conditions arise preventing proper mirroring of data from the primary site to the backup storage. These conditions include failure of the backup storage, communications failure between the data mover and backup storage, etc. In these situations, the data mover stores any data records received by the storage system in the primary storage without mirroring the data records to the backup storage. The data mover also identifies the tracks that these data records are on in an update map.
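The bookkeeping described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `DataMoverSketch`, the `mirroring` flag, and the use of dictionaries keyed by track number are all assumptions made for illustration.

```python
class DataMoverSketch:
    """Hypothetical stand-in for the data mover (illustrative names only)."""

    def __init__(self):
        self.primary = {}        # track number -> data on primary storage
        self.backup = {}         # track number -> data on backup storage
        self.update_map = set()  # tracks written while mirroring was suspended
        self.mirroring = True    # False while an error condition is in effect

    def write(self, track, data):
        # The primary copy is always written, even during an error condition.
        self.primary[track] = data
        if self.mirroring:
            # Normal operation: shadow the write to backup storage.
            self.backup[track] = data
        else:
            # Error condition: remember which track must be resynchronized.
            self.update_map.add(track)


mover = DataMoverSketch()
mover.write(1, "a")      # mirrored normally
mover.mirroring = False  # simulate a backup or communications failure
mover.write(2, "b")      # stored on primary only and noted in the update map
```

In this sketch the update map is a set of track numbers, so repeated writes to the same track during the error condition leave only a single entry.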
When the error condition ends, the data mover performs a static resynchronization process, which begins by accessing the update map to identify a group of tracks containing new data records received during the error condition. The data mover reads these tracks, and then proceeds to write these read tracks to the backup storage. The data mover also makes an entry in a progress queue, this entry including (1) a group-ID identifying the tracks written to backup storage and (2) a read time-stamp ("RT") identifying the time when the data mover read these tracks from primary storage. The process of identifying, reading, and writing tracks continues until all tracks in the update map have been processed.
Whenever the storage system receives new data records ("updates"), this invokes a dynamic resynchronization process. Advantageously, this process may occur simultaneously with the static resynchronization process, serving to accurately process updates despite ongoing static resynchronization. First, the dynamic resynchronization process determines whether the static resynchronization process is ongoing. If not, the updates are written to primary storage and the data mover mirrors the written updates to backup storage, as in normal circumstances.
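The static resynchronization loop can be sketched in Python as follows. This is a simplified, single-threaded illustration under stated assumptions: the names `ProgressEntry` and `static_resync`, the fixed `group_size`, and the counter used as a clock are all hypothetical, and each progress entry here carries the track numbers alongside the group-ID for convenience.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class ProgressEntry:
    group_id: int     # identifies the group of tracks written to backup
    tracks: tuple     # track numbers in the group (kept here for convenience)
    read_time: int    # read time-stamp (RT): when the group was read


def static_resync(primary, backup, update_map, group_size=2, clock=None):
    """Copy every track named in update_map from primary to backup, by group."""
    clock = clock or count(1).__next__   # assumed logical clock: 1, 2, 3, ...
    progress_queue = []
    pending = sorted(update_map)
    for group_id, start in enumerate(range(0, len(pending), group_size)):
        group = tuple(pending[start:start + group_size])
        read_time = clock()                          # time of the read
        snapshot = {t: primary[t] for t in group}    # read tracks from primary
        backup.update(snapshot)                      # write them to backup
        progress_queue.append(ProgressEntry(group_id, group, read_time))
    return progress_queue


primary = {1: "a", 2: "b", 3: "c"}
backup = {}
queue = static_resync(primary, backup, update_map={1, 2, 3})
```

With three tracks and a group size of two, the sketch produces two progress entries, one per group, each stamped with the logical time at which its tracks were read.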
However, if static resynchronization is underway, the dynamic resynchronization process determines whether the update is already identified in the update map. If not, this record is not the subject of static resynchronization, and it can be immediately written to backup storage.
On the other hand, if the current update is already represented in the update map, care is needed to ensure that the dynamic and static resynchronization processes occur in the proper relative order, to avoid writing older data over newer data. Accordingly, a determination is first made whether the update corresponds to any of the tracks present in the progress queue. If not, there is a danger that the static and dynamic resynchronization processes might apply their data in the wrong order. In this event, dynamic resynchronization waits until the data record is shown in the progress queue.
Once the update is represented in the queue, a comparison is made between the data record's read time-stamp and its write time-stamp. The write time-stamp shows when a host originally sent the data record to the primary controller for writing. If the write time-stamp is earlier than the read time-stamp, the update will already be included in the static resynchronization. On the other hand, if the write time-stamp is later than the read time-stamp, this is a new update not included in static resynchronization; thus, the dynamic resynchronization process applies it to the backup storage.
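The decision logic of the dynamic resynchronization process can be summarized in a short Python sketch. The function name, the string return values, and the `read_times` dictionary (track number to read time-stamp, standing in for a lookup in the progress queue) are assumptions for illustration. The text does not say how equal time-stamps are handled; the sketch assumes such an update is applied.

```python
def dynamic_resync_action(track, write_time, static_running, update_map, read_times):
    """Return the action for one incoming update (hypothetical labels).

    read_times maps a track number to the read time-stamp (RT) recorded in
    the progress queue; a missing key means the track has not been read yet.
    """
    if not static_running:
        return "mirror"   # normal operation: mirror the update to backup
    if track not in update_map:
        return "apply"    # track is not subject to static resynchronization
    if track not in read_times:
        return "wait"     # hold until the track appears in the progress queue
    if write_time < read_times[track]:
        return "skip"     # already captured by the static copy of the track
    return "apply"        # newer than the static copy (ties assumed to apply)


update_map = {7, 8}
read_times = {7: 100}     # track 7 was read at time 100; track 8 not yet read

results = [
    dynamic_resync_action(5, 90, False, update_map, read_times),
    dynamic_resync_action(5, 90, True, update_map, read_times),
    dynamic_resync_action(8, 90, True, update_map, read_times),
    dynamic_resync_action(7, 90, True, update_map, read_times),
    dynamic_resync_action(7, 110, True, update_map, read_times),
]
```

A caller that receives "wait" would retry once the static process has added the track's group to the progress queue.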
Accordingly, one embodiment of the invention involves a method to resynchronize backup storage to primary storage, ensuring that any updates received during resynchronization are applied in the proper order relative to resynchronization data. In another embodiment, the invention may be implemented to provide an apparatus, such as a data storage system, programmed to resynchronize backup storage to primary storage, ensuring that any updates received during resynchronization are applied in the proper order relative to resynchronization data. In still another embodiment, the invention may be implemented to provide a signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital data processing apparatus to perform method steps for resynchronizing backup storage to primary storage, ensuring that any updates received during resynchronization are applied in the proper order relative to resynchronization data.
The invention affords its users a number of distinct advantages. Chiefly, the invention preserves data integrity by maintaining the order of storage operations, despite the receipt of data updates during resynchronization. This helps avoid overwriting newer data with older data. Additionally, the invention helps preserve the smooth storage of data from the user's perspective, despite temporary unavailability of backup storage. The invention also provides a number of other advantages and benefits, which should be apparent from the following description of the invention.