1. Field of the Invention
The present invention relates to quickly synchronizing selected data, such as a single file, from two or more versions of the data stored on different storage volumes.
2. Description of the Related Art
Information drives business. A disaster affecting a data center can cause days or even weeks of unplanned downtime and data loss that could threaten an organization's productivity. For businesses that increasingly depend on data and information for their day-to-day operations, this unplanned downtime can also hurt their reputations and bottom lines. Businesses are becoming increasingly aware of these costs and are taking measures to plan for and recover from disasters.
Often these measures include protecting primary, or production, data, which is ‘live’ data used for operation of the business. Copies of primary data on different physical storage devices, and often at remote locations, are made to ensure that a version of the primary data is consistently and continuously available. Typical uses of copies of primary data include backup, Decision Support Systems (DSS) data extraction and reports, testing, and trial failover (i.e., testing failure of hardware or software and resuming operations of the hardware or software on a second set of hardware or software). These copies of data are preferably updated as often as possible so that the copies can be used in the event that primary data are corrupted, lost, or otherwise need to be restored.
Two areas of concern when a hardware or software failure occurs, as well as during the subsequent recovery, are preventing data loss and maintaining data consistency between primary and backup data storage areas. One simple strategy includes backing up data onto a storage medium such as a tape, with copies stored in an offsite vault. Duplicate copies of backup tapes may be stored onsite and offsite. However, recovering data from backup tapes requires sequentially reading the tapes. Recovering large amounts of data can take weeks or even months, which can be unacceptable in today's 24×7 business environment.
More robust, but more complex, solutions include mirroring data from a primary data storage area to a backup, or “mirror,” storage area in real-time as updates are made to the primary data. FIG. 1A provides an example of a storage environment 100 in which data 110 are mirrored. Computer system 102 processes instructions or transactions to perform updates, such as update 104A, to data 110 residing on data storage area 112.
A data storage area may take the form of one or more physical devices, such as one or more dynamic or static random access storage devices, one or more magnetic or optical data storage disks, or one or more other types of storage devices. With respect to backup copies of primary data, preferably the storage devices of a volume are direct access storage devices such as disks rather than sequential access storage devices such as tapes.
In FIG. 1A, two mirrors of data 110 are maintained, and corresponding updates are made to mirrors 120A and 120B when an update, such as update 104A, is made to data 110. For example, update 104B is made to mirror 120A residing on mirror data storage area 122, and corresponding update 104C is made to mirror 120B residing on mirror data storage area 124 when update 104A is made to data 110. Each mirror should reside on a separate physical storage device from the data for which the mirror serves as a backup, and therefore, data storage areas 112, 122, and 124 correspond to three physical storage devices in this example.
A snapshot of data can be made by “detaching” a mirror of the data so that the mirror is no longer being updated. FIG. 1B shows storage environment 100 after detaching mirror 120B. Detached mirror 120B serves as a snapshot of data 110 as it appeared at the point in time that mirror 120B was detached. When another update 106A is made to data 110, a corresponding update 106B is made to mirror 120A. However, no update is made to detached mirror 120B.
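The mirroring and detaching behavior described for FIGS. 1A and 1B can be sketched in a few lines of Python. This is a minimal illustration only: the class `MirroredStorage` and its methods are hypothetical names chosen for this sketch, the storage areas are modeled as in-memory byte arrays, and the mirror names simply reuse the reference numerals from the figures.

```python
class MirroredStorage:
    """Toy model of a primary data area with attached mirrors (hypothetical)."""

    def __init__(self, size, mirror_names):
        self.data = bytearray(size)
        self.attached = {name: bytearray(size) for name in mirror_names}
        self.detached = {}

    def write(self, offset, payload):
        """Apply an update to the primary data and to every attached mirror."""
        end = offset + len(payload)
        if offset < 0 or end > len(self.data):
            raise ValueError("update falls outside the data storage area")
        self.data[offset:end] = payload
        for mirror in self.attached.values():
            mirror[offset:end] = payload

    def detach(self, name):
        """Stop updating a mirror; it becomes a point-in-time snapshot."""
        self.detached[name] = self.attached.pop(name)
        return self.detached[name]


env = MirroredStorage(8, ["120A", "120B"])
env.write(0, b"AB")            # analogous to updates 104A, 104B, 104C
snapshot = env.detach("120B")  # mirror 120B becomes a snapshot
env.write(2, b"CD")            # analogous to updates 106A, 106B
```

After the second write, the primary data and mirror 120A both contain the later update, while the detached mirror still holds the data as it appeared at the time of detachment.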
Saving backup copies or snapshots on mirrored direct access storage devices, rather than on sequential access storage devices, helps to speed synchronization of a snapshot with the data from which the snapshot was made. However, copying all data from snapshots can be unacceptably time-consuming when dealing with very large volumes of data, such as terabytes of data. Copying only individual files is possible using file copying utilities such as xcopy, but these utilities do not operate on selected portions of a file. For example, if only one bit has changed in a file containing one gigabyte of data, then a file copy utility must copy the entire gigabyte of data to capture the change, which is also very time-consuming. A faster way to restore and/or synchronize selected data from large volumes of data and/or files is needed.
One solution to the problem of restoring data from a snapshot is to save the changes made to the data after the snapshot was taken. Those changes can then be applied in either direction. For example, the changes can be applied to the snapshot when there is a need for the snapshot to reflect the current state of the data. Referring back to FIG. 1B, after update 106A is made to data 110, detached mirror (snapshot) 120B is no longer “synchronized” with data 110. To be synchronized with data 110, detached mirror (snapshot) 120B can be updated by applying the change made in update 106A.
Alternatively, to return to a previous state of the data before update 106A was made, the changed portion of data 110 can be restored from (copied from) detached mirror (snapshot) 120B. The change made in update 106A is thereby “backed out” without copying all of the data from the snapshot.
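The two directions of synchronization described above can be illustrated with a short sketch. It assumes, purely for illustration, that each saved change is recorded as an offset plus the new bytes; the function names `record_update`, `roll_forward`, and `back_out` are hypothetical and do not come from any particular product.

```python
def record_update(data, change_log, offset, payload):
    """Apply an update to the data and save the change (offset, new bytes)."""
    data[offset:offset + len(payload)] = payload
    change_log.append((offset, bytes(payload)))


def roll_forward(snapshot, change_log):
    """Bring the snapshot up to date by replaying saved changes in order."""
    for offset, new_bytes in change_log:
        snapshot[offset:offset + len(new_bytes)] = new_bytes


def back_out(data, snapshot, change_log):
    """Undo saved changes by copying only the changed ranges from the snapshot."""
    for offset, new_bytes in change_log:
        end = offset + len(new_bytes)
        data[offset:end] = snapshot[offset:end]


data = bytearray(b"original")
snapshot = bytearray(data)                    # point-in-time copy of the data
change_log = []
record_update(data, change_log, 0, b"ORIG")   # plays the role of update 106A

refreshed = bytearray(snapshot)
roll_forward(refreshed, change_log)           # snapshot now matches the data

restored = bytearray(data)
back_out(restored, snapshot, change_log)      # data returns to snapshot state
```

In both directions only the changed byte range is touched, which is the point of saving the changes rather than copying the entire snapshot.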
Saving the actual changes made to very large volumes of data can be problematic, however, introducing additional storage requirements. To save physical disk space, changes can be stored in temporary data storage areas such as volatile memory, but those changes are vulnerable to computer system, hardware, and software failures. In addition, storing the changes in temporary data storage areas typically requires that the snapshot and the data are stored in a common physical storage area that can be accessed by a common volatile memory. A requirement that the snapshot and the data be stored in a common physical data storage area can limit the number of snapshots that can be made of the data in organizations having limited resources or a very large amount of data.
One solution that avoids storing all of the actual changes to data is to keep track of regions in each storage area that have changed with respect to regions of another storage area storing a copy of the data. One way to keep track of changed regions is to use bitmaps, also referred to herein as data change maps or maps, with the storage areas (volumes) divided into regions and each bit in the bitmap corresponding to a particular region of the storage area (volume). Each bit is set to logical 1 (one) if a change to the data in the respective region has been made with respect to a snapshot of the data. If the data have not changed since the snapshot was made, the respective bit is set to logical 0 (zero).
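A data change map of this kind can be sketched as follows. The class name `DataChangeMap`, the fixed region size, and the zero-based region indexes are assumptions made for this illustration; an actual implementation would likely pack the bits more compactly than a Python list.

```python
class DataChangeMap:
    """One bit per region: 1 if the region changed since the snapshot."""

    def __init__(self, num_regions, region_size):
        self.num_regions = num_regions
        self.region_size = region_size      # bytes per region (assumed fixed)
        self.bits = [0] * num_regions       # all regions start unchanged

    def mark_write(self, offset, length):
        """Set the bit of every region touched by a write of `length` bytes."""
        if length <= 0:
            return
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        if offset < 0 or last >= self.num_regions:
            raise ValueError("write falls outside the tracked volume")
        for region in range(first, last + 1):
            self.bits[region] = 1

    def changed_regions(self):
        """Return the zero-based indexes of regions whose bit is set."""
        return [i for i, bit in enumerate(self.bits) if bit]


dcm = DataChangeMap(num_regions=8, region_size=4)
dcm.mark_write(offset=5, length=2)    # bytes 5..6 lie inside region 1
dcm.mark_write(offset=14, length=4)   # bytes 14..17 span regions 3 and 4
changed = dcm.changed_regions()
```

Note that the map records only where a change occurred, not what the change was, so its size depends on the number of regions rather than on the volume of updates.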
FIG. 2 shows an example of primary data at two points in time, where primary data 210A represents the primary data as it appeared at time A and primary data 210B represents the primary data as it appeared at time B (time B being later than time A). Also shown is a corresponding data change map 220 at time B showing eight regions of the primary data for explanation purposes. As shown in data change map 220, the primary data in regions 2, 3, and 7 changed between times A and B. Assume that a snapshot of the data is taken at time A. If the primary data are later corrupted, then the primary data can be restored back to the state of the data at the time the snapshot was taken. This restoration can be accomplished by copying regions 2, 3, and 7 (identified as the regions having a value of 1 in the data change map) from the snapshot to the primary data. Alternatively, to bring the snapshot up to date, regions 2, 3, and 7 can be copied from the primary data 210B at time B to the snapshot. This solution enables the two copies of the data to be synchronized without copying all data (such as all data in a very large file) from one set of data to the other.
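The restore and refresh operations described for FIG. 2 can be sketched as below. The sketch assumes that the eight regions are numbered starting at 1, that each region holds two bytes, and that the byte contents are arbitrary placeholders; none of these details is taken from the figure itself, and the function name `copy_changed_regions` is hypothetical.

```python
def copy_changed_regions(source, target, change_map, region_size):
    """Copy only regions whose bit is 1 from source to target.

    Returns the number of regions copied.
    """
    copied = 0
    for index, bit in enumerate(change_map):
        if bit:
            start = index * region_size
            end = start + region_size
            target[start:end] = source[start:end]
            copied += 1
    return copied


REGION_SIZE = 2                               # assumed bytes per region
snapshot = bytearray(b"aabbccddeeffgghh")     # state at time A
primary = bytearray(b"aaBBCCddeeffGGhh")      # state at time B
change_map = [0, 1, 1, 0, 0, 0, 1, 0]         # regions 2, 3, and 7 (1-based)

# Restore: back out the changes by copying from the snapshot.
restored = bytearray(primary)
regions_restored = copy_changed_regions(snapshot, restored, change_map, REGION_SIZE)

# Refresh: bring a copy of the snapshot up to date from the primary data.
refreshed = bytearray(snapshot)
regions_refreshed = copy_changed_regions(primary, refreshed, change_map, REGION_SIZE)
```

In either direction three of the eight regions are copied, and the remaining five are left untouched.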
However, this form of data change tracking operates upon regions of the storage volume rather than on logical organizations of the data, such as a selected file. All changed regions of the storage volumes are synchronized using the data change map described above. Because portions of a selected file may be scattered among multiple regions on the storage volume, the data change tracking solution does not provide for selectively synchronizing changed portions of a logical set of data, such as changed portions of a single file, on different volumes.
Such a limitation becomes problematic when very large files are involved. For example, assume that only one of a set of twenty large files on the volume is corrupted. Using the data change map described above, all changed regions containing portions of any of the twenty large files are synchronized. Furthermore, changes made to files that were not corrupted are “backed out” unnecessarily, and those files are unavailable for use during synchronization. For example, if the files contain databases, all databases stored in the changed regions of the volume would be unavailable during the time required to synchronize the data. These databases would have to be taken offline and brought back online, and logs of transactions occurring during the time the databases were offline would need to be applied to each database. Additional processing of files that are not corrupted greatly slows the synchronization process and wastes resources.
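The limitation can be made concrete with a small sketch. The file names, the mapping of files to volume regions, and the change map below are all hypothetical values invented for illustration; the sketch only shows which files a volume-level synchronization would touch when a single file is corrupted.

```python
def files_touched_by_volume_sync(file_regions, change_map):
    """Return names of files having at least one changed region."""
    return sorted(
        name
        for name, regions in file_regions.items()
        if any(change_map[r] for r in regions)
    )


# Hypothetical layout: zero-based volume regions holding pieces of each file.
file_regions = {
    "file_a": [0, 5],
    "file_b": [1, 2],
    "file_c": [3, 6],
    "file_d": [4, 7],
}
change_map = [0, 1, 1, 0, 0, 1, 1, 0]    # regions 1, 2, 5, and 6 changed
corrupted = "file_b"

# A volume-level restore copies every changed region, whatever file owns it.
synced_regions = [r for r, bit in enumerate(change_map) if bit]
touched = files_touched_by_volume_sync(file_regions, change_map)

# Changed regions that do not belong to the corrupted file are backed out
# even though their files were never corrupted.
unnecessary = [r for r in synced_regions if r not in file_regions[corrupted]]
```

Here only `file_b` is corrupted, yet the changes to `file_a` and `file_c` in regions 5 and 6 are also backed out, because the data change map has no notion of which file a region belongs to.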
What is needed is the ability to synchronize only selected data, such as changed portions of a single file or other logical set of data, from two or more versions of the data stored in different storage areas. Preferably, the solution should enable the selected data to be synchronized with a snapshot of the data stored in different storage areas without copying all of the data. The solution should have minimal impact on the performance of applications using data that have one or more snapshots. The solution should enable other data stored in the storage areas to remain available for use and to retain changes made if the other data are not part of the selected data being synchronized.