Information drives business. A disaster affecting a data center can cause days or even weeks of unplanned downtime and data loss that could threaten an organization's productivity. For businesses that increasingly depend on data and information for their day-to-day operations, this unplanned downtime can also hurt their reputations and bottom lines. Businesses are becoming increasingly aware of these costs and are taking measures to plan for and recover from disasters. Often these measures include protecting primary, or production, data, which is ‘live’ data used for operation of the business. Copies of primary data on different physical storage devices, and often at remote locations, are made to ensure that a version of the primary data is consistently and continuously available. Typical uses of copies of primary data include backup, Decision Support Systems (DSS) data extraction and reports, testing, and trial failover (i.e., testing failure of hardware or software and resuming operations of the hardware or software on a second set of hardware or software). These copies of data are preferably updated as often as possible so that the copies can be used in the event that primary data are corrupted, lost, or otherwise need to be restored.
Two areas of concern when a hardware or software failure occurs, as well as during the subsequent recovery, are preventing data loss and maintaining data consistency between primary and backup data storage areas. One simple strategy is to back up data onto a storage medium such as tape, with copies stored in an offsite vault. Duplicate copies of backup tapes may be stored onsite and offsite. However, recovering data from backup tapes requires reading the tapes sequentially, so recovering large amounts of data can take weeks or even months, which can be unacceptable in today's 24×7 business environment.
More robust, but more complex, solutions include mirroring data from a primary data storage area to a backup, or “mirror,” storage area in real-time as updates are made to the primary data. FIG. 1A provides an example of a storage environment 100 in which data 110 are mirrored. Computer system 102 processes instructions or transactions to perform updates, such as update 104A, to data 110 residing on data storage area 112.
A data storage area may take the form of one or more physical devices, such as one or more dynamic or static random access storage devices, one or more magnetic or optical data storage disks, or one or more other types of storage devices. For backup copies of primary data, the storage devices of a volume are preferably direct access storage devices such as disks rather than sequential access storage devices such as tapes.
In FIG. 1A, two mirrors of data 110 are maintained, and corresponding updates are made to mirrors 120A and 120B when an update, such as update 104A, is made to data 110. For example, update 104B is made to mirror 120A residing on mirror data storage area 122, and corresponding update 104C is made to mirror 120B residing on mirror data storage area 124 when update 104A is made to data 110. As mentioned earlier, each mirror should reside on a separate physical storage device from the data for which the mirror serves as a backup, and therefore, data storage areas 112, 122, and 124 correspond to three physical storage devices in this example.
A snapshot of data can be made by “detaching” a mirror of the data so that the mirror is no longer being updated. FIG. 1B shows storage environment 100 after detaching mirror 120B. Detached mirror 120B serves as a snapshot of data 110 as it appeared at the point in time that mirror 120B was detached. When another update 106A is made to data 110, a corresponding update 106B is made to mirror 120A. However, no update is made to detached mirror 120B.
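The mirroring and detaching behavior described for FIGS. 1A and 1B can be illustrated with a minimal sketch. The sketch assumes a purely in-memory model in which Python lists stand in for physical storage devices; the class name `MirroredVolume`, its methods, and the mirror names are hypothetical and do not correspond to any particular product or to elements of the figures.

```python
# Hypothetical in-memory model: each list of blocks stands in for a
# separate physical storage device. Not a real volume manager API.
class MirroredVolume:
    def __init__(self, num_blocks, num_mirrors):
        self.data = [b""] * num_blocks
        self.mirrors = {
            f"mirror{i}": [b""] * num_blocks for i in range(num_mirrors)
        }

    def write(self, block, value):
        """Apply an update to the primary data and to every attached mirror."""
        self.data[block] = value
        for mirror in self.mirrors.values():
            mirror[block] = value

    def detach(self, name):
        """Detach a mirror so it stops receiving updates.

        The detached mirror is returned and serves as a point-in-time
        snapshot of the data.
        """
        return self.mirrors.pop(name)


volume = MirroredVolume(num_blocks=4, num_mirrors=2)
volume.write(0, b"first update")    # reaches the data and both mirrors
snapshot = volume.detach("mirror1")
volume.write(1, b"second update")   # reaches the data and the remaining mirror only
```

After the second write, `snapshot` still holds the data as it appeared when the mirror was detached, while the remaining attached mirror continues to track the primary data.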
Saving backup copies or snapshots on mirrored direct access storage devices, rather than on sequential access storage devices, helps to speed synchronization of a snapshot with the data from which the snapshot was made. However, copying all data from snapshots can be unacceptably time-consuming when dealing with very large volumes of data, such as terabytes. A faster way to restore and/or synchronize large volumes of data is needed.
One solution to the problem of restoring data from a snapshot is to save the changes made to the data after the snapshot was taken. Those changes can then be applied in either direction. For example, the changes can be applied to the snapshot when there is a need for the snapshot to reflect the current state of the data. Referring back to FIG. 1B, after update 106A is made to data 110, detached mirror (snapshot) 120B is no longer “synchronized” with data 110. To be synchronized with data 110, detached mirror (snapshot) 120B can be updated by applying the change made in update 106A.
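One way to picture bringing a snapshot up to date from saved changes is the sketch below. It assumes that each change is saved as a (block index, new value) pair in an ordered log; the function names `write_and_log` and `synchronize_snapshot` and the list-based storage are hypothetical and chosen only for illustration.

```python
def write_and_log(data, change_log, block, value):
    """Update the primary data and save the change for later replay."""
    data[block] = value
    change_log.append((block, value))


def synchronize_snapshot(snapshot, change_log):
    """Replay the saved changes, in order, against the snapshot.

    Only the logged blocks are written; the rest of the snapshot is
    left untouched, so no full copy of the data is needed.
    """
    for block, value in change_log:
        snapshot[block] = value
    change_log.clear()


data = ["a", "b", "c", "d"]
snapshot = list(data)   # point-in-time copy, like a detached mirror
change_log = []

write_and_log(data, change_log, 2, "C2")     # the snapshot is now stale
synchronize_snapshot(snapshot, change_log)   # the snapshot matches the data again
```

Note that this sketch stores the changed values themselves, so the log grows with the volume of updates made after the snapshot was taken.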
Alternatively, to return to a previous state of the data before update 106A was made, the changed portion of data 110 can be restored from (copied from) detached mirror (snapshot) 120B. The change made in update 106A is thereby “backed out” without copying all of the data from the snapshot.
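Backing out a change in this way can be sketched as follows. The sketch assumes that the indices of blocks changed since the snapshot was taken are recorded in a set; that bookkeeping, the function name `restore_changed_blocks`, and the list-based storage are hypothetical illustrations rather than a described implementation.

```python
def restore_changed_blocks(data, snapshot, changed_blocks):
    """Copy only the changed blocks from the snapshot back to the data.

    Returns the number of blocks copied, which is the size of the
    changed set rather than the size of the whole data set.
    """
    copied = 0
    for block in sorted(changed_blocks):
        data[block] = snapshot[block]
        copied += 1
    changed_blocks.clear()
    return copied


data = ["a", "b", "c", "d"]
snapshot = list(data)   # point-in-time copy, like a detached mirror
changed_blocks = set()

data[1] = "B2"          # an update made after the snapshot was taken
changed_blocks.add(1)

copied = restore_changed_blocks(data, snapshot, changed_blocks)
```

Here a single block is copied to return the data to its earlier state, even though the data set contains four blocks.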
Saving the actual changes made to very large volumes of data can be problematic, however, because doing so introduces additional storage requirements. To save physical disk space, changes can be stored in temporary data storage areas such as volatile memory, but those changes are then vulnerable to computer system, hardware, and software failures. In addition, storing the changes in temporary data storage areas typically requires that the snapshot and the data be stored in a common physical storage area that can be accessed by a common volatile memory. A requirement that the snapshot and the data be stored in a common physical data storage area can limit the number of snapshots that can be made of the data in organizations having limited resources or a very large amount of data. Furthermore, many applications suffer severe performance problems when more than one snapshot of a set of data is made, because of the overhead involved in writing the data to multiple places.
What is needed is the ability to quickly synchronize a snapshot with the data from which the snapshot was taken, to provide continuous access to critical data. Preferably, the solution should enable data to be synchronized with a snapshot of the data without copying all of the data. Changes to the data should survive computer system, hardware, and software failures and require minimal storage space. The solution should have minimal impact on the performance of applications that use data having one or more snapshots. In addition, the solution preferably enables snapshot data to be located separately from, and independently of, the location of the primary data.