In view of the business significance of stored data, IT organizations face the challenge of providing data protection and data recovery with the highest data integrity. Two primary techniques enabling data recovery are mirroring technology and snapshot technology. In the event of a system failure (e.g., hardware failure, corrupting event, etc.), data recovery allows an enterprise to recover data from a prior point in time and to consistently resume operations. The problems of identifying and implementing points in time when the stored data is known to be consistent have been recognized in the prior art, and various systems have been developed to provide a solution, for example:
U.S. Pat. No. 5,341,493 (Yanai et al.) discloses a disk storage system including means for assuring completion of data writes to a data storage disk upon failure of electrical power. The system receives data write commands including data to be written to one or more storage disks, and a temporary memory storage temporarily stores those write commands. The system generates a disk write pending indicator associated with each datum to be written to the one or more storage disks, indicating that the data stored in the temporary memory must be written to disk. A disk director searches the temporary memory storage for data that must be written to disk, as indicated by the associated write pending indicator, and writes that data to the one or more storage disks. Also included are means for providing electrical power to the one or more storage disks, the temporary memory storage and the disk director upon the failure of main electrical power, thereby assuring that write commands stored in the temporary memory storage are completed to the one or more storage disks.
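The write-pending mechanism summarized above can be illustrated with a minimal sketch. It is not taken from the cited patent: the class `WritePendingCache`, its method names, and the use of dictionaries to stand in for temporary memory and disks are all assumptions made for illustration.

```python
class WritePendingCache:
    """Hypothetical temporary memory that holds writes until they reach disk."""

    def __init__(self):
        self.slots = {}  # block address -> (data, write-pending flag)
        self.disk = {}   # stands in for the storage disks (assumption)

    def write(self, address, data):
        # Accept the host write into temporary memory and mark it pending.
        self.slots[address] = (data, True)

    def destage(self):
        # Disk-director role: search for pending entries and write them to disk.
        flushed = 0
        for address, (data, pending) in list(self.slots.items()):
            if pending:
                self.disk[address] = data
                self.slots[address] = (data, False)  # clear the indicator
                flushed += 1
        return flushed

    def on_power_failure(self):
        # Assumed to run on backup power: complete every pending write.
        return self.destage()
```

In this sketch a power failure simply triggers the same scan that normal destaging uses, so no separate recovery path is needed.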
U.S. Pat. No. 7,644,300 (Rao) discloses a method for re-synchronizing a first copy of data on a first storage system from a second copy of the data on a second storage system, which includes, at a regular checkpoint interval, the first storage system pushing data in its cache that were modified prior to a checkpoint time to its nonvolatile storage and saving the checkpoint time to its nonvolatile storage. The method further includes, at a regular snapshot interval greater than the checkpoint interval, the second storage system taking snapshots of the second copy of the data. When the first storage system has an uncontrolled shutdown, the second storage system determines the snapshot closest in time to the last checkpoint time and sends the difference between the last snapshot and the second copy of the data to the first storage system to recover data lost during the uncontrolled shutdown.
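The snapshot selection and difference computation described above might be sketched as follows. The function names and the dictionary-based data layout are hypothetical, and the sketch interprets "closest in time" as the most recent snapshot taken at or before the last checkpoint, which is an assumption rather than something stated in the summary.

```python
def select_snapshot(snapshots, checkpoint_time):
    """Pick a snapshot time for resynchronization.

    snapshots: dict mapping snapshot time -> {block address: data}.
    Assumption: use the latest snapshot taken at or before the checkpoint.
    """
    eligible = [t for t in snapshots if t <= checkpoint_time]
    return max(eligible) if eligible else None


def blocks_to_resend(snapshot, current_copy):
    # Blocks of the second copy that differ from (or are missing in) the snapshot.
    return {addr: data for addr, data in current_copy.items()
            if snapshot.get(addr) != data}
```

Only the blocks returned by `blocks_to_resend` would need to travel to the first storage system, rather than the full copy.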
US Patent Application No. 2004/010663 (Prabhu) discloses a method for conducting check-pointing within a write-back cache having a cache memory with at least two memory banks. In one embodiment, a first pointer is set to indicate which cache entry of the at least two memory banks contains current data. A second pointer is set to indicate which cache entry of the at least two memory banks contains checkpoint data. Check-pointing is performed by selectively controlling said second pointer or said first pointer.
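A minimal sketch of two-pointer check-pointing for a single cache entry backed by two banks is given below. The class `TwoBankEntry` and its methods are hypothetical, and the policy of diverting writes away from the checkpoint bank is an assumption about how the pointers could be controlled.

```python
class TwoBankEntry:
    """Hypothetical cache entry backed by two memory banks."""

    def __init__(self, initial):
        self.banks = [initial, initial]
        self.current = 0     # first pointer: bank holding current data
        self.checkpoint = 0  # second pointer: bank holding checkpoint data

    def write(self, value):
        # Never overwrite the checkpoint bank; divert the write to the other bank.
        if self.current == self.checkpoint:
            self.current = 1 - self.checkpoint
        self.banks[self.current] = value

    def read(self):
        return self.banks[self.current]

    def take_checkpoint(self):
        # Check-pointing moves a pointer; no data is copied.
        self.checkpoint = self.current

    def rollback(self):
        # Discard changes made since the last checkpoint.
        self.current = self.checkpoint
```

Because both taking a checkpoint and rolling back are pointer updates, their cost in this sketch does not depend on the size of the cached data.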
US Patent Application No. 2005/138283 (Gamey) discloses a method comprising preserving the coherency of a disk cache within a system external to a disk drive by sequentially writing dirty cache lines comprising said disk cache to a sequential region on said disk drive upon indication of a shutdown of said system; and subsequently restoring the coherency of said disk cache by sequentially reading previously written dirty cache lines from said sequential region on said disk drive.
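The save-and-restore cycle described above can be sketched with a file-like object standing in for the sequential region on the disk drive. The record layout (an 8-byte address and a 4-byte length before each payload) and the function names are assumptions made for illustration only.

```python
import io
import struct

# Hypothetical record header: disk address (8 bytes), payload length (4 bytes).
RECORD = struct.Struct("<QI")


def save_dirty_lines(cache, region):
    """Append every dirty cache line to the sequential region.

    cache: dict mapping address -> (data bytes, dirty flag).
    region: binary file-like object positioned at the start of the region.
    """
    count = 0
    for address, (data, dirty) in cache.items():
        if dirty:
            region.write(RECORD.pack(address, len(data)))
            region.write(data)
            count += 1
    return count


def restore_dirty_lines(region):
    # Read records back in order until the region is exhausted.
    restored = {}
    while True:
        header = region.read(RECORD.size)
        if len(header) < RECORD.size:
            break
        address, length = RECORD.unpack(header)
        restored[address] = (region.read(length), True)
    return restored
```

For example, writing into an `io.BytesIO()` buffer, rewinding with `seek(0)`, and calling `restore_dirty_lines` returns the dirty lines with their dirty flags set again.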
US Patent Application No. 2005/228942 (Nichols et al.) discloses a method for returning a logical volume which is part of a redundant data storage system to on-line status following a disk failure within the logical volume during a time when another of that volume's disks is unavailable, for example as a result of having its firmware updated. Data which would otherwise be changed in the logical volume due to host write requests is directed to a logging facility within the data storage system, but outside of the logical volume undergoing the upgrade.
US Patent Application No. 2005/251625 (Nagae et al.) discloses a data processing system in which a database management system on a host computer controls a storage device subsystem that stores log data supplied from the database management system. The system allocates in advance, on a disk cache in the storage device subsystem, a log-dedicated buffer area of a size equal to that of the log data output between checkpoints; writes log data into the buffer area; and, in the event of a host computer failure, reads out the log data from the disk cache without accessing a disk device. Since the log information required for recovery is cached on the storage device side, the time it takes to read the necessary log information can be shortened, which in turn reduces the system recovery time.
US Patent Application No. 2006/047925 (Perry et al.) discloses a technology facilitating recovery from storage-related failures by check-pointing copy-on-write operation sequences. An operation sequence incorporating such checkpoints into a copy-on-write can include the following: receive a write request that identifies payload data to be written to a first data store, read original data associated with the first data store, copy the original data to a second data store, record transactional information associated with the write request, generate a first checkpoint to confirm the successful recordation of the transactional information and the successful copying of the original data to the second data store, write the payload data to the first data store, acknowledge a successful completion of the copy-on-write operation sequence, and generate a second checkpoint that confirms the successful completion of such operation sequence. The first and second checkpoints are used to form a pre-failure representation of one or more storage units (or parts thereof). The checkpoints can be stored with other transactional information, to facilitate recovery in the event of a failure, and can be used to facilitate the use of optimizations to process I/O operations.
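The check-pointed copy-on-write sequence listed above can be sketched as follows. This is an illustrative reading of the summary, not the cited application's implementation: the journal format, the transaction identifier `txn_id`, the `acknowledge` callback, and the helper `incomplete_writes` are all hypothetical.

```python
def copy_on_write(primary, second_store, journal, txn_id, address, payload,
                  acknowledge=None):
    """Run one copy-on-write sequence, journaling two checkpoints."""
    original = primary.get(address)                 # read original data
    second_store[address] = original                # copy it to the second store
    journal.append(("txn", txn_id, address))        # record transactional info
    journal.append(("checkpoint-1", txn_id, address))  # copy and record confirmed
    primary[address] = payload                      # write the payload data
    if acknowledge is not None:
        acknowledge(txn_id)                         # acknowledge completion
    journal.append(("checkpoint-2", txn_id, address))  # sequence confirmed


def incomplete_writes(journal):
    # Transactions with a first checkpoint but no second one did not finish,
    # so their payload write may be missing or partial (assumed recovery rule).
    first = {txn for kind, txn, _ in journal if kind == "checkpoint-1"}
    second = {txn for kind, txn, _ in journal if kind == "checkpoint-2"}
    return sorted(first - second)
```

After a failure, a recovery routine could scan the journal with `incomplete_writes` and treat the preserved originals in the second store as the pre-failure state for those transactions.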
U.S. Pat. No. 6,691,245 (Dekoning) discloses a mirrored data storage system utilizing a first host device and a local storage device for primary data storage and a second host device and a remote storage device for mirrored, fail-over storage on behalf of client devices. At periodic intervals (called checkpoints), the first host device initiates data synchronization between itself and the two storage devices and issues checkpoint information to ensure that each device maintains information for a common stable storage state. The local storage device synchronizes its stored data and forwards the checkpoint information to the remote storage device. The remote storage device maintains a copy (called a snapshot) of the data at the common stable storage state. Given the snapshot and the checkpoint information, the remote storage device can restore itself to the common stable storage state in the event of a failure of the first host device and/or the local storage device. Upon failure of the first host device and/or the local storage device, the second host device is instructed to initiate a switch, or fail-over, to serving as the primary data storage on behalf of the client devices.
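The remote device's role in this scheme might be sketched as below. The class `RemoteMirror` and its methods are hypothetical, and the sketch assumes that the snapshot is a full copy taken whenever checkpoint information arrives.

```python
class RemoteMirror:
    """Hypothetical remote storage device in a checkpointed mirror."""

    def __init__(self):
        self.data = {}             # mirrored data, may run ahead of the checkpoint
        self.snapshot = {}         # copy at the last common stable state
        self.checkpoint_id = None  # checkpoint information for that state

    def apply_write(self, address, value):
        # Mirrored write forwarded from the local storage device.
        self.data[address] = value

    def on_checkpoint(self, checkpoint_id):
        # Preserve the common stable storage state as a snapshot.
        self.snapshot = dict(self.data)
        self.checkpoint_id = checkpoint_id

    def fail_over(self):
        # Restore to the common stable state before serving as primary storage.
        self.data = dict(self.snapshot)
        return self.checkpoint_id
```

Writes received after the last checkpoint are discarded on fail-over in this sketch, so the second host device resumes from a state that all devices agreed was stable.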