Any number of problems can occur when using a computer. Two general categories of errors are computer-caused-errors and operator-caused-errors. Due to the different nature of these two types of errors, a technique designed to recover from computer-caused-errors cannot necessarily be used to recover from operator-caused-errors.
For example, one technique used to recover a database after a computer-caused-error (such as the failure of a node or process) involves maintaining logs of operations. Specifically, a redo log is maintained so that changes made in volatile memory by transactions that committed before a failure can be made persistent to the database after the failure. Similarly, an undo log is maintained so that changes made persistent by transactions that did not commit before the failure can be removed from the database after the failure.
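By way of illustration only, the redo and undo passes described above can be sketched as follows. This is a minimal sketch rather than a description of any particular database system. The names `LogRecord` and `recover` are hypothetical, the database is modeled as a simple key-to-value mapping, and it is assumed that no two concurrent transactions modify the same item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    txn: str        # id of the transaction that made the change
    key: str        # item that was changed
    before: object  # value before the change (undo information)
    after: object   # value after the change (redo information)


def recover(disk, log, committed):
    """Return the database state after recovery from a failure.

    disk      -- persistent state found after the failure (dict)
    log       -- list of LogRecord, in the order the changes were made
    committed -- set of ids of transactions that committed before the failure
    """
    db = dict(disk)

    # Redo pass: reapply, in log order, the changes of committed
    # transactions, so that changes that were only in volatile memory
    # become persistent.
    for rec in log:
        if rec.txn in committed:
            db[rec.key] = rec.after

    # Undo pass: in reverse log order, remove changes that were made
    # persistent by transactions that did not commit.
    for rec in reversed(log):
        if rec.txn not in committed:
            db[rec.key] = rec.before

    return db


# T1 committed, but its change to "a" never reached disk.
# T2 did not commit, but its change to "b" did reach disk.
disk_after_failure = {"a": 1, "b": 99}
log = [LogRecord("T1", "a", 1, 2), LogRecord("T2", "b", 5, 99)]
recovered = recover(disk_after_failure, log, committed={"T1"})
```

In this sketch, `recovered` holds the committed value of "a" and the pre-change value of "b". Note that the procedure distinguishes only between committed and uncommitted transactions; it has no way to tell whether a committed change was correct.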
The log-based recovery technique described above does not address the problem of operator-caused-errors because those errors may be reflected in changes made by committed transactions. Even when the committed transaction that reflects the human error is followed by a computer-caused-error, the log-based recovery operation will merely ensure that those erroneously performed changes continue to be reflected in the database after recovery from the computer-caused-error. Thus, computer-caused-error recovery techniques tend to distinguish between committed changes and uncommitted changes, and not between correct committed changes and erroneous committed changes.
In contrast to computer-caused-error recovery techniques, operator-caused-error recovery techniques focus on removing from the database both committed and uncommitted changes. Specifically, operator-caused-error recovery techniques typically focus on returning the database to a consistent state that existed at a particular point in the past (preferably before the commit time of the transaction that incorporated the operator-caused error). For example, one operator-caused-error recovery technique involves making a backup of the database at a particular point in time. If an operator-caused-error is introduced after that time, the operator-caused-error may be “removed” by reverting to the backup copy of the database.
Of course, a database administrator rarely knows ahead of time that an operator-caused-error is going to be introduced. If too much time has passed between the last backup operation and the time of the error, it could be very impractical and inefficient to revert to the backup database, and then reapply all of the changes that occurred subsequent to the backup operation but prior to the error.
Another technique involves maintaining a “mirror” database whose state is delayed relative to the primary database. In case of an operator-caused-error, one can revert to the mirror database. However, if the time it takes to discover the error is greater than the length of the delay, even the delayed mirror will reflect the error. Further, while a long delay will improve the chances that the error will be caught in time, it will also increase the inefficiencies associated with failover to the mirror.
A variation of the delayed-mirror technique involves maintaining multiple delayed mirror databases, where each mirror database has a different delay length. The use of multiple mirrors with different delays increases the likelihood that at least one mirror will represent a state that is before, but not long before, the time of the error. However, the maintenance of such mirrors may consume more resources than are available to dedicate to this purpose.
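The trade-off among mirrors with different delays can be illustrated with a short sketch. The function name `pick_mirror` and the use of plain numeric timestamps are assumptions made for illustration only. A mirror with delay d is taken to reflect the state of the primary database as of time (now - d), so it is free of the error only if that time precedes the time of the error.

```python
def pick_mirror(delays, now, error_time):
    """Return the delay of the best mirror to fail over to, or None.

    delays     -- delay lengths of the available mirrors
    now        -- time at which the error is discovered
    error_time -- time at which the error was introduced
    """
    # A mirror is usable only if the state it reflects precedes the error.
    usable = [d for d in delays if now - d < error_time]

    # Among usable mirrors, the shortest delay reflects the state that is
    # before, but least long before, the time of the error.
    return min(usable) if usable else None


# Mirrors delayed by 1, 6 and 24 hours; error introduced at hour 100.
found_quickly = pick_mirror([1, 6, 24], now=103, error_time=100)
found_too_late = pick_mirror([1, 6, 24], now=130, error_time=100)
```

When the error is discovered three hours after it occurred, the 1-hour mirror already reflects the error, and the 6-hour mirror is preferred over the 24-hour mirror. When the error is discovered thirty hours after it occurred, no mirror remains usable and the function returns `None`.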
An alternative technique involves storing the database on a storage subsystem that supports “snapshots”, and then using the snapshot mechanism of the subsystem to revert the storage subsystem back to a snapshot time that precedes the error. For example, a storage subsystem may establish a particular “snapshot time” of T5. After T5, each change to a block in the subsystem is handled by (1) determining whether the block has already been changed after T5, and if not, then (2) before making the change to the block, reading the pre-change version of the block from the subsystem and copying it to a special separate “snapshot storage” associated with the T5 snapshot. Using this technique, the storage subsystem can be returned to the state in which it existed at time T5 by copying the blocks from the T5 snapshot storage back over their corresponding blocks in the storage subsystem.
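The copy-on-first-write behavior described above can be sketched as follows. This is an illustrative model only: the class name `SnapshotStore` and its methods are hypothetical, blocks are modeled as entries in an in-memory dictionary, a single snapshot time is supported, and it is assumed that every block that is written already existed at the snapshot time.

```python
class SnapshotStore:
    """Toy storage subsystem supporting a single snapshot time."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # current contents, block id -> data
        self.snapshot = None        # snapshot storage; None until a snapshot time is set

    def establish_snapshot(self):
        # Establish the snapshot time (T5 in the text). Nothing is copied yet.
        self.snapshot = {}

    def write(self, block_id, data):
        # (1) Determine whether the block has already been changed since
        #     the snapshot time.
        if self.snapshot is not None and block_id not in self.snapshot:
            # (2) If not, copy the pre-change version of the block to the
            #     snapshot storage before making the change.
            self.snapshot[block_id] = self.blocks[block_id]
        self.blocks[block_id] = data

    def revert(self):
        # Copy the saved blocks back over their corresponding blocks,
        # returning the subsystem to its state as of the snapshot time.
        self.blocks.update(self.snapshot)
        self.snapshot = {}


store = SnapshotStore({1: "a", 2: "b", 3: "c"})
store.establish_snapshot()
store.write(1, "x")   # first change to block 1: pre-change image "a" is saved
store.write(1, "y")   # second change to block 1: nothing further is saved
saved = dict(store.snapshot)
store.revert()
```

Only the first write to each block after the snapshot time incurs the additional read and copy; in the sketch, the snapshot storage holds a single entry for block 1 even though the block was written twice.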
Further, even without reverting the storage subsystem back to its prior state, it is possible to allow processes and transactions to see the state of the subsystem as of time T5 by performing the following when the process or transaction wants to see a specific block: (1) providing a copy of the specific block from the T5 snapshot storage if a copy of the specific block is in the T5 snapshot storage, and (2) providing the copy of the specific block from the storage subsystem only if there is no copy of the block in the T5 snapshot storage.
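The two-step lookup described above amounts to consulting the snapshot storage first and falling back to the storage subsystem. A minimal sketch follows, in which the function name `read_as_of_snapshot` is hypothetical and both the snapshot storage and the storage subsystem are modeled as dictionaries keyed by block id.

```python
def read_as_of_snapshot(block_id, snapshot_storage, current_blocks):
    """Return the contents of a block as of the snapshot time."""
    # (1) If the block was changed after the snapshot time, its pre-change
    #     version is in the snapshot storage; provide that copy.
    if block_id in snapshot_storage:
        return snapshot_storage[block_id]
    # (2) Otherwise the block has not changed since the snapshot time, so
    #     the copy in the storage subsystem is still the correct version.
    return current_blocks[block_id]


# Block 1 was changed after the snapshot time; block 2 was not.
snapshot_storage = {1: "a"}
current_blocks = {1: "y", 2: "b"}
view = {b: read_as_of_snapshot(b, snapshot_storage, current_blocks)
        for b in current_blocks}
```

In this sketch, `view` presents both blocks as they existed at the snapshot time, while the storage subsystem itself is left unmodified.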
The snapshot technique provides accurate results, but does so by imposing a potentially significant amount of overhead on write operations. Specifically, upon the first update to any block after any snapshot time, the pre-update image of the block must be read, and then written out to the appropriate snapshot storage. Further, when the database administrator has to return the storage subsystem to a previous state, the administrator is limited to only those states at which a snapshot time was explicitly established.
Operator-caused-errors are merely one type of error that is not easily removed by applying physiological undo. For example, difficulties may arise when attempting to recover from logical data corruptions. Like operator-caused-errors, such corruptions may simply be “replayed” if redo is reapplied.
Based on the foregoing, it is clearly desirable to provide a mechanism and technique for recovering from re-playable errors in a manner that does not suffer the efficiency or resource consumption problems inherent in the approaches described in this section.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.