All database systems are vulnerable to failures that can cause data loss or corruption. Commercial database systems (e.g., SQL Server, Oracle, DB2, Sybase, etc.) typically provide a mechanism by which data may be restored in the event of a failure. Most commercial database systems restore data by utilizing the most recent backup copy of the database in conjunction with the database""s transaction log. In order to accomplish data restoration using the traditional rollback mechanism, the database management software typically utilizes backups of a full image database backup and a transaction log. A transaction log is a file that records all changes to user and system data since the last full image database backup. The transaction log captures the state of the database before and after changes are made. Transaction logs contain less information than databases, and are thus backed up more frequently than databases. In a typical database system, several transaction log backups would occur between each successive full image database backup.
Many prior art database reconstruction techniques follow the procedure detailed below. These prior art reconstruction techniques restore the database by using the most recent full image backup. The database is then xe2x80x9crolled forwardxe2x80x9d to a point in time, just prior to the time of the failure, by reapplying every transaction from each transaction log backup file saved since the last full image database backup. This procedure effectively restores the database to the state in which it existed just prior to the database failure. One drawback of a roll-forward database reconstruction is that all transactions that were started, and possibly completed, after the time of the failure are lost.
These traditional backup and recovery techniques are designed to protect data from hardware and media failures, such as disk crashes. In a typical database system, however, a more likely cause of data corruption is a user or application error. An incorrectly formed ad-hoc SQL query or an application software error may inappropriately delete or modify data in the database. Traditional restore and roll forward mechanisms are sub-optimal and inefficient for recovering data from user or application errors.
In the first place, roll-forward reconstruction requires restoring a vast amount of data. The set of transactions that occurred since the last database backup, which must be reapplied during recovery from a system failure, is typically much larger than a set of transactions that could be xe2x80x9crolled back,xe2x80x9d thereby correcting a user or application error. xe2x80x9cUndoingxe2x80x9d a small set of erroneous transactions is a far more efficient method of restoring a database than is reapplying all valid transactions that occurred prior to a failure.
In addition, the traditional roll forward mechanism restores the database by rolling forward a backup image to a specific point in time. When recovering from user or application errors, this point in time is chosen to be the start time of an erroneous transaction that corrupted the database. By rolling the database forward to this particular point in time, the traditional mechanism effectively removes all transactions that may have occurred after the start of the erroneous transaction. By only rolling forward to the beginning point of the erroneous transaction, many transactions that occurred subsequent to the error, but that may have nonetheless have been valid transactions wholly independent of the database failure, are lost.
Some database systems (e.g., Microsoft SQL Server, Oracle) provide a recovery feature in which a xe2x80x9crestore and roll forwardxe2x80x9d may be performed on an individual table basis. Although to some extent this feature mitigates the inefficiency of the roll forward method described above, it is still not sufficiently granular to permit the selection of erroneous transactions only. Furthermore, this feature applies the selectivity criteria at the database object level, not the transaction level, resulting in recovery of only coarse-grained database objects, such as a table. None of the prior art restore and roll forward methods allow recovery of fine-grained objects, such as an individual row within a table. There is thus a need for a method and apparatus for restoring a database that mitigates the inefficiencies of traditional roll forward techniques, while simultaneously allowing recovery of fine-grained objects.
The present invention is directed toward a system and method for reconstructing database objects using transaction-selective rollback. The present invention can be used in Microsoft SQL Server databases, as well as various other commercial databases, such as SyBase and Oracle databases. The embodiments disclosed herein overcome drawbacks of the prior art by efficiently rolling back only those transactions thought to be erroneous. In addition, the present invention allows users to reconstruct tables within databases with finer granularity than database reconstruction systems and methods of the prior art. In one embodiment, the present invention provides a method of constructing a rollback script comprised of commands used to restore the contents of the database to a state in which it existed prior to a series of transactions. In essence, this embodiment could be used to undo Insert, Delete, or Modify commands performed on the database. The present invention allows a user to exercise selectivity in the transactions that will ultimately be undone. A user could, for example, specify a certain time range within which the present invention could undo all transactions performed in the database. Additional selection criteria give a user the ability to undo, for example, transactions performed by a specific user or on a specific row. In addition, these selection criteria can be combined so that the present invention could undo, for example, all transactions performed by a specific user during a predetermined time range.
In an additional embodiment, the present invention allows a user to track all changes performed on a specified row within a database. This embodiment can reconstruct the contents of the row at any given time, even in databases where the only changes recorded in the log are delta changes. This embodiment is capable of being used when the database has a clustered key as well as when there is no clustered key indicating the new location of a row that has moved.
Additional embodiments are directed toward searching techniques wherein binary search techniques are used to take advantage of unique log record information contained in the transaction log. These search techniques allow the invention to quickly locate log records of interest within the transaction log. Additional embodiments are directed to computer-executable processes steps stored on a computer-readable medium for carrying out the steps associated with the embodiments described herein. In addition, the present invention is directed to an apparatus with memory means and a processor, configured to execute the steps associated with the various embodiments disclosed herein.