1. Field of the Invention
The present invention relates to databases, and more particularly to a method and system for recovery of deleted records in a database. An exemplary embodiment of the present invention provides for recovering database records inadvertently, purposefully or maliciously deleted from a database. This invention is particularly useful for the recovery of deleted database records when standard backup and recovery mechanism and utilities provide inadequate data recovery because of poor backup-system design, operator error, physical computing platform limitations, and/or disregard of backup and recovery mechanisms provided by database-management systems.
2. Description of Related Art
Many individuals and institutions collect, sort, and analyze data and other valuable information using databases. These databases are created, maintained and interfaced with Database Management Systems or “DBMSs.” Such systems generally include a database application (“DBA”), such as Oracle 8i, to provide an interface for a user or operator to enter, edit, sort, modify, analyze or otherwise manipulate collected data. The DBMSs also provide other software applications for managing and maintaining the database. Some of these other software applications or programs provide tools or mechanisms for backing-up and recovering lost or deleted data. These backup and recovery programs, however, may consume tremendous amounts of resources from the computing platforms on which they reside.
Generally, depending on the amount of data or other valuable information to be manipulated, the computing platforms for running the DBMS may vary. Powerful computing platforms may be needed for large amounts of data or manipulations. On the other hand, lesser amounts of data or manipulations may only require simple platforms. Depending on the complexity and the volume of data or manipulations, individuals and institution may desire to balance computing platform cost against computing power. For example, towards the higher end of the power verses cost continuum, a power user's computing platform may include a Sun Enterprise™ 3500 server managed by UNIX operating system that is coupled with several UNIX managed workstations. Conversely, users with less experience and less need for the advanced features of costly applications may only require a personal computer based network or system.
Regardless of the computing platform, however, database users and administrators inadvertently or purposely delete their (and other users) data. When standard backup and recovery mechanisms provided by DBMS are enabled, a user or administrator may easily “undelete” most of the deleted data by using standard utilities provided by the DBMS for such purpose. Despite the potential for loss, however, users or database administrators may fail to invoke the DBMS's mechanisms for data backup and recovery, and thus, the standard utilities become ineffective for recovering deleted data.
When users or administrators fail to invoke these mechanisms (or the mechanisms themselves are corrupt), and the user inadvertently or purposely deletes data, the database application may create inaccessible, deleted records or files. Similarly, after a computing platform or software fault, an equally problematic situation may result because such fault may cause data to appear deleted (or not saved), even though a file containing the data may exist.
By way of example, the Oracle Corporation's family of databases and DBMSs use tools that include standard backup and recovery mechanisms coupled with backup and recovery utilities. One of these tools is Oracle's Enterprise Backup Utility and Recovery Manager. This tool requires that its standard backup and recovery mechanisms be invoked, or the backup and recovery utilities will not work. These standard backup and recovery mechanisms generally require the database application to create log files, which replicate the files that contain the data.
As an illustration, when a user's performs transactions on the user data in an Oracle database, typically, the Oracle DBMS (which includes the Enterprise Backup Utility and Recovery Manager) stores the transactions and related user data in a number of files, namely, datafiles, controlfiles and/or redologs. Each of these files may be used in the recovery of deleted data.
To that end, the datafile is a file that contains the most recent version of the user data. In other words, the datafile contains the user data that reflects the changes from the last committed or performed transaction. So, if the last committed transaction performed included a delete function, then the user data in the datafile will reflect the deletion. Thus, to undo the deletion, the standard backup and recovery utilities typical use other files containing previous versions of the deleted user data to perform a recovery. The datafile, however, may be used by the standard backup and recovery utilities for comparison purposes after locating the deleted user data in another file.
The controlfile is a quasi-database that tracks the status of the transactions in the datafiles and redologs, if any. The controlfile has limited use in the recovery of deleted user data, but may be used by the standard backup and recovery utilities for locating the status of the previously stored datafile.
Redologs are files that track with the datafile and contain previously stored user data, and are the primary file used by the standard backup and recovery utilities for recovery of deleted user data. More specifically, the contents of the redologs track the contents of the datafile, however, the user data contained within the redologs is the data that was in the datafile before a user or administrator “commits” a transaction that changes the information in the database. As can be imagined, the amount of data within redologs can grow quite rapidly. Each redolog, however, is limited in size and continually recycled to prevent from exhausting the resources of the computing platform. Also, more than one redolog may be employed.
These redologs may exist as either an on-line redolog or an archived redolog. Oracle's DBMS provides an archivelog mode that allows the database application to create an archived redolog at some predefined checkpoint, or alternatively, once the on-line redolog fills. After the database application creates the archived redolog, the system may overwrite any information contained in the on-line redolog.
To recover lost information, Oracle's Log Miner recovery utility uses all of the on-line and archived redologs, as well as some of the other stored files. When the archivelog mode is not enabled, the database application will automatically overwrite the on-line redologs. After the database application overwrites the section of the on-line redolog that originally contained the deleted data, access to any of the changes made by the user or administrator is terminated or lost. Consequently, without the archivelog mode active, the Log Miner utility cannot recover the user data once the database application overwrites the on-line redolog because no other on-line or archived redolog contains the deleted data.
Furthermore, the standard backup and recovery utilities' post-deletion recovery may depend on the structure and format of the seemingly intertwined datafiles, controlfile, and redologs. For example, every transaction committed by the user or administrator is tracked by a special parameter termed the System Change Number or “SCN.” This number is contained within each of the Oracle datafiles, as well as each controlfile and redolog. Each time a committed transaction occurs, the SCN increments and is stored in the appropriate datafile, controlfile and redolog.
In recovering deleted user data, the backup and recovery utilities routinely use the SCN to locate the proper datafile and its corresponding redolog. Once the standard backup and recovery utilities locate the appropriate datafile and redolog by SCN, the utility “undeletes” the user data by applying a complicated scheme that revises the appropriate datafile with the information contained within the appropriate redolog. If the SCN does not match in any of the appropriate files, then the standard backup and recovery utilities do not “undelete” the deleted data.
In addition, the database application storage buffers further complicate recovery for the standard backup and recovery utilities. For example, each time a user or administrator performs a transaction or set of transactions modifying the user data, the database application writes the transaction data to a dynamic database buffer cache. Upon a checkpoint, or when the buffer cache is full, the database application writes the buffered data to a physical medium. Similarly, the database application writes to the redolog as quickly and efficiency as possible after writing each transaction to the redolog buffer cache. Depending on the database application timing for buffer writes, the redolog may not contain the SCN or the transaction information necessary for “undeleting” the user data, even though the datafile may house the data.
Since un-invoked backup and recovery mechanisms coupled with ineffective standard recovery utilities may prove incapable of recovering the deleted data, individual and institutions may loose significant time, effort, and money by having to re-enter and recreate the deleted data. Even more distressing is the loss of the data entirely, which may result in indeterminate loses above those reflected in having to re-enter and recreate the deleted data. Therefore, what is needed to remedy this unfortunate situation is a method and system for recovering deleted user data in a database for that provides a software application that is independent of any standard backup and recovery utility. And further, this system and method should provide a software application that physically manipulates the database datafiles to recover the deleted data.