Data stored on computers is lost or damaged every day. Accidents, human error, virus attacks, hardware failures and power problems are just some of the many possible causes of loss of or damage to information stored on a computer. To protect against the unexpected loss of data, smart people (and businesses) commonly back up their files. A backup can be made by simply copying a file or set of files to some kind of removable medium for use in the event of failure or loss of the original, or the data can be compressed as it is copied, using a backup utility. When data loss or data corruption occurs, the damaged or lost file or files are typically restored from the backup. “Restoring,” in this sense, means copying the data from the removable medium back to the computer or, if a backup utility was used, copying and decompressing the data. When the files are small and a backup is available, restoring files from a backup is a convenient and efficient means of regaining information.
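The two kinds of backup described above, and the matching restore step, can be illustrated with a minimal sketch. It assumes an ordinary directory stands in for the removable medium and that gzip stands in for the compression a backup utility might apply; the function names backup_copy, backup_compressed and restore are hypothetical and chosen only for this illustration.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def backup_copy(src: Path, backup_dir: Path) -> Path:
    """Back up a file by making a plain copy of it (hypothetical helper)."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / src.name
    shutil.copy2(src, dest)  # copies contents and file metadata
    return dest


def backup_compressed(src: Path, backup_dir: Path) -> Path:
    """Back up a file, compressing it as it is copied (hypothetical helper)."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / (src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def restore(backup: Path, target: Path) -> None:
    """Restore a file: copy it back, decompressing if it was compressed."""
    if backup.suffix == ".gz":
        with gzip.open(backup, "rb") as fin, open(target, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    else:
        shutil.copy2(backup, target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "data.txt"
        original.write_bytes(b"important records\n" * 100)

        packed = backup_compressed(original, root / "backup_medium")

        original.write_bytes(b"damaged")  # simulate data corruption
        restore(packed, original)         # regain the data from the backup
        print(original.read_bytes() == b"important records\n" * 100)
```

In this sketch the restore is a whole-file operation: the entire backup is copied back even if only a small part of the original was damaged.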
As files grow in size and importance, and as they change more frequently, simple copies taken periodically become less appealing. For example, suppose a business depends on the reliable availability of a set of very large files that change frequently, such as the database files maintained by an airline. Periodic snapshots of the data (a snapshot being a copy of a set of files and directories taken at a particular point in time) may no longer be sufficient. Mirroring may be a better choice. A mirror in computing is a direct copy of a data set, such that exact duplicate copies of the data exist on separate machines. The copies are created and then continually updated so that they stay synchronized with the principal database. The mirror can be maintained as a physical copy at the hardware level or through database mechanisms (sometimes called “replication”). A mirror differs from a snapshot in that a snapshot represents the state of the file or database at a particular point in time, whereas a mirror is an active, dynamic copy that is kept up to date with a dynamically changing source.
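The difference between a snapshot and a mirror can be made concrete with a small in-memory sketch. It assumes a toy data store keyed by page number, with both copies living in one process; a real mirror would reside on a separate machine and be updated at the hardware level or through database replication. The class names Primary and Mirror are hypothetical.

```python
class Mirror:
    """A continually updated duplicate of the principal data (toy model)."""

    def __init__(self) -> None:
        self.pages: dict[int, bytes] = {}

    def apply(self, page_no: int, data: bytes) -> None:
        # Each change made on the principal is replayed here.
        self.pages[page_no] = data


class Primary:
    """The principal data store; forwards every write to its mirror."""

    def __init__(self, mirror: Mirror) -> None:
        self.pages: dict[int, bytes] = {}
        self.mirror = mirror

    def write(self, page_no: int, data: bytes) -> None:
        self.pages[page_no] = data
        self.mirror.apply(page_no, data)  # keep the mirror synchronized

    def snapshot(self) -> dict[int, bytes]:
        # A snapshot is a copy frozen at the moment it is taken.
        return dict(self.pages)


if __name__ == "__main__":
    mirror = Mirror()
    primary = Primary(mirror)

    primary.write(1, b"seat 12A: free")
    snap = primary.snapshot()            # state at this point in time
    primary.write(1, b"seat 12A: sold")  # the source keeps changing

    print(snap[1])          # still reflects the earlier state
    print(mirror.pages[1])  # reflects the current state
```

After the second write, the snapshot still holds the earlier contents while the mirror matches the principal store, which is the distinction drawn above.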
When a small portion of a database becomes corrupt, restoring the entire database from backups is not optimal because most of the work performed is unnecessary (most of the database is fine). The restoration process is slow, requires the handling of external media (backup tapes or backup disks) and requires human intervention (for example, a database administrator to select which backups to use, a computer operator to find and load the tapes, and possibly others). Furthermore, while the restore process is occurring, the database is typically not available to users. Another way to handle the corruption of a page is to try to repair the page. Repairing a page is fast but almost always results in partial or complete loss of the page data, causing logical inconsistencies within the database.
It would be helpful if there were a way to regain the data stored on a corrupted page that would be fast and would result in no lost data or data inconsistencies. (A page is a fixed number of bytes of data recognized as a unit by the DBMS, usually 8K bytes.) It may also be useful for this process to be initiated automatically upon detection of the data corruption and to occur without human intervention and without requiring the management and handling of tapes or other removable media.