Many computer files are stored on devices such as conventional hard disk drives. Because such devices can fail, resulting in the loss of some or all of the data they contain, many users make copies of their computer files on the same or different types of devices. Such copies of files are commonly referred to as "backup files" or "backups". Two commonly used devices on which users store backup files are tape drives and disk drives.
There are several types of backup files employed to duplicate files used by a transactional system such as a database system. "Full backup" files are used to copy an entire file onto a tape. If the copy is made to a disk drive, the file is called a "datafile copy". "Incremental backup" files store data that has changed since a previous backup file was made. Two types of incremental backups may be employed. A "standard incremental backup" stores data that has changed since the most recent full backup, datafile copy or incremental backup, whichever is latest. A "cumulative incremental backup" stores data that has changed since the last full backup or datafile copy. For example, if a full backup is made on Sunday and an incremental backup is made on Monday, a standard incremental backup made on Tuesday will contain the changes made since the incremental backup on Monday; however, a cumulative incremental backup made on Tuesday will contain all changes made since the full backup on Sunday.
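The difference between the two incremental types can be illustrated with a minimal sketch. The `Backup` class, the day indices and the `baseline_day` function are hypothetical names introduced only for this illustration; they are not part of any conventional system described herein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # hypothetical day index: 0 = Sunday, 1 = Monday, ...
    kind: str  # "full" (also used for a datafile copy) or "incremental"


def baseline_day(history, new_kind):
    """Return the day of the backup that a new incremental builds on."""
    if new_kind == "standard":
        # Changes since the most recent backup of any type.
        return max(b.day for b in history)
    if new_kind == "cumulative":
        # Changes since the most recent full backup or datafile copy.
        return max(b.day for b in history if b.kind == "full")
    raise ValueError(f"unknown incremental kind: {new_kind}")


# Full backup on Sunday, incremental backup on Monday.
history = [Backup(0, "full"), Backup(1, "incremental")]
print(baseline_day(history, "standard"))    # 1: changes since Monday
print(baseline_day(history, "cumulative"))  # 0: changes since Sunday
```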
Two other types of files exist which are related to backup files. An "offline range" identifies a period of time during which a file is guaranteed not to have changed. A file that is configured as "read-only" is such a file. The existence of an offline range allows a user to skip making some or all scheduled backups during the period identified by the offline range without losing integrity of the backup files. An "online range" is the opposite of an offline range: it identifies periods during which the file may have been changed, with all other periods being offline ranges by inference. The offline periods inferred by the online range may be used like offline ranges. Although offline ranges and online ranges are not technically backup files, as used herein, the term "backup files" includes offline ranges, online ranges and offline periods inferred from online ranges.
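The inference of offline periods from online ranges amounts to taking the complement of a set of intervals. The following is a minimal sketch under the assumption that each range is a half-open pair of integer time values; the function name `offline_periods` and the interval representation are hypothetical.

```python
def offline_periods(online_ranges, start, end):
    """Return the gaps between online ranges within [start, end).

    Each range is a (begin, finish) pair of integers, an assumed
    representation chosen only for this sketch. Assumes start < end.
    """
    periods = []
    cursor = start
    for begin, finish in sorted(online_ranges):
        if begin > cursor:
            # The file could not have changed between cursor and begin.
            periods.append((cursor, min(begin, end)))
        cursor = max(cursor, finish)
        if cursor >= end:
            break
    if cursor < end:
        periods.append((cursor, end))
    return periods


print(offline_periods([(10, 20), (30, 40)], 0, 50))
# [(0, 10), (20, 30), (40, 50)]
```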
If a file becomes damaged or corrupt, restoration of the file using conventional methods is performed as follows. First, the most recent full backup or datafile copy is restored. Then, incremental backups are applied to the restored full backup or datafile copy to bring the file as near as possible to the desired point in time, at which the file was known to be good. If the incremental backups are insufficient to restore the file to the desired point in time, transaction logs, which record each transaction applied to the file by the transactional system, may be applied from the time of the most recent incremental backup to the desired point in time.
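The conventional restoration steps described above can be sketched as a simple planning routine. The sketch assumes each backup is labeled with integer positions marking the span of changes it covers; the `Backup` fields, the `plan_restore` function and the sample values are hypothetical and serve only to illustrate the order of operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    name: str
    kind: str      # "full" (also used for a datafile copy) or "incremental"
    from_pos: int  # changes after this point are included
    to_pos: int    # state of the file captured by this backup


def plan_restore(backups, target):
    """Return (backups to apply in order, log range still to be applied)."""
    fulls = [b for b in backups if b.kind == "full" and b.to_pos <= target]
    if not fulls:
        raise ValueError("no full backup or datafile copy precedes the target")
    base = max(fulls, key=lambda b: b.to_pos)  # step 1: most recent full
    plan, pos = [base], base.to_pos
    incrementals = sorted(
        (b for b in backups if b.kind == "incremental"), key=lambda b: b.to_pos
    )
    for inc in incrementals:                   # step 2: apply incrementals
        if inc.from_pos <= pos < inc.to_pos <= target:
            plan.append(inc)
            pos = inc.to_pos
    logs = (pos, target) if pos < target else None  # step 3: transaction logs
    return plan, logs


backups = [
    Backup("sunday_full", "full", 0, 10),
    Backup("monday_inc", "incremental", 10, 20),
    Backup("tuesday_inc", "incremental", 20, 30),
]
plan, logs = plan_restore(backups, target=35)
print([b.name for b in plan], logs)
# ['sunday_full', 'monday_inc', 'tuesday_inc'] (30, 35)
```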
The restoration of transactional files can pose special problems. A file may be backed up on Sunday, Monday and Tuesday as described above. However, on Tuesday after the incremental backup, the file may be found to be corrupted. The user may determine that the file was in good condition through the Monday backup, but not further. The file may be restored using the backups of Sunday and Monday and the transactional system restarted using this restored file. This is called an "incomplete recovery." Assume on Wednesday, the file is backed up using a standard incremental backup and on Thursday, the file is determined to be corrupt again after the Wednesday backup file was made. To properly restore the file to the point in time of the Wednesday backup file, it is necessary to use the Sunday, Monday, and Wednesday backups, but it is also necessary to omit restoring the Tuesday backup file containing the corrupt information.
In a transactional system, certain information is provided to assist the person performing the restoration of the files in this endeavor. In addition to identifying the backup files with the file name of the file, some conventional transactional systems maintain additional information about each backup file to allow the party restoring the file to identify the proper file to restore. This information may make use of a transaction counter or similar identifier. Some conventional transactional systems such as the Oracle8 database product commercially available from Oracle Corporation of Redwood Shores, Calif. maintain a transaction counter, which counts transactions performed by the system. When a database is created, the transaction counter is set to zero, and it is incremented by one or more for each transaction performed. A pair of identifiers known as a "From-SCN" and "To-SCN" record the state of the transaction counter at a starting point and ending point, and delimit the time period covered by the backup. In addition, when a database is created or undergoes incomplete recovery, the database is assigned a new reset stamp, a unique identifier created by coupling the current value of the transaction counter with an integer representation of the current system date and time. Each file opened with write access for use by the database is marked with the reset stamp of the database. Thus, in the example above, when created, the Sunday, Monday and Tuesday backups would have the same reset stamp. When the file was restored on Tuesday and opened for write access, the database would be assigned a new reset stamp using the state of the transaction counter at the time the database was incompletely recovered on Tuesday, and the database would write the reset stamp into the restored file. The Wednesday backup would have this same reset stamp.
The reset stamp of the Wednesday backup files, being different from the other archived backup files, would alert the person performing the restoration of the backup files that the file had been restored earlier, and the value of the reset stamp of the Wednesday backup files would identify the Tuesday archived backup as having been superseded.
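The role of the reset stamp in the example above can be illustrated with a minimal sketch. It rests on an assumption not stated in this description: that a backup carrying an earlier reset stamp is superseded when its To-SCN exceeds the transaction counter value recorded in the current reset stamp. The class names, the `superseded` function and all SCN and timestamp values are hypothetical, and the sketch considers only two reset stamps.

```python
from dataclasses import dataclass
from typing import NamedTuple


class ResetStamp(NamedTuple):
    scn: int        # transaction counter value when the stamp was assigned
    timestamp: int  # integer representation of the date and time


@dataclass(frozen=True)
class Backup:
    name: str
    from_scn: int
    to_scn: int
    reset_stamp: ResetStamp


def superseded(backups, current):
    """Names of backups from another reset stamp that extend past current.scn."""
    return [
        b.name
        for b in backups
        if b.reset_stamp != current and b.to_scn > current.scn
    ]


original = ResetStamp(scn=0, timestamp=1000)
# Incomplete recovery on Tuesday back to the state of the Monday backup.
recovered = ResetStamp(scn=200, timestamp=2000)

backups = [
    Backup("sunday_full", 0, 100, original),
    Backup("monday_inc", 100, 200, original),
    Backup("tuesday_inc", 200, 300, original),
    Backup("wednesday_inc", 200, 280, recovered),
]
print(superseded(backups, recovered))  # ['tuesday_inc']
```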
In some transactional systems, such as the Oracle8 product, a separate database table is maintained for each type of backup file. In one embodiment, each table contains one record per backup file of the type stored by the table. Each record contains the name of the file, the location of the backup file, the To-SCN, the From-SCN and the reset stamp of the backup file. This information may be used by the party restoring the database files to allow the party to properly restore the database file. In addition, a linked list of reset stamps is maintained in a file to allow a user to identify the sequence of reset stamps generated for a file. Nevertheless, because some files must be omitted from restoration as described above, identifying the proper files to restore has remained a tedious, error-prone task.
In addition to the problem of correctly identifying backup files, because both standard incremental backup files and cumulative incremental backup files may exist, an operator who identifies all the files with the proper reset stamps and SCNs may restore a standard incremental backup file containing information that has been or will be restored from a cumulative incremental backup. The duplication is not harmful to the file, because information already restored is simply overwritten with the same information. However, because each backup file must be located, and the tape containing the file mounted, searched and read, restoration of information already restored wastes time and resources.
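The redundancy described above can be made concrete with a minimal sketch that, at each step, prefers the applicable incremental backup reaching furthest toward the target. This greedy selection is shown only to illustrate the wasted work; it is an assumption of this sketch and is not presented as the method of any particular system. The `Incremental` class, the `shortest_chain` function and the SCN values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incremental:
    name: str
    from_scn: int
    to_scn: int


def shortest_chain(base_scn, incrementals, target_scn):
    """Greedily pick the incremental that advances furthest at each step."""
    chain, pos = [], base_scn
    while pos < target_scn:
        usable = [
            b for b in incrementals
            if b.from_scn <= pos < b.to_scn <= target_scn
        ]
        if not usable:
            break  # remaining changes would have to come from transaction logs
        best = max(usable, key=lambda b: b.to_scn)
        chain.append(best)
        pos = best.to_scn
    return chain


incrementals = [
    Incremental("monday_standard", 100, 200),
    Incremental("tuesday_standard", 200, 300),
    Incremental("tuesday_cumulative", 100, 300),
]
# The full backup ends at SCN 100; restore to SCN 300.
chain = shortest_chain(100, incrementals, 300)
print([b.name for b in chain])  # ['tuesday_cumulative']
```

In this example, a naive selection of every matching file would mount and read three backups, while the single cumulative backup already contains the same changes.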
Therefore, there is a need for a system and method for restoring backup files that identifies the proper backup file or files to be used and reduces the number of backup files used to restore a file.