As is known in the art, computer systems that process and store large amounts of data typically include one or more processors in communication with a shared data storage system in which the data is stored. The data storage system can include one or more storage devices, such as disk drives. To minimize data loss, the computer systems can also include a backup storage system in communication with the primary processor and the data storage system.
Known backup storage systems can include a backup storage device (such as tape storage or any other storage mechanism), together with a system for placing data into the storage device and recovering the data from that storage device. To perform a backup, the host copies data from the shared storage system across the network to the backup storage system. Thus, an actual data file can be communicated over the network to the backup storage device.
The shared storage system corresponds to the actual physical storage. For the host to write the backup data over the network to the backup storage system, the host first converts the backup data into file data, i.e., the host retrieves the data from the physical storage system level, and converts the data into application level format (e.g. a file) through a logical volume manager level, a file system level and the application level. When the backup storage device receives the data file, the backup storage system can take the application level data file, and convert it to its appropriate format for the backup storage system. If the backup storage system is a tape-based device, the data is converted to a serial format of blocks or segments.
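As a rough illustration of the final conversion step, the following Python sketch splits an application-level file image into a serial sequence of fixed-size blocks, as a tape-oriented backup system might. The block size, the function names `to_serial_blocks` and `from_serial_blocks`, and the zero padding of the final block are assumptions made for illustration only; they are not taken from any particular backup product.

```python
from typing import List, Tuple

BLOCK_SIZE = 8  # hypothetical block size in bytes, kept tiny for illustration


def to_serial_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[Tuple[int, bytes]]:
    """Split file data into (sequence_number, block) pairs, zero-padding the last block."""
    blocks = []
    for seq, offset in enumerate(range(0, len(data), block_size)):
        chunk = data[offset:offset + block_size]
        blocks.append((seq, chunk.ljust(block_size, b"\x00")))
    return blocks


def from_serial_blocks(blocks: List[Tuple[int, bytes]], original_length: int) -> bytes:
    """Reassemble the file data from its blocks and strip the padding."""
    ordered = sorted(blocks, key=lambda block: block[0])
    return b"".join(chunk for _, chunk in ordered)[:original_length]
```

The reverse conversion is needed on every restore, which is one reason each transition between disk and tape consumes processing time.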
The EMC Data Manager (EDM) is capable of such backup and restore over a network, as described in numerous publications available from EMC of Hopkinton, Mass., including the EDM User Guide (Network) “Basic EDM Product Manual.” An exemplary prior art backup storage architecture in which a direct connection is established between the shared storage system and the backup storage system is described in U.S. Pat. No. 6,047,294, assigned to the assignee of the present invention, entitled Logical Restore from a Physical Backup in Computer Storage System, and incorporated herein by reference.
For large databases, tape-based data backup and restore systems, which are well known in the art, can be used. In general, files, databases and the like are copied to tape media at selected times. Typically, data is periodically backed up to prevent the loss of data due to software errors, human error, and hardware failures. Upon detection of an error, in an online database, for example, the backed up data can be restored to effect recovery of the data. While restore refers to obtaining backed up data, data recovery refers to the entire process by which the retrieved data is made available for applications to access and use. Transactions since the time of backup can be recreated using so-called redo logs.
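The distinction between restore and recovery can be sketched as follows. This is a minimal Python illustration, assuming a database modeled as a key/value dictionary and a redo log modeled as a list of timestamped writes; the `recover` function and the `RedoEntry` layout are hypothetical and do not reflect the log format of any actual database product.

```python
from typing import Dict, List, Tuple

# Hypothetical redo log entry: (timestamp, key, new_value)
RedoEntry = Tuple[int, str, str]


def recover(backup: Dict[str, str], backup_time: int,
            redo_log: List[RedoEntry], until: int) -> Dict[str, str]:
    """Restore the backup image, then replay redo entries made after the backup."""
    database = dict(backup)  # restore: obtain the backed up data
    for timestamp, key, value in sorted(redo_log):
        # Replay only transactions after the backup and up to the chosen point.
        if backup_time < timestamp <= until:
            database[key] = value
    return database
```

Choosing the `until` point is the difficult part in practice: replaying past the transaction that introduced an error simply recreates the error.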
Tape-based backup and restore systems have a number of disadvantages. For example, due to the significant amount of time and overhead associated with backing up and restoring data to tape, such operations are performed relatively infrequently. The longer the period between backup and restoration, the more complicated and time consuming the overall recovery process becomes since, for example, this may render it more difficult to determine the point at which an error occurred. In addition, improvements in the data restore process, such as faster tape access times, provide only incremental advances in the overall data recovery process.
Further, data on tape cannot be accessed until it is restored to disk. Only when the data has been restored can a host computer examine the data. The data must be reformatted for each transition between tape and disk, which requires significant processing resources and elapsed time.
A further disadvantage of tape-based data storage systems relates to the data recovery process itself. For example, after an error has occurred, an operator, such as a database administrator, evaluates the error in an attempt to find and correct it. However, the administrator has to deal with limitations imposed by the nature of tape-based storage. For a large mission critical database, it can be prohibitively expensive to shut down the database and perform a restoration from tape. If at all possible, the administrator will attempt to perform a repair of the database. However, the risks of corrupting the entire database, causing additional errors, and failing to remedy the error are significant.
In addition, it is not always known at what time the database became corrupted. In the case where data must be restored from tape, correction of the error can be an iterative and time-consuming process. The administrator may select a first set of tapes for restoration, after which the database can be examined to determine if the error is corrected. If it is not, another set of tapes, which is typically an earlier backup, must be restored. Data examination steps are then performed until the error is corrected.
Once the error is corrected, the error may be re-introduced into the database as post backup transactions are added to the database from the redo logs. The point at which the error occurs must be identified. The time and effort associated with iterative tape restores and error identification can be quite substantial.
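The iterative procedure described above can be sketched in Python under simplifying assumptions: each backup set is modeled as a timestamped dictionary, the redo log as a list of timestamped writes, and the data examination step as a caller-supplied predicate. The names `find_error_point` and `is_valid` are hypothetical and serve only to illustrate the search.

```python
from typing import Callable, Dict, List, Optional, Tuple

Snapshot = Tuple[int, Dict[str, int]]  # hypothetical: (backup_time, data)
RedoEntry = Tuple[int, str, int]       # hypothetical: (timestamp, key, new_value)


def find_error_point(backups: List[Snapshot], redo_log: List[RedoEntry],
                     is_valid: Callable[[Dict[str, int]], bool]
                     ) -> Optional[Tuple[int, Optional[int]]]:
    """Return (backup_time_used, timestamp_of_first_bad_entry), or None if no clean backup."""
    # Try the most recent backup set first, then progressively earlier ones.
    for backup_time, data in sorted(backups, key=lambda b: b[0], reverse=True):
        if not is_valid(data):
            continue  # this backup already contains the error
        database = dict(data)
        # Replay post-backup transactions one at a time, examining the data after each.
        for timestamp, key, value in sorted(redo_log):
            if timestamp <= backup_time:
                continue
            database[key] = value
            if not is_valid(database):
                return backup_time, timestamp  # replay must stop before this entry
        return backup_time, None  # the error never reappeared during replay
    return None
```

With tape, each pass through the outer loop corresponds to a full restore of another set of tapes, which is why the cost of the search grows quickly.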
One known attempt to identify errors includes so-called data scrubbing tools. These tools, which can be run periodically, are used in an endeavor to detect errors as soon as possible. While such tools may detect errors, many production databases, like those used by Internet-based vendors, are mission critical and cannot tolerate the additional load imposed by such tools. In many applications, data scrubbing tools are therefore not a practical option.
In addition, there are times at which it is desirable to recover only a portion of a database. However, known systems do not readily enable recovery of less than the entire database. While recovery of a portion of a database may be possible in conventional data backup and restore systems, a high level of skill is required to perform such a partial recovery manually.
It would, therefore, be desirable to overcome the aforesaid and other disadvantages.