This invention relates generally to a system and method for backing up a computer system and, more particularly, to a backup system for the recovery and/or restoration of data for a computer system.
The use of and dependency on data in today's society is rapidly expanding. Now more than ever, businesses continuously rely on data in order to operate. Businesses and their customers demand that the data be available and accurate.
Various conventional mechanisms for protecting and recovering data are available for businesses. These so-called backup systems vary in the levels of protection they provide, the amount of time required to recover the backed-up data, and the difficulty associated with their integration with the businesses' other systems and applications. Generally, the success of these conventional mechanisms is measured in terms of “data availability,” i.e., how quickly a system, a database, or a file can be restored after a failure or corruption of data.
Typically, most businesses use some sort of backup procedure to back up data onto a backup system. There are multiple scenarios in which backup systems can be used. A backup system can be used when a disk is corrupted or otherwise lost. In this scenario, the particular database or application using the disk is quiesced and the information is backed up. A backup system can also be used when a logical corruption occurs and data is lost. In this scenario, the backup system can use logs to determine the proper point in time to which the database or application should be restored.
There are numerous types of backup procedures and systems available. One type of backup can be referred to as a “cold” backup. In a cold backup, the file, database, or application that is being backed up has to be quiesced and cannot be used during the backup process. Moreover, users may be unable to access the files during a full system backup. Accordingly, the cost of performing such backups is greater in terms of user productivity and/or system resources.
Another type of backup can be referred to as a “hot” backup. In a hot backup, the file, database, or application that is being backed up is briefly stopped and placed in a different mode during the backup process.
A snapshot can reduce the amount of time a database or application is stopped. A backup system can use a snapshot in either a cold backup process or a hot backup process. In a cold backup process, the relevant database or application is shut down during the snapshot creation process. Once the snapshot creation is completed, the database is restarted while a backup of the relevant information on the snapshot is performed. In a hot backup process, the relevant database or application needs to enter hot backup mode before the snapshot is taken.
Once the snapshot creation is completed, the database can be brought out of hot backup mode. There is overhead associated with snapshot maintenance that adversely impacts input/output (I/O) throughputs.
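By way of illustration only, the overhead of snapshot maintenance can be sketched under the assumption that the snapshot is kept by a copy-on-write scheme, which is one common approach and is not stated in the description above. The class CowVolume, its methods, and the I/O accounting are hypothetical.

```python
class CowVolume:
    """Hypothetical block volume with a single copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live data, one entry per block
        self.snapshot = None        # block index -> preserved original contents
        self.io_ops = 0             # assumed count of physical I/O operations

    def take_snapshot(self):
        # Creating the snapshot copies nothing; blocks are preserved lazily.
        self.snapshot = {}

    def write(self, index, data):
        # The first write to a block after the snapshot must preserve the old
        # contents: one extra read plus one extra write (assumed cost model).
        if self.snapshot is not None and index not in self.snapshot:
            self.snapshot[index] = self.blocks[index]
            self.io_ops += 2
        self.blocks[index] = data
        self.io_ops += 1

    def read_snapshot(self, index):
        # Blocks that were never overwritten are read from the live volume.
        if self.snapshot is None:
            raise RuntimeError("no snapshot has been taken")
        return self.snapshot.get(index, self.blocks[index])
```

In this sketch, a write that would cost one operation without a snapshot costs three the first time a given block is modified after the snapshot is taken, which is the kind of throughput penalty referred to above.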
Typically, a backup procedure performs a full system backup every time the files are backed up. A full system backup ensures that every file on the system is copied to secondary or redundant storage. A backup process can be established to back up data on a regular or periodic basis (e.g., daily, nightly, weekly, etc.).
However, as present business applications run virtually around the clock with little tolerance for any downtime, the time frame or window for backing up data is small, if it exists at all. Snapshot technology minimizes downtime at the expense of throughput, but cannot reduce the backup period. Furthermore, these periodic backups can become obsolete almost immediately after they are completed. Regardless of the frequency of the incremental backups, all of which require a form of database interruption, there is a constant risk of losing data between them. The risk of losing data can be reduced by performing backups more frequently.
Backed up data can be stored on a storage tape. While storage tapes allow for scheduled backups, recovering data from them is time consuming. As a result, the availability and performance of the production and application servers are negatively impacted.
In conventional backup processes, a replication technique can be used to replicate the data in a file or database. One type of replication is a synchronous volume replication. In this type, the information is replicated at the same time that the information is being stored. This process requires substantial resources and slows down the overall processing of the computer system. Also, the storage volumes must be consistent with each other. However, replication only protects against the physical loss of data. If data is logically corrupted, both the primary and replicated images are corrupted, and recovery is not possible.
Another type of replication is an asynchronous volume replication. In an asynchronous volume replication process, information is backed up asynchronously with respect to the rate at which it is stored. For example, replication can be delayed, with the delay being a set period of time. The period for delay is a window during which a system administrator hopes to stop the replication if an error is identified in order to prevent the replication of corrupted data or information.
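By way of illustration only, the delay window described above can be sketched as follows. The class DelayedReplica, its methods, and the use of integer timestamps are hypothetical and are not drawn from any particular replication product.

```python
from collections import deque


class DelayedReplica:
    """Hypothetical replica that applies each write only after a fixed delay."""

    def __init__(self, delay):
        self.delay = delay      # length of the delay window, in arbitrary time units
        self.pending = deque()  # (timestamp, key, value) writes not yet replicated
        self.image = {}         # the replicated copy
        self.halted = False

    def record(self, timestamp, key, value):
        # Writes are queued in the order in which the primary stored them.
        self.pending.append((timestamp, key, value))

    def advance(self, now):
        # Apply every queued write that has aged past the delay window.
        while (not self.halted and self.pending
               and now - self.pending[0][0] >= self.delay):
            _, key, value = self.pending.popleft()
            self.image[key] = value

    def halt(self):
        # The administrator stops replication after spotting an error.
        self.halted = True


replica = DelayedReplica(delay=10)
replica.record(0, "a", "good")
replica.record(5, "a", "corrupt")
replica.advance(now=10)  # only the write made at time 0 is old enough
replica.halt()           # error noticed inside the window
replica.advance(now=20)  # nothing further is replicated
```

The corrupted write is kept out of the image only because the error was noticed before the window elapsed. If it is noticed later, the corruption is replicated like any other write.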
One flaw associated with conventional replication methods is that any corruption to the data can be duplicated easily into the image. Conventional replication systems lack historical or chronological information relating to data or data changes. The lack of such information prevents the replication system from providing corruption protection and drives the recovery time up.
Some conventional backup systems capture data that can be used in a subsequent backup process. Data can be captured at a variety of locations in a computer system. Some conventional backup systems generally capture data at the file layer of a computer system. Capturing data at the file layer makes it difficult to support open files or active databases.
Other conventional backup systems capture data at the physical storage layer of a computer system. By capturing data at the physical storage layer, a computer system is unable to maintain consistency across unlinked devices such as database tables on different storage volumes.
Once data is captured, the backup system can use the data in a variety of processes. One such process is the restoration of data on a computer system in the event of a failure or corruption of a computer system. The restoration of data using backed up data is limited by the particular backup system and procedure that were used to collect the data.
Some recovery methods require the application of a database transaction or archive log of data. Some conventional databases maintain a temporary log of data transactions since the last save of data. When the user saves data to the database, the temporary log is wiped out. Because the temporary logs are not maintained, restoration of data requires the user to go back in time completely, thereby losing some data.
Conventional archive logs only contain forward information, thereby limiting the use and effectiveness of the archive logs in restoring information. By definition, restoration is to a point in the past. The fact that archive logs can only move information forward through time implies that they must be used in conjunction with some other form of data restoration, such as restoring a cold full backup, in order to achieve a restoration to a point in the past.
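By way of illustration only, the dependence of a forward-only archive log on an earlier full backup can be sketched as follows. The function restore_to, its parameters, and the representation of log entries as (timestamp, key, value) tuples sorted by timestamp are hypothetical.

```python
def restore_to(baseline, baseline_time, archive_log, target_time):
    """Restore to target_time by rolling a full backup forward through the log.

    baseline      -- dict holding the contents of the full backup
    baseline_time -- time at which the full backup was taken
    archive_log   -- list of (timestamp, key, value), assumed sorted by timestamp
    """
    if target_time < baseline_time:
        # The log holds only forward information, so nothing earlier than
        # the baseline can be reached from this backup.
        raise ValueError("target precedes the baseline backup")
    state = dict(baseline)  # step 1: restore the full backup
    for timestamp, key, value in archive_log:
        if baseline_time < timestamp <= target_time:
            state[key] = value  # step 2: replay each change up to the target
    return state
```

Every restoration in this sketch starts from the baseline, so the recovery time grows with the amount of log that must be replayed between the baseline and the target.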
In some conventional backups in which data capture is done at the physical layer, but the associated application spans multiple physical storage devices, referential integrity of the data over the independent storage devices can only be achieved if the application is quiesced. In other words, consistency between data spread across multiple physical devices is a property which cannot be maintained by the physical backup system on its own. Coordination between the physical layer and the application layer is required. In effect, the physical layer needs to understand the state of the application that is using it for I/O. To this end, some conventional physical layer backups require that applications lock users out for a particular amount of time in order to quiesce the data long enough to guarantee consistency across multiple physical devices. This lock-out procedure results in downtime and lost productivity.
A need exists for an efficient and cost-effective approach to backing up and restoring data after a failure or corruption of data. A need also exists for a backup system and technique that does not negatively impact applications during the backup process. A need also exists for a backup system and technique that reduces the data recovery time and provides for information recovery without requiring a full recovery to a baseline.