1. Field of the Invention
The present invention relates to managing backups of production data, and more particularly, to providing and managing frozen images of production data.
2. Description of the Related Art
Information drives business. For businesses that increasingly depend on data and information for their day-to-day operations, unplanned downtime due to data loss or data corruption can hurt their reputations and bottom lines. Data can be corrupted or lost due to hardware and/or software failure, as well as due to user error. For example, a user may inadvertently delete a file, write incorrect data to a file, or otherwise corrupt data or equipment. When such errors occur, productivity is lost both for the technicians who must restore data and for the users who are unable to access valid data.
Businesses are becoming increasingly aware of these costs and are taking measures to plan for and recover from data loss. Often these measures include protecting primary, or production, data, which is ‘live’ data used for operation of the business. Copies of primary data, also called backups, are made on different physical storage devices, and often at remote locations, to ensure that a version of the primary data is consistently and continuously available.
Typical uses of copies of primary data include backup, Decision Support Systems (DSS) data extraction and reports, testing, and trial failover (i.e., testing failure of hardware or software and resuming operations of the hardware or software on a second set of hardware or software). These copies of data are preferably updated as often as possible so that the copies can be used in the event that primary data are corrupted, lost, or otherwise need to be restored.
Two areas of concern when a user error or hardware or software failure occurs, as well as during the subsequent recovery, are preventing data loss and maintaining data consistency between primary and backup data storage areas. One simple strategy includes backing up data onto a storage medium such as a tape, with copies stored in an offsite vault. Duplicate copies of backup tapes may be stored onsite and offsite. However, recovering data from backup tapes requires sequentially reading the tapes. Recovering large amounts of data can take weeks or even months, which can be unacceptable in today's 24×7 business environment.
Large active databases and file systems available around-the-clock are difficult to back up without incurring a significant penalty. Often, the penalty takes one of two forms:
    The entire database or file system is taken offline to allow time for the data to be copied, resulting in suspension of service and inconvenience to users. For mission-critical applications, taking the application offline may be impossible.
    The copy is made very quickly but produces an incomplete or inconsistent version of the data, because some transactions are in progress and are not yet complete.
A way to make backups without incurring such penalties is desired.
More robust, but more complex, solutions include mirroring data from a primary data storage area to a backup, or “mirror,” storage area in real time as updates are made to the primary data. Periodic “snapshots” of data may be taken by “detaching” a mirror being updated in real time so that it is no longer updated. Detaching the mirror involves halting transactions being applied to the primary data storage area and to the mirror for a very brief time period to allow existing transactions to complete. The snapshot is then taken and provides a logically consistent copy of the primary data. A logically consistent copy of data is referred to herein as a frozen image. The snapshot serves as a frozen image of the primary data as of the point in time that the snapshot was taken. However, snapshots are typically created manually on an as-needed basis and for a specific purpose rather than on a regular schedule.
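By way of illustration only, the mirror-detach sequence described above can be sketched in Python as follows. The class MirroredVolume, its methods, and the use of a single lock to stand in for "halting transactions for a very brief time period" are assumptions made for this sketch; they are not taken from any particular product or from the embodiments described herein.

```python
import threading


class MirroredVolume:
    """Toy model of a primary storage area with one attached mirror.

    All names here are hypothetical and exist only for illustration.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.primary = {}
        self.mirror = {}      # updated in step with the primary while attached
        self.attached = True

    def write(self, key, value):
        # A write holds the lock for its whole duration, so at any moment
        # it is either fully applied to both copies or not yet started.
        with self._lock:
            self.primary[key] = value
            if self.attached:
                self.mirror[key] = value

    def detach_mirror(self):
        # Taking the lock briefly holds back new writes and waits for any
        # in-flight write to complete, so the detached copy is consistent.
        with self._lock:
            self.attached = False
            frozen = self.mirror
            self.mirror = None
            return frozen


volume = MirroredVolume()
volume.write("balance", 100)
frozen_image = volume.detach_mirror()   # snapshot as of this point in time
volume.write("balance", 250)            # later writes reach only the primary
print(frozen_image["balance"], volume.primary["balance"])
```

In this sketch the detached mirror keeps the value written before the detach, while the primary continues to accept updates, which is the property that makes the snapshot usable as a frozen image.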
Most organizations implement a backup policy to keep copies of data for recovery purposes in the event of a system failure or a site becoming unavailable. One or more backup management systems automatically schedule and perform backups in accordance with the backup policy. However, even backup management systems are typically designed to manage only backups to a specific type of storage area (which may span more than one physical storage device), and no single system exists to integrate the different types of backups made. Furthermore, the backup policy is not managed by a single system; instead, different portions of the backup policy are managed by respective media backup managers.
FIG. 1A shows an example of a typical environment in which backups are made. No integrated system for producing backups exists. Instead, different types of backups are made by different backup management systems, and no single system implements a backup policy. Tape backup manager 111 is responsible for making tape backups of production data, such as data from production file system 122P and production database 132P. Backup tape 112T represents a backup tape produced by tape backup manager 111. Tape backup manager 111 operates according to a corresponding tape backup schedule 118. Tapes are cataloged in tape catalog 116.
File system manager 120 includes a file system backup manager 121 to produce backup copies of file system data, such as backup file system 122B, from production file system 122P. File system backup manager 121 may access file system catalog 126 and file system backup schedule 128 to produce backup file system 122B.
Database management system 130 includes a database backup manager 131 to produce backup copies of production database 132P. Database backup manager 131 may access database catalog 136, or a similar structure (not shown) providing a list of all databases and tables managed by database management system 130. Database backup manager 131 may also access database backup schedule 138 to produce backup database 132B.
Note that each of these three backup systems follows a respective backup schedule and may access a different catalog. No single system produces backups for all types of data and all types of storage areas, and backups of all data are not created on an integrated schedule. Three different backup managers must be accessed to obtain a complete picture of backup data available, and backups are not managed by a common system.
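The fragmentation described above can be illustrated with a short Python sketch. The BackupManager class and the complete_picture function are hypothetical constructs introduced only for this illustration; the names of the managers, schedules, and backups are those of FIG. 1A.

```python
from dataclasses import dataclass, field


@dataclass
class BackupManager:
    """Hypothetical stand-in for one independent backup manager."""
    name: str
    schedule: str                                 # each manager has its own schedule
    catalog: list = field(default_factory=list)   # and its own catalog

    def list_backups(self):
        return [(self.name, entry) for entry in self.catalog]


tape = BackupManager("tape backup manager 111",
                     "tape backup schedule 118", ["backup tape 112T"])
fs = BackupManager("file system backup manager 121",
                   "file system backup schedule 128", ["backup file system 122B"])
db = BackupManager("database backup manager 131",
                   "database backup schedule 138", ["backup database 132B"])


def complete_picture(managers):
    # No common catalog exists, so every manager must be queried in turn.
    picture = []
    for manager in managers:
        picture.extend(manager.list_backups())
    return picture


all_backups = complete_picture([tape, fs, db])
for owner, backup in all_backups:
    print(owner, "->", backup)
```

The caller must know about all three managers in advance; a backup made by a manager omitted from the list would simply not appear in the result.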
What is needed is a solution to provide complete and accurate backup copies of production data with as little disruption to production systems as possible. The solution should provide for scheduling and management of all backups, regardless of the type of storage medium or organization of data for storing the backup. The solution should take advantage of existing frozen images that have already been created, such as snapshots taken for other purposes. Preferably, the solution provides quick recovery of data from backups in the event of user error or hardware or software failures.