1. Field of the Invention
The present invention relates to recovery management software, and more particularly to a system and method for creating a series of online snapshots for recovery purposes while online user access to the data is available.
2. Description of the Related Art
With the proliferation of large database systems, effective recovery solutions have become a critical requirement for the safe management of customer data. Data management requires time, storage and processor resources, yet all are in ever-shorter supply in today's complex computing environment. Traditional backups require either a lengthy outage of the database while a cold copy is performed or the consumption of significant system resources while online backups are taken. These traditional techniques are inadequate to meet today's high availability requirements. Making backups of mission critical data stored in database files on open systems is part of doing business. One problem with creating a consistent point-in-time backup or image is that it requires taking the system offline, thus decreasing data availability.
It is desirable to have an easy, reliable, and unobtrusive method for creating or obtaining a consistent point-in-time copy or image of a database (e.g., an Oracle database), or any file or file system, while the data remains online and available for update. In the case of an Oracle database, for example, traditional Oracle warm backup requires expensive archiving of online redo logs. It is desirable to enable online database backups without requiring the overhead of logs to be maintained and those logs to be applied in order to recover the data.
It is also desirable to create or obtain a consistent point-in-time copy or image of data with or without specialized hardware (e.g., Intelligent Storage Devices). As used herein, an "Intelligent Storage Device" (ISD) is a storage device that provides one or more of: continuous data availability, high reliability, redundancy of critical components (e.g., mirroring), nondisruptive upgrades and repair of critical components, high performance, high scalability, and access to shared and secured heterogeneous server environments (e.g., mainframes, UNIX-based systems, Microsoft Windows-based systems). Typically, ISDs are used for backup and recovery, data replication, and disaster recovery.
Various hardware vendors offer Intelligent Storage Devices (ISDs): Hitachi Data Systems (Freedom Storage 7700E with ShadowImage mirrors), Hewlett-Packard Company (SureStore Disk Array XP256 with Business Copy mirrors), and EMC Corporation (Symmetrix with Timefinder mirrors), among others.
It is also desirable to have an easy, reliable, fast, and clean method for restoring a consistent point-in-time copy or image of a database (e.g., an Oracle database), or any file or file system, when some event happens that causes a "recover" of the data to be necessary. The nature of the event that causes a "recover" of the data to be necessary is irrelevant.
For the foregoing reasons, there is a need for a system and method for creating a series of online snapshots for recovery purposes while online user access to the data remains available.
The present invention provides various embodiments of a method and system for creating a series of online snapshots for recovery purposes. In one embodiment, one or more snapshots (e.g., file snapshots or database file snapshots) may be created over a user-specified time interval at a user-specified frequency. The one or more snapshots may be a series of concurrent, overlapping snapshots constructed by creating snapshots over a user-specified time interval at a user-specified frequency. For each snapshot, one or more files may be targeted for snapback by being registered with a snapshot software component technology by a software utility (e.g., a file backup and recovery management utility or a database backup and recovery management utility). In one embodiment, the files targeted for snapback may be database files associated with a database. Alternatively, the files targeted for snapback may be any type of computer-readable files. Prior to registering one or more files with the snapshot software component technology, initialization processing may be executed. The initialization processing may prepare the one or more files for processing by the client utility.
The snapshot software component technology may determine an appropriate methodology to handle read requests and write requests received during the snapshot of each registered file. The appropriate methodology chosen for each registered file may be independent of the chosen methodology for the other registered files. In one embodiment, one of the following methodologies may be chosen for each registered file: a software based methodology using a memory cache, a software based methodology using a disk cache, or a hardware based methodology using an intelligent storage device.
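The per-file choice among the three methodologies can be illustrated with a short sketch. The selection policy below is an assumption made only for illustration: the text does not state the criteria used. The names `RegisteredFile`, `choose_methodology`, `on_isd`, `expected_update_bytes`, and the file paths are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Methodology(Enum):
    MEMORY_CACHE = "software based, memory cache"
    DISK_CACHE = "software based, disk cache"
    HARDWARE_ISD = "hardware based, intelligent storage device"


@dataclass
class RegisteredFile:
    path: str
    on_isd: bool                 # hypothetical: file resides on a mirrored ISD volume
    expected_update_bytes: int   # hypothetical: estimated size of pre-images to cache


def choose_methodology(f, memory_budget_bytes):
    """Pick a methodology for one file, independently of all other files.

    Assumed policy (not from the source): prefer the ISD when available,
    otherwise cache in memory if the estimate fits, otherwise cache on disk.
    """
    if f.on_isd:
        return Methodology.HARDWARE_ISD
    if f.expected_update_bytes <= memory_budget_bytes:
        return Methodology.MEMORY_CACHE
    return Methodology.DISK_CACHE


files = [
    RegisteredFile("/db/system.dbf", on_isd=True, expected_update_bytes=10_000_000),
    RegisteredFile("/db/users.dbf", on_isd=False, expected_update_bytes=1_000_000),
    RegisteredFile("/db/history.dbf", on_isd=False, expected_update_bytes=500_000_000),
]
budget = 64 * 1024 * 1024  # hypothetical 64 MiB memory budget per file
choices = {f.path: choose_methodology(f, budget) for f in files}
```

Because the function looks at one file at a time, the methodology chosen for one registered file does not constrain the choice for any other.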
After determining an appropriate methodology, the snapshot software component technology may be started. In the case of a database snapshot, prior to starting the snapshot software component technology, the database may be synchronized or stopped and quiesced. It is noted that various database management systems may synchronize and/or stop and/or quiesce the database. In one embodiment, the synchronizing or quiescing may shut the database down. In another embodiment, the synchronizing or quiescing may place database objects in a certain mode that is proprietary to a particular DBMS. After the synchronization or quiesce is completed, the database may be restarted.
In the case of the hardware based methodology, the starting procedure may include splitting the mirror volume 204 from the primary volume 200, and making the data on the mirror volume 204 available for processing by the device driver 112 (shown in FIG. 2).
After the snapshot software component technology has been started, read requests and write requests may be performed concurrently with the snapshot processing of each registered file. That is, read requests from the registered files and write requests to the registered files may be processed while the snapshot processing of each registered file is under way.
Processing for the software based methodology may include: capturing client reads for each registered file; for each captured client read, if the read is for updated data, returning the data from the cache; for each captured client read, if the read is for non-updated data, returning the data from the registered file; capturing writes to each registered file; for each captured write to a registered file, prior to allowing the captured write to complete, saving a pre-image of the appropriate data block of the registered file to a cache if the given data block of the registered file has no previously saved pre-image in the cache.
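The software based processing described above amounts to a copy-on-write scheme. The following is a minimal sketch under simplifying assumptions: a registered file is modeled as an in-memory list of blocks, the cache is a dictionary, and the class and method names (`CowSnapshot`, `write`, `client_read`) are hypothetical rather than taken from any actual implementation.

```python
class CowSnapshot:
    """Copy-on-write view of a registered file (illustrative model only)."""

    def __init__(self, blocks):
        self.blocks = blocks     # live registered file, modeled as a list of blocks
        self.pre_images = {}     # cache: block index -> pre-image at snapshot start

    def write(self, index, data):
        # Before the write completes, save the pre-image of the block,
        # but only if no pre-image for that block is already cached.
        if index not in self.pre_images:
            self.pre_images[index] = self.blocks[index]
        self.blocks[index] = data

    def client_read(self, index):
        # Updated blocks are returned from the cache;
        # non-updated blocks are returned from the registered file.
        if index in self.pre_images:
            return self.pre_images[index]
        return self.blocks[index]


live = [b"A0", b"B0", b"C0"]
snap = CowSnapshot(live)
snap.write(1, b"B1")   # first write to block 1 saves pre-image b"B0"
snap.write(1, b"B2")   # second write does not overwrite the saved pre-image
snapshot_view = [snap.client_read(i) for i in range(len(live))]
```

In this sketch the client utility continues to see the file as it was when the snapshot started, while ordinary writers see and modify the current contents.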
Processing for the hardware based methodology may include: capturing client reads for each registered file; for each captured client read, returning the data from a mirrored volume; allowing normal write processing to a primary volume for all write requests, without capturing them.
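The hardware based processing can be sketched in the same style. This model assumes that the mirror tracks the primary volume until it is split and is frozen afterwards; the class `MirroredVolume` and its methods are hypothetical stand-ins for behavior that a real intelligent storage device performs in hardware.

```python
class MirroredVolume:
    """Primary volume with a splittable mirror (illustrative model only)."""

    def __init__(self, blocks):
        self.primary = list(blocks)
        self.mirror = list(blocks)
        self.is_split = False

    def split_mirror(self):
        # Starting the snapshot: the mirror stops following the primary.
        self.is_split = True

    def write(self, index, data):
        # Normal write processing goes to the primary and is not captured.
        self.primary[index] = data
        if not self.is_split:
            self.mirror[index] = data  # assumed mirroring behavior before the split

    def client_read(self, index):
        # Captured client reads are always satisfied from the mirror.
        return self.mirror[index]


volume = MirroredVolume([b"A0", b"B0"])
volume.write(0, b"A1")      # before the split, the mirror follows the primary
volume.split_mirror()
volume.write(1, b"B1")      # after the split, only the primary changes
client_view = [volume.client_read(i) for i in range(2)]
```

No pre-image cache is needed here because the split mirror itself preserves the point-in-time image.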
Each registered file may be targeted for snapback such that the processing by the client utility is consistent with the state of each registered file at the point in time of the start of the snapshot software component technology. In the case of a database being targeted for snapback, the processing by the client utility may be consistent with the state of the database at the point in time of the start of the snapshot software component technology. Targeting each registered file for snapback may include copying a pre-image version of updated data to a cache. The location from which the pre-image version of updated data is retrieved during the snapback may be dependent upon the chosen methodology (i.e., software based or hardware based). If the chosen methodology is the software based methodology, the location from which the pre-image version of updated data is retrieved during the snapback may be the memory cache or alternatively may be the disk cache. If the chosen methodology is the hardware based methodology, the location from which the pre-image version of updated data is retrieved during the snapback may be the intelligent storage device.
In one embodiment, the snapshot software component technology may be stopped when deemed appropriate by the backup and recovery management utility in order to prepare for snapback of the registered files. After the snapback has completed, termination processing may be executed.
The user may specify the start time of the first snapshot instance, and the user may also specify the time interval to wait prior to starting the next snapshot instance. For example, the user may specify ten minutes as a uniform time interval for the series of snapshot instances. The time intervals between the start times of adjacent snapshot instances need not be uniform. Other methods may be used to determine the interval between the start times of adjacent snapshot instances, including user-defined methods. In one embodiment, the user may specify an ending time, and/or a certain number of snapshot instances. Any number of snapshot instances may be scheduled by the user, subject to the limitations of the user's environment (e.g., amount of disk space available for the snapshots to be stored).
Monitoring for a recovery indication may occur during the user-specified time interval. For purposes of the restore using the series of snapshot instances, the nature of or reason for the "recovery" request is irrelevant.
Once it is established that a "recovery" is necessary, a snapback procedure may be implemented. The snapback process may restore one or more pre-update snapshot images. The process of restoring the pre-update snapshot images may be iterative. The smaller the number of updates, the quicker the restore process will complete. A first pre-update snapshot image of the one or more pre-update snapshot images may be restored. The data may then be tested to determine if the problem has been resolved. In the event that the problem still exists, a second pre-update snapshot image may be restored, followed by a second testing of the data to determine if the problem has been resolved. In the event that the problem still exists, the process of restoring a subsequent pre-update snapshot image followed by testing of the data to determine if the problem still exists may be repeated until it is determined by testing that the problem has been resolved.
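As an illustration of the scheduling options just described, the sketch below computes start times for a series of snapshot instances from a first start time, a uniform interval, and an optional count and/or ending time. The function name `snapshot_start_times` and its parameters are hypothetical, the dates are arbitrary, and only the uniform-interval case is shown.

```python
from datetime import datetime, timedelta


def snapshot_start_times(first_start, interval, count=None, end=None):
    """Return start times for snapshot instances at a uniform interval.

    At least one of count (number of instances) or end (latest allowed
    start time) must be given so that the series is finite.
    """
    if count is None and end is None:
        raise ValueError("specify count, end, or both")
    times = []
    current = first_start
    while (count is None or len(times) < count) and (end is None or current <= end):
        times.append(current)
        current += interval
    return times


first = datetime(2000, 1, 1, 9, 0)
by_count = snapshot_start_times(first, timedelta(minutes=10), count=3)
by_end = snapshot_start_times(first, timedelta(minutes=10),
                              end=datetime(2000, 1, 1, 9, 25))
```

A non-uniform or user-defined schedule could be supported by replacing the fixed `interval` with a callable that returns the next wait time; that variation is not shown here.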
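The iterative restore-and-test loop can be sketched as follows. The order in which images are tried is left to the caller, since the text does not fix it; the function `snapback` and the callbacks `restore` and `problem_resolved` are hypothetical names used only for this illustration.

```python
def snapback(images, restore, problem_resolved):
    """Restore pre-update snapshot images one at a time until a test passes.

    images: pre-update snapshot images in the order they should be tried.
    restore: callable that restores one image.
    problem_resolved: callable returning True once the data tests clean.
    Returns the index of the image that resolved the problem, or None.
    """
    for index, image in enumerate(images):
        restore(image)
        if problem_resolved():
            return index
    return None


# Toy data: the state is a single string, and the problem is the "corrupt" prefix.
state = {"data": "corrupt-now"}
images = ["corrupt-earlier", "good-0910", "good-0900"]


def restore(image):
    state["data"] = image


def problem_resolved():
    return not state["data"].startswith("corrupt")


resolved_at = snapback(images, restore, problem_resolved)
```

In this toy run the first restored image still fails the test, so a second image is restored and the loop stops there without touching the third.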