Unlike a regular application that saves its contents in a file, a database is a structured logical data store that usually stores its data set in multiple files or storage volumes. A database may be a relational database, an object-oriented database, an email server (such as Microsoft Exchange), or a content management database. In a typical database server, there may be one or multiple databases; in some cases, some databases are for storing configuration information and others for user data.
In a relational database server, there are “system tables” and “user defined tables.” System tables contain configuration data and user defined schemas, whereas user defined tables typically contain user and business data. For example, a customer database would contain customer data, while the structure of the database (e.g., name, phone number, etc.) is stored in the system tables in the form of schemas. In some database servers, all tables are stored in one database, while other database servers store system tables and user tables in multiple databases.
Although each database has its own files or storage volumes, the databases served by one database server as a group may be interrelated. For example, if a user table schema is lost from the system tables, the user data would not be accessible from the user defined tables. Therefore, the success of a data protection (i.e. backup and recovery) solution in recovering a database or a database server lies in its ability to reconstruct one or more related data sets to a consistent state when all the related files and devices are in synchrony at that recovery time point. When the data state of a database or a related set of databases of a server is not in synchrony, the data sets would be incorrect or corrupted and may become inaccessible by the database server (and thus inaccessible by the associated applications and users).
A database may store its data in a file system or directly into raw storage volumes. These data include the binary data (e.g., tables for relational databases) and a transaction log (write-ahead log). A database may store its transaction log in the same device or file system with its binary data, or it may store its log in different storage devices or different file systems. The files of all databases that belong to the same database server may reside in the same or different file system or device. As mentioned, a database data set can be stored either in files or in storage volumes, for simplicity, the following sections refer to database store as files.
Conventional data protection solutions for backing up and restoring a database server include scheduled tape based archiving and volume based snapshots. Although in most cases these solutions provide consistent recovery, they cannot eliminate or minimize data loss when failure occurs. They also do not address application downtime during data recovery. Over the last several years, disk/block base real-time journaling solution has emerged, which attempts to minimize data lost. A block-based real-time journaling solution is commonly known as Continuous Data Protection (CDP). Although addressing data loss, CDP cannot guarantee recovery consistency and integrity.
Currently, tape archive database protection solutions usually are scheduled to run once or twice a day while a database server is either shutdown or in quiescent mode (hot backup mode) during which the databases of the server is in consistent state.
Usually a database vendor provides a specific application programming interface (API) for the archive solution vendors. These API are designed for the following purposes—1) to enter hot backup mode (unless cold backup is carried out—i.e. server shutdown); 2) to retrieve full copy of binary data and log files; 3) to retrieve changes to binary data of a database after the last archive (for incremental backup); 4) to retrieve log changes of a specific database after the last archive; 5) to restore a database from the backup media; and 6) to apply incremental backup to a restoring database.
With the above API, tape backup vendors provide tape archive solutions. A database administrator may use a tape archive solution to perform a full archive once a week, and then perform daily incremental backup. The database administrator may also use a script to archive log files in hourly bases (or once several hours).
When recovery is necessary, the tape archive solutions provide a user interface for a database administrator (DBA) to restore the databases of a server from a full backup tape set. After that, the DBA may have to apply incremental backup tape set manually. The final step is to roll-forward the archived log to bring the database server to a point-in-time closest to the failure point.
There are many well-known problems with using tape archive solutions for backing up and recovering a database server. First, databases must be copied in full and then in change incremental from the host server to the tape media upon schedule. During copying this large amount of data, the database server is either shutdown or in a much degraded mode for a long time period. Second, data is at risk in between backups. Third, recovery is manual, in many cases, incremental recovery is necessary. The log roll-forward is also manual. The entire restoring process may cost hours if not days of downtime depending on the data set size and the scale of incremental recovery and log roll-forward. In addition, data retrieval from tape media is slow, unreliable, and error prone. In some cases, recovery fails because of tape mishandling or media failure.
Volume snapshot solutions are designed to address the shrinking tape backup window (copying of data to tape with a limited backup window), and to eliminate manual incremental recovery because incremental backup is unnecessary.
There are many volume snapshot techniques—such as split mirror, copy-on-write, and so on. These techniques offer different resource usage optimization and snapshot efficiency. The objective of all snapshot techniques is the same, which is to create a point-in-time storage device image. Unlike tape archive solutions, a snapshot usually can be taken with much shorter time, some volume snapshot techniques can complete a snapshot within seconds. To generate consistent snapshots, most snapshot services involve a quiesce of a database server (so that the server first flushes out its updates to the storage from system memory to make a persistent copy of its databases consistent), and then freezing of I/O activities during which a volume snapshot can be taken. When a snapshot is taken without quiesce of a database server (i.e., a hot snapshot), the disk images may not be consistent and may not be usable for recovery. In particular, the image may include fracture blocks, which cannot be used for recovery.
Because database files may be stored in multiple storage volumes, a consistent database snapshot requires that all volumes be captured at the same moment (i.e. consistency group); otherwise, the snapshot would not be useful for recovery. Some snapshot solutions are not capable of capturing consistency group.
During recovery, if a snapshot image is consistent and non-corrupted, the snapshot image can be copied to a recovering server and the database server can be started from that recovered data. After that, if there are archived logs, the log files can be used to roll-forward the binary files manually. If a hot snapshot is taken, the database administrator must first copy the databases into a recovering server, repair the databases and perform verification test. If the repair is successful, the database server can roll-forward the binary files using archive log files. In case the image is not repairable, another point-in-time snapshot must be selected, and the entire copy, repair, and verification process must be repeated.
Because snapshot is still schedule-based, in between snapshot intervals, there are still risks of losing production data. Although a snapshot can be taken very quickly, there is still requirement to quiesce a database server to get consistent snapshot image, therefore there is still server performance impact. When hot snapshot is taken, server performance impact is eliminated, however, that results in potential inconsistent snapshot images and increases recovery downtime as data repair and verification may be needed. With a volume snapshot approach, the entire recovery process is still manual; a database administrator must know which snapshot volume has what data and reconstruct the database storage volumes manually. This gets harder when database binary data and transaction logs are stored in distributed volumes. The entire recovery process may takes hours or days, and it involves identifying the snapshot volume image set, copy back the snapshot volumes into the right storage devices, repairing and verifying the integrity of the database, bring up database server, and manually roll-forward the necessary archived log.
A traditional real-time protection approach known as continuous data protection (CDP) is used to capture all the storage block changes of the primary storage devices that store the databases of a database server. The primary goal of this approach is to eliminate data loss causes by scheduled backup.
This approach involves first making a full image of the storage volumes that belong to a database server. After that, it continuously records all the block changes (persistent storage updates) as it occurs during runtime. Periodically, another full image may be taken. The storage for recording the block journal is known as time-addressable storage. Because the block updates are recorded continuously as they occur, one can reconstruct a storage volume from the time-addressable storage to any point-in-time by applying block changes to an initial full volume image. The reconstructed volume, however, is similar to a volume image taken from hot snapshot (see above) as the database server is most likely not in quiescent state; therefore, the database is most likely not in consistent state, and the data may be corrupted if fractured blocks are captured at that point-in-time.
Some CDP allows a database administrator to save a marker on the time-addressable storage, in which case a database administrator can quiese a database server and put a tag on the CDP storage to indicate that at that specific moment in time the volumes are in consistent state and the data is non-corrupted. The point-in-time tagging of the storage images is no different than a consistent snapshot. If only these tagged images are used for recovery, the benefit of performing real-time protection is not realized.
When recovery is needed for a specific point-in-time, a full image volume set from the time-addressable storage that is closest to the recovery point can be copied to the appropriate primary storage volumes of the database server. After that, the necessary recorded block journal can be applied to the recovering volumes to bring its state to a point-in-time.
Because the continuous journal of the primary storage results in continuous hot snapshot images being stored, one cannot guarantee that a recovered database server from the time-addressable storage is non-corrupted. Therefore, recovery could be a process of trial and error—first a database server data image is reconstructed from the time-addressable storage; second a database integrity verification process would determine if the database can be repaired. If the recovered database volumes are found to be corrupted, another point-in-time must be picked and the process starts again. The entire recovery process is no better then recovering a database from snapshots, in fact, in many cases, it could be worst. When logs and binary data are stored in multiple distributed volumes, the volume reconstruction process must be manually managed to ensure that all volumes are recovered correctly. When consistency tag is applied on the time-addressable storage, then the recovery result is no different then taking periodical consistent snapshot. In addition, time-addressable storage requires huge amount of storage space to continuously track the block changes.
The conventional approaches described above are not able to reduce data loss to near zero; indeed, even when CDP attempts to record block journal continuously, data loss may still occurs when a consistent database server cannot be reconstructed. IN attempting to reduce data loss, CDP however greatly increases backup storage cost and still cannot guarantee recovery consistency and integrity. All the conventional solutions require intensive manual handling of tape media, volume images, and archived logs. In addition, the recovery process is manual and error prone.
In addition, both snapshot and CDP manage data in the block level (storage device level) without any meta-information; these solutions do not know when a new volume is added to a database. As a result, managing these solutions requires manual work. Further, because these solutions do not have knowledge of what is inside the blocks, and because they are not capable of providing a backup catalog (user interface), granular recovery (e.g., database files, table space, transaction, etc.) is not possible.