Organizations use computer databases to store, organize, and analyze some of their most important information. For example, a business may employ a database to warehouse its sales and ordering information so that analysts can predict trends in product sales or perform other kinds of data mining for long-range planning. Because database systems are responsible for managing information vital to the organization's operation, it is crucial for mission-critical database systems to implement mechanisms for recovery following a database system failure.
Recovery from all but the most serious kinds of failures generally relies on periodic backups, which save the state of the database to a longer-term storage medium, such as magnetic tape or optical disk. Because users continue to modify the database after the last backup, the users' committed transactions are recorded in one or more “redo logs” on disk. Thus, to recover from a system crash, the periodic backup is used to restore the database system, and the committed transactions in the redo logs are reapplied to bring the database up to date to its state at the time of the system crash.
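The backup-plus-redo-log recovery described above can be sketched as follows. This is a minimal illustration only: the `RedoRecord` type, the key/value database model, and the sequence numbering are assumptions made for the example and are not taken from any particular database system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical redo record: a committed change that sets key to value."""
    seq: int      # assumed monotonically increasing commit sequence number
    key: str
    value: int


def recover(backup: dict, backup_seq: int, redo_log: list) -> dict:
    """Restore the periodic backup, then reapply later committed changes."""
    db = dict(backup)  # step 1: restore the state saved by the last backup
    for rec in sorted(redo_log, key=lambda r: r.seq):
        if rec.seq > backup_seq:  # step 2: reapply only changes made after the backup
            db[rec.key] = rec.value
    return db


# The backup was taken after commit 1; commits 2 and 3 exist only in the redo log.
backup = {"widgets": 10}
redo_log = [
    RedoRecord(1, "widgets", 10),
    RedoRecord(2, "widgets", 7),
    RedoRecord(3, "gadgets", 4),
]
recovered = recover(backup, backup_seq=1, redo_log=redo_log)
```

If the disk holding `redo_log` is destroyed, only the state in `backup` can be recovered, which is the limitation addressed by a standby database.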
Some failures are more serious, however. For example, a hard disk can be unreadable after a head crash. Earthquakes, fires, floods, tornadoes, and other acts of God can physically destroy the disk upon which the redo logs are saved. In these cases, the modifications and updates to the database after the last backup are permanently lost. Thus, for mission-critical database systems, a more robust approach for disaster recovery is needed. Moreover, restoration of a database from backups and redo logs is a time-consuming process, and some organizations cannot afford the necessary downtime.
Accordingly, there has been much interest in implementing disaster recovery by deploying a “standby” database system that is a replica of the business's primary database system. The standby database is typically created from a backup of the primary database, and the primary database and the standby database coordinate with each other such that the standby database keeps up with changes made on the primary database. In the event of an irrecoverable crash or other disaster, the standby database can quickly be activated to become the business's new primary database without having to wait for restoring the primary database from the last backup and redo logs. To lessen the effects of disaster to the physical premises of the organization's computing equipment, it is desirable to deploy the standby database in another geographical location, such as in another city, state, country, or continent. For example, an earthquake in San Francisco is unlikely to destroy a standby database in Boston. Consequently, the primary database and the standby database typically have to communicate with one another across a network connection. Two approaches have generally been used: a “batch” approach and a “synchronous” approach.
The implementation shown in FIG. 5 illustrates the batch approach for maintaining a standby database. In this approach, a database application 500 is in primary communication with a primary system 501 but can also be in communication, when necessary, with a standby system 503. The primary system 501 and the standby system 503 are in different geographical locales, e.g. San Francisco and Boston, respectively. During normal operation, the database application 500 submits statements to a primary database 510 of the primary system 501. These statements cause the primary database 510 to store or retrieve data in response. When a change is committed to the primary database 510, the primary database 510 creates a redo record that describes the change and invokes a log writer process 511 to save the redo record to disk in one of a number of primary redo logs 513. Meanwhile, in the background, an archiver process 515 inspects the primary redo logs 513 and saves the redo records in primary archive logs 517. For non-disaster crash recovery, changes stored in the primary redo logs 513 and the primary archive logs 517 can be applied to a system backup of the primary database 510 to bring the database up to date as it existed at the time of the system crash. The archiver process 515 also transmits the redo records to the standby system 503. Specifically, a remote file server 531 receives the transmitted redo records and updates the standby archive logs 533. A managed recovery process 535 periodically inspects the standby archive logs 533 and applies the changes to a standby database 530, which is ready to be used by the database application 500 in case of a failure in the primary system 501. In some situations, people find it convenient to deploy the standby database 530 in read-only mode as an independent reporting database.
The batch approach, however, incurs a high risk of data loss in case of a crash of the primary system 501, because the archiver process 515 works in a batch mode to ship the redo records to the standby system 503. When the primary system 501 crashes, the changes in the primary redo logs 513 that have not yet been archived have not been shipped to the standby system 503. As a result, the standby system 503 is unaware of these changes. These changes, accordingly, are unavailable to the database application 500 when it switches over to the standby database 530. Moreover, it is difficult to characterize the amount of the data lost in terms that database owners can best understand. The maximum exposure for loss of data in this approach is usually described in terms of the size of the redo logs, but this information is not helpful for database owners, who would rather know how many orders were lost. Another way to characterize the amount of data lost is by time, for example, “within the last five minutes,” but this also would not tell the database owner how many sales orders were involved.
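The data-loss exposure of the batch approach can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the class name `BatchPrimary`, the `batch_size` parameter, and the rule that records are shipped only when the current redo log fills up are hypothetical stand-ins for the background archiving described above.

```python
class BatchPrimary:
    """Toy primary system that ships redo records to the standby in batches."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size   # assumed: records per redo log before archiving
        self.redo_log = []             # committed on the primary, not yet shipped
        self.standby_archive = []      # records the standby has received

    def commit(self, change: str) -> None:
        # The commit is acknowledged as soon as the record is in the local redo log.
        self.redo_log.append(change)
        if len(self.redo_log) >= self.batch_size:
            self.archive_and_ship()

    def archive_and_ship(self) -> None:
        # Archiver step: copy the filled redo log to the standby, then reuse the log.
        self.standby_archive.extend(self.redo_log)
        self.redo_log.clear()


def simulate_crash(num_commits: int, batch_size: int):
    """Commit num_commits orders, then lose the primary's unshipped redo log."""
    primary = BatchPrimary(batch_size)
    for i in range(num_commits):
        primary.commit(f"order-{i}")
    lost = list(primary.redo_log)      # destroyed along with the primary
    return primary.standby_archive, lost


received, lost = simulate_crash(num_commits=7, batch_size=3)
```

In this run the standby holds six orders and one committed order is lost. The number of lost orders depends on where in the batch the crash happens, which is why a bound expressed in log size or elapsed time does not translate directly into a count of lost transactions.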
By contrast, the synchronous approach is capable of ensuring that the standby database records every committed transaction, i.e. with zero data loss, but at a substantial performance penalty. Referring to FIG. 6, a database application 600 is in primary communication with a primary system 601 and, if necessary, also in communication with a standby system 603. During normal operation, the database application 600 submits statements to a primary database 610 of the primary system 601. These statements cause the primary database 610 to store or retrieve data in response. When a change is committed to the primary database 610, the primary database 610 creates a redo record that describes the change and invokes a log writer process 611 to save the redo record to disk in one of several primary redo logs 613. Meanwhile, a primary archiver process 615 in the background inspects the primary redo logs 613 and saves the redo records in primary archive logs 617 for use in non-disaster recovery procedures. The log writer process 611 also transmits the redo records and transaction commits to the standby system 603. Specifically, a remote file server 631 receives the transmitted redo records and transaction commits and updates the corresponding standby redo logs 633. A standby archiver process 635 also inspects the standby redo logs 633 and saves the redo records in standby archive logs 637. A managed recovery process 639 periodically inspects the standby archive logs 637 and applies the changes to a standby database 630, which is ready to be used by the database application 600 in case of a failure in the primary system 601.
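The zero-data-loss property of the synchronous approach can be sketched as follows. The class and method names (`RemoteFileServer.receive`, `SyncLogWriter.commit`) are hypothetical, and the network call is modeled as an ordinary in-process method call; in a real deployment it would block for a network round trip.

```python
class RemoteFileServer:
    """Toy stand-in for the process that maintains the standby redo logs."""

    def __init__(self):
        self.standby_redo_log = []

    def receive(self, record: str) -> bool:
        self.standby_redo_log.append(record)  # store the record on the standby
        return True                           # signal back that it is stored


class SyncLogWriter:
    """Toy log writer that acknowledges a commit only after the standby has it."""

    def __init__(self, remote: RemoteFileServer):
        self.primary_redo_log = []
        self.remote = remote

    def commit(self, record: str) -> bool:
        self.primary_redo_log.append(record)      # local redo log write
        if not self.remote.receive(record):       # wait for the standby's answer
            raise RuntimeError("standby did not acknowledge the commit")
        return True                               # only now is the commit acknowledged


remote = RemoteFileServer()
writer = SyncLogWriter(remote)
acknowledged = [r for r in ("order-0", "order-1", "order-2") if writer.commit(r)]

# Simulated disaster: the primary's logs are gone, the standby's are intact.
writer.primary_redo_log.clear()
survivors = list(remote.standby_redo_log)
```

Because `commit` returns only after `receive` has stored the record, every acknowledged commit is present in `survivors` regardless of when the primary fails.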
The synchronous approach achieves zero data loss at a substantial performance penalty, because the log writer process 611 does not acknowledge the commit until the remote file server 631 signals back that the transmitted transaction commit has been received, stored, and made available in the standby redo logs 633. Thus, every change acknowledged on the primary database 610 as committed must incur a round-trip network latency between the log writer process 611 and the remote file server 631. This network latency is substantial and degrades performance for every transaction that the primary database 610 commits. By contrast, the performance penalty of the batch approach is less severe: it typically incurs only marginal additional overhead in terms of processing and disk input/output resources.
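A back-of-the-envelope model makes the per-commit cost concrete. The latency figures below are assumed values chosen for illustration, not measurements, and the model ignores the standby's own disk write time and any grouping of commits.

```python
def commit_latency_ms(local_write_ms: float, round_trip_ms: float, synchronous: bool) -> float:
    """Time before a commit is acknowledged, under a simple additive model."""
    # In the synchronous approach the acknowledgment waits for the network round trip.
    return local_write_ms + (round_trip_ms if synchronous else 0.0)


def max_serial_commits_per_second(latency_ms: float) -> float:
    """Upper bound on commits per second for one session committing back to back."""
    return 1000.0 / latency_ms


LOCAL_WRITE_MS = 2.0   # assumed local redo log write time
ROUND_TRIP_MS = 70.0   # assumed round trip between distant sites

batch_ms = commit_latency_ms(LOCAL_WRITE_MS, ROUND_TRIP_MS, synchronous=False)
sync_ms = commit_latency_ms(LOCAL_WRITE_MS, ROUND_TRIP_MS, synchronous=True)

print(f"batch: {batch_ms} ms per commit, {max_serial_commits_per_second(batch_ms):.0f} commits/s")
print(f"sync:  {sync_ms} ms per commit, {max_serial_commits_per_second(sync_ms):.1f} commits/s")
```

Under these assumed numbers, a single session that commits serially drops from 500 commits per second to fewer than 14 when each commit must wait for the standby.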
Therefore, there is a need for a disaster recovery methodology that improves data availability over the batch approach, while providing better performance than that of the synchronous, zero data loss approach.