1. Field of the Invention
The present invention relates generally to database management systems and more particularly to a method of configuring a database management system for automatic failover from a primary database server to a standby database server and subsequent recovery of the failed primary database server.
2. Description of Related Art
As governments and businesses store increasing amounts of data in database systems, demand grows for that data to be always available, even in the face of catastrophic failure of computer hardware, network outage, or disastrous data corruption. To meet these requirements, database system engineers have developed a number of features that replicate database data across a number of different computer systems. Once data is replicated from one database system to another, if the first database system fails or otherwise becomes unavailable, the second database system is used for processing database requests. The process of switching from an unavailable first database system to a second database system is commonly known as failover. Replication features such as those just described are available under the name Oracle Data Guard in relational database systems manufactured by Oracle Corporation of Redwood City, Calif.
FIG. 1 shows a database system that uses Data Guard to replicate data to multiple standby databases across a network. Replicated database system 101 contains primary database 103 and two standby databases 113 and 121. Primary database 103 contains database information including database tables and meta-data. Updates made to the primary database 103 are transmitted via network 105 to replication system 108, which replicates the updates in database 113, and/or to replication system 110, which replicates the updates in database 121. In both replication systems, the updates are transmitted via network 105 in the form of redo-data 107. The redo-data is then stored in archived redo log files 109, which are files that contain redo-data records. Redo-data records record data that the database system can use to reconstruct all changes made to the primary database 103, including changes that have not yet been committed (made permanent). For example, if a balance value in a bank_balance table changes, the database system generates a redo-data record containing a change vector that describes the change to the database. When the redo-data is used to recover the database system, the database system reads the change vectors in the redo-data records and applies the changes recorded in the vectors to the database.
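The relationship between redo-data records and change vectors described above can be illustrated with a minimal sketch. It is not the actual redo format used by any database system; the class names, the `sequence` ordering field, and the dictionary-based database are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeVector:
    """One recorded change to a single column of a single row (hypothetical layout)."""
    table: str
    row_id: int
    column: str
    new_value: object


@dataclass(frozen=True)
class RedoRecord:
    """A redo-data record holding one or more change vectors."""
    sequence: int   # assumed ordering field, not taken from the source text
    vectors: tuple


def apply_redo(database, records):
    """Replay the change vectors of each record, in sequence order, against a database."""
    for record in sorted(records, key=lambda r: r.sequence):
        for v in record.vectors:
            row = database.setdefault(v.table, {}).setdefault(v.row_id, {})
            row[v.column] = v.new_value
    return database


# Two balance changes generated on the primary, received out of order.
primary_redo = [
    RedoRecord(sequence=2, vectors=(ChangeVector("bank_balance", 7, "balance", 150),)),
    RedoRecord(sequence=1, vectors=(ChangeVector("bank_balance", 7, "balance", 100),)),
]

standby = apply_redo({}, primary_redo)
print(standby)  # {'bank_balance': {7: {'balance': 150}}}
```

Replaying the records in order leaves the standby copy with the most recent balance, which is the sense in which a standby can reconstruct the primary's changes from redo-data alone.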
In replication system 108, redo log files 109(i) are applied at 111 against physical standby database 113. Physical standby database 113 provides a physically identical copy of primary database 103, with on-disk database structures that are identical to the primary database 103 on a block-for-block basis. The database schema, including indexes therein, is the same. A physical standby database 113 is said to be synchronized with the primary database when all of the redo data produced by the primary database has been received in replication system 108.
In replication system 110, redo log files 109(ii) are applied against logical standby database 121. Logical standby database 121 contains the same logical information as the primary database 103, although the physical organization and structure of the data can be different.
An Oracle database system 101 using Data Guard can be run in three distinct protection modes:

    Maximum protection
    This mode offers the highest level of data protection. Redo-data 107 is synchronously transmitted (SYNC) to standby database system 108 or 110 from the primary database 103, and transactions are not committed on primary database 103 unless redo-data 107 is available to at least one standby database 113 or 121 configured in this mode. If the last standby database system configured in this mode becomes unavailable, processing stops on primary database 103. This mode guarantees no data loss because the primary database 103 and standby database 113 or 121 are, and remain, synchronized with each other with respect to the redo-data that is available to each.

    Maximum availability
    This mode is similar to the maximum protection mode, including the guarantee of no data loss at least so long as primary database 103 and standby database 113 or 121 remain synchronized with each other with respect to the redo-data that is available to each. However, if standby database system 108 or 110 becomes unavailable (for example, due to network connectivity problems), processing continues on primary database 103. The primary and that standby are then no longer synchronized with each other: the primary has generated redo-data that is not yet available to the standby. When the fault is corrected, standby database 113 or 121 is resynchronized with primary database 103. If there is a need to fail over before the standby database is resynchronized, some data may be lost.

    Maximum performance
    This mode offers slightly less data protection to primary database 103, but higher potential performance for the primary than does maximum availability mode. In this mode, as primary database 103 processes transactions, redo-data 107 is asynchronously transmitted (ASYNC) to standby database system 108 or 110. The commit operation on primary database 103 does not wait for standby database system 108 or 110 to acknowledge receipt of redo-data 107 before completing write operations on primary database 103. If any standby destination 113 or 121 becomes unavailable, processing continues unabated on primary database 103. There is little impact on primary database 103 performance due either to the overhead of asynchronously transmitting redo-data or to the loss of the standby.
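The commit behavior of the three protection modes can be summarized in a small decision function. This is a simplified sketch under stated assumptions: it models a single standby, treats "reachable" as a boolean, and uses hypothetical names (`ProtectionMode`, `commit_outcome`) that do not correspond to any Data Guard interface.

```python
from enum import Enum


class ProtectionMode(Enum):
    MAXIMUM_PROTECTION = "maximum protection"
    MAXIMUM_AVAILABILITY = "maximum availability"
    MAXIMUM_PERFORMANCE = "maximum performance"


def commit_outcome(mode, standby_reachable):
    """Return (primary_commits, redo_guaranteed_on_standby) for one transaction.

    Assumes a single standby; both names are illustrative only.
    """
    if mode is ProtectionMode.MAXIMUM_PERFORMANCE:
        # ASYNC transport: the commit never waits for the standby,
        # so nothing is guaranteed about the standby at commit time.
        return (True, False)
    if standby_reachable:
        # SYNC transport: the commit completes once the redo-data
        # is available to the standby.
        return (True, True)
    if mode is ProtectionMode.MAXIMUM_PROTECTION:
        # Last standby unavailable: processing stops on the primary.
        return (False, False)
    # Maximum availability: the primary continues unsynchronized.
    return (True, False)


for mode in ProtectionMode:
    for reachable in (True, False):
        print(mode.value, reachable, commit_outcome(mode, reachable))
```

The one case in which a committed transaction is not guaranteed to be on the standby under maximum availability, namely a commit made while the standby is unreachable, is the case in which a failover may lose data.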
In Oracle Data Guard, automatic failover is termed Fast-Start Failover or FSFO. Configuring a replicated database system 101 for Fast-Start Failover requires that the database administrator perform a series of discrete steps:

    1. Upgrade the protection mode of the database configuration to be maximum availability;
    2. Configure flash recovery areas for all databases in the configuration;
    3. Enable flashback logging on all databases in the configuration;
    4. Create standby redo log files for all databases in the configuration;
    5. Change the log transport mode of the failover target standby database to be synchronous (SYNC);
    6. Restart the primary, the standby, or both databases;
    7. Enable Fast-Start Failover in the Data Guard configuration;
    8. Configure Oracle Net for database communications;
    9. Set the Fast-Start Failover Threshold value;
    10. Start the Fast-Start Failover observer process.
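The first five steps listed above amount to prerequisites that must hold before Fast-Start Failover is enabled. The following sketch shows how such prerequisites might be checked against a description of the configuration. It does not call any Oracle interface; the `DatabaseState` and `Configuration` classes, their fields, and the example database names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatabaseState:
    """Hypothetical per-database settings relevant to steps 2 through 4."""
    name: str
    flash_recovery_area: bool = False
    flashback_logging: bool = False
    standby_redo_logs: bool = False


@dataclass
class Configuration:
    """Hypothetical description of a replicated database configuration."""
    protection_mode: str
    target_transport_mode: str
    databases: list


def unmet_prerequisites(config):
    """List which of steps 1 through 5 have not yet been carried out."""
    problems = []
    if config.protection_mode != "maximum availability":
        problems.append("step 1: protection mode is not maximum availability")
    for db in config.databases:
        if not db.flash_recovery_area:
            problems.append(f"step 2: no flash recovery area on {db.name}")
        if not db.flashback_logging:
            problems.append(f"step 3: flashback logging disabled on {db.name}")
        if not db.standby_redo_logs:
            problems.append(f"step 4: no standby redo log files on {db.name}")
    if config.target_transport_mode != "SYNC":
        problems.append("step 5: failover target transport mode is not SYNC")
    return problems


config = Configuration(
    protection_mode="maximum performance",
    target_transport_mode="ASYNC",
    databases=[
        DatabaseState("primary_db", True, True, True),
        DatabaseState("standby_db", True, False, True),
    ],
)

for problem in unmet_prerequisites(config):
    print(problem)
```

Even in this reduced form, the number of per-database settings that must agree illustrates why carrying out the steps by hand is laborious.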
The steps to configure a database configuration for automatic failover are error-prone and time-consuming, and they require manipulation of the databases by hand using SQL*Plus or other programmatic interfaces. What is needed is an easy technique for configuring a database system with replicated data in a plurality of standby databases for automatic failover. It is an object of the invention to provide such a technique.