1. Technical Field
This invention relates to a method and system for reintegrating a failed primary instance into a log shipping data replication system. More specifically, the invention relates to preserving consistency between the failed primary instance and a second instance in conjunction with the reintegration.
2. Description of the Prior Art
Large modern database management systems commonly manage terabytes of data on systems containing hundreds of CPUs and thousands of disk spindles. A history of changes made to the database is stored in a database log. In general, database recovery techniques reset a system, or the data in a system, to an operable state following damage, and provide a process for rebuilding databases by restoring a backup copy and rolling forward the logs associated with that backup copy.
In log shipping data replication, a primary instance of a database transfers copies of its log records to a secondary database instance where the logged operations are replayed. The secondary database instance is typically unavailable for update during normal operations, but is available to become the new primary instance in case of a failure of the original primary instance. After a primary instance fails, it may be successfully restarted at a later time. However, if during the time interval from when the original primary instance fails until it is later restarted the secondary becomes the new primary, the two copies of the database need to be synchronized to avoid maintaining two separate yet inconsistent copies of the database.
FIG. 1 is a block diagram (10) conceptually depicting a primary data processing system (30) operational with a standby data processing system (50). The standby data processing system (50) is initialized from a full copy of the primary data processing system (30). The primary data processing system is hereinafter referred to as a primary instance. Similarly, the standby data processing system is hereinafter referred to as a secondary instance. There may be multiple standby instances, but for illustrative purposes only one is shown. The standby system (50) and associated databases can also be considered secondary instances, and the terms can be used interchangeably. The primary instance (30) includes a database agent (32) for performing updates to database pages in memory (not shown). Updates to database pages remain in the database page cache (34) until a record of the update is written to disk (38) in a local copy of the database log file by a page cleaner (36). The log file (not shown) is a record of all changes made to the primary instance (30). Log records are first created and accumulated in the memory pages of log buffers (40). The log records are periodically written to disk (44) by a log writer (42). When a log shipping data replication subsystem is deployed, log pages are periodically accessed, either from the log buffers in memory (40) or by reading the pages into memory from the database log disk (44), and sent over a communication network (48) to a standby instance (50). The log data is sent by a log shipping agent (46), or by a log writer (not shown) of the primary instance (30), to a log receiving agent (52) of the standby instance (50). The log receiving agent (52) accepts log data from the communication link (48) and stores received log data in memory pages of a log buffer (54) in the memory of the standby instance (50). Periodically, the received log pages are written to disk (58) by a log writer (56).
In addition, a log replay agent (60) is provided to access the log pages from the log buffers (54) and apply the effects of the changes reflected in the log records to the copy of the database in the standby instance (50). The log replay agent (60) may replay the log records directly to disk (66), or it may replay the log pages by updating data pages in the database page cache (62) which is periodically written to a database disk (66) via the page cleaner (64). By continuously shipping log records from the primary instance (30) to the standby instance (50), and replaying the log records against the copy of the standby instance (50), the standby instance maintains a replica of the primary instance (30).
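The shipping and replay cycle described above can be illustrated with a minimal sketch. It is not the system of FIG. 1: the names `LogRecord`, `Instance`, `ship`, and the integer `lsn` field are hypothetical, and in-memory lists and dictionaries stand in for the log disks, log buffers, and database pages.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    lsn: int    # hypothetical log sequence number (position in the log)
    key: str
    value: str


@dataclass
class Instance:
    log: list = field(default_factory=list)   # stands in for the database log
    data: dict = field(default_factory=dict)  # stands in for database pages
    replayed: int = 0                         # count of log records applied so far

    def update(self, key, value):
        """Primary side: record the change in the log, then apply it."""
        record = LogRecord(lsn=len(self.log) + 1, key=key, value=value)
        self.log.append(record)
        self.data[key] = value
        self.replayed = len(self.log)
        return record

    def receive(self, records):
        """Standby side: store shipped log records."""
        self.log.extend(records)

    def replay(self):
        """Standby side: apply received records that have not been replayed yet."""
        for record in self.log[self.replayed:]:
            self.data[record.key] = record.value
        self.replayed = len(self.log)


def ship(primary, standby):
    """Send every log record that the standby does not yet hold."""
    standby.receive(primary.log[len(standby.log):])


primary, standby = Instance(), Instance()
primary.update("a", "1")
primary.update("b", "2")
ship(primary, standby)
standby.replay()
```

After the final `replay()` call the standby holds the same log and the same data as the primary, which is the replica property the paragraph above describes.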
There are three possible scenarios representing how much of what constitutes the database log is present at a primary instance and at a standby instance at the point of failure of the primary instance. The primary instance can have less, the same amount, or more log data than the standby instance. Since log data is typically written first to the primary instance and then shipped to the standby instance, it is most common for the primary instance to have more or an equal amount of data compared to the standby instance.
In the cases where the primary and standby instances have an equal amount of log data, or the primary instance has less log data than the standby instance, no special treatment of the primary instance is necessary to make it consistent with the standby instance if the original primary instance is restarted after a failover. When the standby takes over control of the database, it becomes the new primary instance. The new primary instance processes all of the received log data and begins new database operations. New log data is generated starting at the point immediately after the last log data received prior to the failure of the old primary instance. If the old primary instance is repaired and restarted, it can rejoin the log shipping data replication scheme by taking on the role of the new standby instance and having the new primary instance start shipping log data to it, beginning from the next log position after its current end of log. Accordingly, either of these scenarios yields two copies of the database having identical logs and, as the log records are replayed, substantially identical databases.
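The three scenarios, and the position from which shipping resumes in the two unproblematic ones, can be sketched as follows. This is an illustration only; the integer end-of-log positions and the function names `classify` and `resume_position` are assumptions made for the example.

```python
def classify(old_primary_end, takeover_end):
    """Compare the old primary's end of log with the last log position the
    standby had received when it took over (both hypothetical integers)."""
    if old_primary_end < takeover_end:
        return "primary has less"
    if old_primary_end == takeover_end:
        return "equal"
    return "primary has more"


def resume_position(old_primary_end, takeover_end):
    """Return the log position from which the new primary would start
    shipping to the restarted old primary, or None when the old primary
    holds extra log data and cannot simply rejoin."""
    if old_primary_end <= takeover_end:
        return old_primary_end + 1
    return None
```

For example, with a takeover position of 10, an old primary whose log ends at 8 would resume receiving at position 9, while one whose log ends at 12 falls into the third scenario and gets no resume position.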
In the case where the old primary instance has a greater amount of log data than the new primary instance, maintaining consistency between the two instances becomes problematic. For example, it is likely that the new primary started processing transactions even though some log data from the old primary never reached the new primary. Accordingly, this scenario yields two copies of the database whose logs may differ and, if the log records are all applied to each instance, inconsistent copies of the database.
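A small worked example may make the divergence concrete. The log contents, the positions, and the helper `first_divergence` below are invented for illustration: positions 1 through 3 were shipped before the failure, the old primary wrote positions 4 and 5 locally only, and the new primary generated its own record at position 4 after takeover.

```python
# Old primary: positions 4 and 5 were never shipped before the failure.
old_primary_log = {
    1: "insert A",
    2: "insert B",
    3: "update A",
    4: "delete B",
    5: "insert C",
}

# New primary: took over at position 3 and logged new work from position 4.
new_primary_log = {
    1: "insert A",
    2: "insert B",
    3: "update A",
    4: "insert D",
}


def first_divergence(log_a, log_b):
    """Return the lowest log position at which both logs hold a record
    but the records differ, or None if no such position exists."""
    for position in sorted(set(log_a) & set(log_b)):
        if log_a[position] != log_b[position]:
            return position
    return None


divergence = first_divergence(old_primary_log, new_primary_log)
```

Here the two logs agree through position 3 and hold different records at position 4, so replaying each log in full would produce two different databases.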
There are two known methods for consistently resynchronizing a previously failed primary instance of a database system with a new primary instance. The first method restores a copy of the failed primary instance and then applies successive log data according to the log stream of the new primary instance. The successive log data may be applied by way of a rollforward recovery operation, by reversing the direction of log shipping data replication, or by a combination thereof. Restoring and recovering the database using the new primary instance's version of the log history removes any inconsistencies between the old primary instance and the new primary instance. However, the first method is not always desirable, for example in a large database installation, because the restore and recovery require a significant investment of time and operational resources. The second method captures the position of the last complete log record received by the new primary instance at the time of takeover from the failed primary instance. When the failed primary instance restarts as a new secondary instance, an “undo” procedure is performed for all log records found in the log of the failed primary instance after the position of the last record that was successfully transferred to the new primary instance prior to the failover. However, this method assumes that all operations can be processed through the undo procedure using the normal recovery process of a database system. It is in general very complicated, and in some database management systems impossible, to implement the full range of undo processing necessary to rectify all cases of inconsistency between a primary instance and a standby instance.
For example, it is common practice in database management systems to perform a variety of “not undoable” operations, such as recycling the disk space previously used by deleted database objects, in conjunction with the irrevocable completion of a database transaction that necessitates such an operation. Once such an operation takes place, there is commonly no means in a database management system to accurately and reliably undo its effects. Accordingly, there is a need for reintegrating a failed primary instance as a new secondary instance without performing restore or undo operations.
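The limitation of the undo-based method can be sketched as follows. The record layout, the `undoable` flag, and the function names are hypothetical and do not describe any particular database management system; the sketch only selects the records that would have to be undone and checks whether every one of them can be.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    position: int
    operation: str
    undoable: bool = True  # assumed flag; False models a "not undoable" operation


def records_to_undo(log, takeover_position):
    """Records written after the takeover position, newest first,
    which is the order in which an undo pass would process them."""
    extra = [r for r in log if r.position > takeover_position]
    return sorted(extra, key=lambda r: r.position, reverse=True)


def can_resync_by_undo(log, takeover_position):
    """True only if every record past the takeover position can be undone."""
    return all(r.undoable for r in records_to_undo(log, takeover_position))


failed_primary_log = [
    LogRecord(1, "insert A"),
    LogRecord(2, "drop table T"),
    LogRecord(3, "update A"),
    LogRecord(4, "recycle disk space of T", undoable=False),
]
```

With a takeover position of 2, records 4 and 3 would need to be undone, and because record 4 is not undoable the check fails. With a takeover position of 4 there is nothing to undo and the check passes.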
There is therefore a need for a method and system that supports safely truncating log records on a failed primary instance prior to reintegrating the failed primary instance with a new primary instance. The method and system should support determining when it is safe to perform a log truncation, and when the truncation can be performed for a portion of the extra log data on the failed primary instance.