Data replication in relational or hierarchical databases is increasingly important as databases are deployed more and more in distributed environments. The goal of data replication is to maintain one or more copies of a source object in the database across servers, possibly on different platforms and geographical locations. One method of data replication is log-based asynchronous replication. A database log records all changes to the tables in the database. Changes are captured from the database log outside of a commit scope of the original database transaction.
FIG. 1 illustrates components of a conventional asynchronous replication process. The database system includes a plurality of nodes 10–12, each node being a “member” of a replication group, i.e., a table copy is kept at each of these nodes 10–12. At each node 10–12 is a Capture program (“Capture”) and an Apply program (“Apply”). The Capture and the Apply each maintain control tables at the node. Control tables are database tables used to store all replication information persistently. They are read and updated by Capture and Apply. The node at which changes are made is the source node. The node at which the changes are to be replicated is the target node. Message queues 41–43 are the mechanism used for transporting messages between the nodes 10–12.
During the replication process, Capture reads the database log for committed changes at the source node. The database log contains the source table and row that were changed, the type of operation, the column data type information, the data value after the change for insert and update operations, and the data value before the change for delete and update operations. These changes are then formatted into messages and sent through the message queues to the target node. Upon delivery to the message queue, Apply retrieves the messages and applies the changes to the target table. In the illustrated system, changes are allowed to be initiated at any table copy. This type of replication has been variously called “multi-master”, “peer-to-peer”, and “update anywhere” data replication.
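The Capture and Apply flow described above can be illustrated with a minimal sketch. It is offered for illustration only and is not the implementation of any particular product: the names LogRecord, capture, and apply_changes, the dictionary message layout, and the use of an in-memory queue in place of a message queue are all assumptions made for this example.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Optional


@dataclass
class LogRecord:
    """Hypothetical log entry: which row changed, how, and its values."""
    table: str
    key: int
    op: str                  # "insert", "update", or "delete"
    before: Optional[dict]   # value before the change (update, delete)
    after: Optional[dict]    # value after the change (insert, update)
    committed: bool


def capture(log, queue):
    """Format each committed change in the log as a message and enqueue it."""
    for rec in log:
        if rec.committed:
            queue.put({"table": rec.table, "key": rec.key, "op": rec.op,
                       "before": rec.before, "after": rec.after})


def apply_changes(queue, target):
    """Drain the queue and apply each change to the target table copy."""
    while not queue.empty():
        msg = queue.get()
        rows = target.setdefault(msg["table"], {})
        if msg["op"] == "delete":
            rows.pop(msg["key"], None)
        else:
            # Inserts and updates both store the after-image of the row.
            rows[msg["key"]] = msg["after"]


# Usage: the uncommitted change to key 2 is never replicated.
log = [
    LogRecord("T1", 1, "insert", None, {"name": "a"}, True),
    LogRecord("T1", 2, "insert", None, {"name": "b"}, False),
    LogRecord("T1", 1, "update", {"name": "a"}, {"name": "c"}, True),
]
q = Queue()
target = {}
capture(log, q)
apply_changes(q, target)
```

In this sketch the source transaction has already committed by the time capture runs, which mirrors the point that changes are captured outside the commit scope of the original transaction.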
The propagation of changes made to one table copy may be synchronous or asynchronous to the original change. Synchronous propagation makes changes at all table copies as part of the same transaction that initiated the original changes. Synchronous change propagation requires that the database management systems maintaining all or most table copies be active and available at the time of the change. Also, synchronous change propagation introduces substantial messaging and synchronization costs at the time of the original changes. Asynchronous propagation copies the original changes to the other table copies in separate transactions, subsequent to the completion of the transaction initiating the original changes. Thus, asynchronous change propagation is sometimes more desirable due to its savings in overhead costs.
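The contrast between the two propagation modes can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any actual database management system: the TableCopy class, the available flag, and the functions write_sync, write_async, and propagate are hypothetical names introduced here, and a plain list stands in for the queue of pending changes.

```python
class TableCopy:
    """Hypothetical table copy held at one node."""
    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.rows = {}


def write_sync(copies, key, value):
    """Synchronous: every copy changes in the same unit of work, or none do."""
    if not all(c.available for c in copies):
        raise RuntimeError("a table copy is unavailable; change rejected")
    for c in copies:
        c.rows[key] = value


def write_async(origin, others, pending, key, value):
    """Asynchronous: change the origin now, record the change for the others."""
    origin.rows[key] = value
    for c in others:
        pending.append((c, key, value))


def propagate(pending):
    """Apply pending changes in later, separate steps; return what remains."""
    remaining = []
    for copy, key, value in pending:
        if copy.available:
            copy.rows[key] = value
        else:
            remaining.append((copy, key, value))
    return remaining
```

With one copy unavailable, write_sync rejects the change outright, while write_async completes at the origin and leaves the change pending until the other copy can receive it.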
Before asynchronous replication of data can begin, the replication group of member nodes is first initialized. Also, occasionally, new members are to be added to the replication group or an existing member is to be removed from the replication group. The challenge is to provide these functionalities without significantly and adversely affecting performance.
Accordingly, there exists a need for a method and system for member initialization to and deactivation from an asynchronous data replication group in a database system. The method and system should allow new members to be added to the replication group or existing members removed from the replication group, without requiring the halting of the asynchronous replication of data. The performance advantages of asynchronous replication should still be realized during member initialization or deactivation. The present invention addresses such a need.