Replication is the process of sharing database objects and changes to those objects among multiple databases. As examples, the changes may be data manipulation changes (“DML changes”) or data definition changes (“DDL changes”) made by a database server. The changes might also be made through procedures. A typical log-based replication system involves at least two databases—a source database and a destination database, and has at least three components—a capture process to read changes from log files at a source database, a propagation process to send changes to a destination database, and an apply process to consume the changes at the destination database.
The term “capture” or “capture events” is used herein to refer to selecting events (e.g., changes) and adding messages to a staging area for subsequent processing. A capture process may scan a redo log for DML or DDL changes and stage them into a queue for subsequent propagation to other database servers. The queued messages are captured and propagated to other database servers according to rules provided by the user. Instead of individual DML/DDL changes, the capture process might also send, to the destination, the invocation of a procedure.
The phrase “propagate a message” is used herein to refer to the activity of distributing messages generated by a capture process to other nodes for subsequent processing. The other node may be another database server, but in some cases could be the same database server. Message propagation may entail distributing messages from a staging area for one node to another staging area for subsequent processing at another node. A propagation process propagates messages. Propagating an event refers to propagating a message about the event.
FIG. 1 is a block diagram depicting an example information sharing environment used to propagate database changes among a network of database servers. Referring to FIG. 1, it shows messaging system 101. Messaging system 101 includes three nodes, database server New York, database server Pittsburgh, and database server Chicago. Database server New York has a table sales. DML and DDL changes (e.g., updates, deletes, and inserts) made by database server New York are propagated via messaging system 101 to database servers Pittsburgh and Chicago so that changes may be reflected in their respective tables sales (not shown).
Database servers Pittsburgh and Chicago do not receive all DDL and DML changes made to table sales in database server New York. The particular changes propagated to database servers Pittsburgh and Chicago depend on the value of a column city (not shown). Changes to rows where the value of city equals ‘Pittsburgh’ are propagated to database server Pittsburgh; changes to rows where the value of city equals ‘Chicago’ are propagated to database server Chicago.
A capture process 103 captures changes to table sales at database server New York and, for each change, adds a message to message queue 105. Capture process 103 captures the changes by scanning a redo log (not shown) maintained by database server New York. A redo log contains records that specify changes to rows in tables maintained by database server New York. Capture process 103 scans the redo log for records specifying changes to rows in table sales, adding a message to message queue 105 for changes to a particular row.
Propagate process 107 propagates messages queued in message queue 105 to database server Pittsburgh and propagate process 113 propagates messages queued in message queue 105 to database server Chicago. Messages reflecting changes to a row having city value equal to ‘Pittsburgh’ are propagated to database server Pittsburgh. Messages reflecting changes to a row having a city value equal to ‘Chicago’ are propagated to database server Chicago. For simplicity, the condition in this example does not list details to handle row subsetting (or row migration) or assumes no row migration.
Apply processes 155 and 156 apply the propagated changes in the servers in Pittsburgh and Chicago. The term apply process refers to a process for reading messages in a staging area, such as a queue message queue 223, and causing the database server to take some action to reflect the event described by a message. For example, a message in message queue 223 that reflects a DML change made to a table in database server New York is applied to database server Chicago by making a corresponding change to a corresponding table in database server New York. Applying an event or change refers to applying a message about the event. Processes that capture, propagate, or apply events are referred to herein as message flow processes or flow processes.
Rules are used to decide what changes to capture and what messages to propagate. To determine what changes to what particular rows are captured, capture process 103 executes capture rules 109. The condition for a rule in capture rules 109 is expressed as a SQL expression that can reference a variable specifying the “source table” affected by a change or a column of the table. The following predicate expression is used in a rule to capture changes to source table sales.
source_table=“sales”
Propagate processes 107 and 113 evaluate propagate rules 111 and 115, respectively, to determine how to propagate a message from message queue 105. A rule condition that specifies to propagate to database server Pittsburgh messages reflecting changes to a row with city value equal to ‘Pittsburgh’ may be expressed using the following predicate expression.
source_table=“sales” AND
city=“Pittsburgh”
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.