1. Field of the Invention
The present invention relates generally to databases and, more specifically, to data replication.
2. Description of the Background Art
Data replication involves the copying and synchronization of data from a source database to one or more target databases. At present, data replication is commonly accomplished using log-based technology. With log-based replication, changes to the source database are captured in a transaction log. A replication process reads the transaction log and propagates the changes to the target databases.
The data in a transaction log does not depend on the execution context. Instead, it identifies which rows in which tables of a database have changed, and provides the new data contained in those rows. The replication process is therefore able to instruct the target databases on the specific changes they need to make to their local copies of the changed tables in order to have an exact copy of the source data.
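By way of illustration only, the log-based approach described above may be sketched as follows. The sketch assumes a simplified in-memory model in which each database is a mapping from table names to rows keyed by an integer. The names `LogRecord`, `apply_record`, and `replicate` are hypothetical and do not correspond to any particular replication product.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Hypothetical in-memory model: table name -> primary key -> row data.
Database = Dict[str, Dict[int, Dict[str, Any]]]


@dataclass
class LogRecord:
    """One context-free log entry: which row of which table changed, and how."""
    table: str
    op: str                                # "insert", "update", or "delete"
    key: int                               # key of the affected row
    row: Optional[Dict[str, Any]] = None   # new row data (None for a delete)


def apply_record(target: Database, rec: LogRecord) -> None:
    """Apply a single logged row change to one target database."""
    rows = target.setdefault(rec.table, {})
    if rec.op == "delete":
        rows.pop(rec.key, None)
    elif rec.op in ("insert", "update"):
        rows[rec.key] = dict(rec.row or {})
    else:
        raise ValueError(f"unknown operation: {rec.op!r}")


def replicate(log: List[LogRecord], targets: List[Database]) -> None:
    """Read the log in order and propagate every change to every target."""
    for rec in log:
        for target in targets:
            apply_record(target, rec)


if __name__ == "__main__":
    log = [
        LogRecord("orders", "insert", 1, {"status": "new"}),
        LogRecord("orders", "update", 1, {"status": "shipped"}),
    ]
    replica: Database = {}
    replicate(log, [replica])
    print(replica)
```

Because each record in this sketch carries the affected table, key, and new row data, a target can apply it without knowing which statement produced the change on the source.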
Log-based replication works well, in terms of latency, when transactions are small and each change involves few data rows. However, certain operations result in large transactions that change many data rows. For example, batch jobs or scheduled operations run against the source database will often contain queries that change thousands of data rows. Using log-based replication, each of these thousands of row changes must be applied to the target databases one at a time. The replication process is required to read every atomic operation and forward it, with its data, to the target server, resulting in a costly log scanning effort and high network usage. Moreover, this severely impacts the performance of the target databases as they work to apply the individual changes. Rather than executing the single statement that was run on the source database, the target database runs one statement for each affected row. As a result, the target database may suffer from asymmetric resource loading, bearing a load out of proportion to that incurred by the source database.
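The amplification described above may be illustrated, purely as a sketch, using Python's standard `sqlite3` module. The `accounts` table, the helper names, and the row count are assumptions chosen for the example. The list of changed rows merely stands in for the contents of a transaction log and is not read from an actual log.

```python
import sqlite3
from typing import List, Tuple


def make_db(n_rows: int) -> sqlite3.Connection:
    """Create an in-memory database holding a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [(i, 100) for i in range(n_rows)]
    )
    conn.commit()
    return conn


def run_on_source(conn: sqlite3.Connection) -> List[Tuple[int, int]]:
    """Run ONE statement on the source; return the per-row changes it caused."""
    before = dict(conn.execute("SELECT id, balance FROM accounts"))
    conn.execute("UPDATE accounts SET balance = balance + 10")  # single statement
    conn.commit()
    after = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
    # Stand-in for the transaction log: one (id, new balance) entry per changed row.
    return [(row_id, bal) for row_id, bal in after if before[row_id] != bal]


def apply_row_by_row(conn: sqlite3.Connection, changes: List[Tuple[int, int]]) -> int:
    """Apply each logged change as its own statement; return the statement count."""
    statements = 0
    for row_id, balance in changes:
        conn.execute("UPDATE accounts SET balance = ? WHERE id = ?", (balance, row_id))
        statements += 1
    conn.commit()
    return statements


if __name__ == "__main__":
    source, target = make_db(5000), make_db(5000)
    changes = run_on_source(source)
    executed = apply_row_by_row(target, changes)
    print(f"source statements: 1, target statements: {executed}")
```

In this sketch the source executes one statement while the target executes one statement per affected row, which is the imbalance that motivates the need stated below.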
Accordingly, what is desired is a means for performing database replication with reduced overhead when processing high-impact transactions.