Replication is the ability to track, dispatch and apply data changes from one database to another. In a Relational Database Management System (RDBMS), data changes in a database are naturally logged for recovery purposes. System metadata for replication determines what data are being changed and the changes that are to be replicated. The data to be replicated may include the whole database, entire tables, selected rows from tables, specified columns from tables or stored procedures. The data changes in the log may then be marked for replication. The data changes may be split into consecutive log ranges such that each log range of data changes are replicated to the destination site in one batch. In conventional transactional replication, data changes are typically dispatched from a source site and applied at a destination site through a single connection or stream. While a single stream ensures that the data changes are applied at the destination site in the same order as the source site to maintain transactional consistency, there are throughput limitations that can occur due to this method.
In order to avoid these throughput limitations, replication is performed using multiple streams (i.e., multi-streaming) such that the data changes are applied in parallel at the destination site. This parallelism leads to higher throughput and performance for the system as a whole. When performing replication using a multi-streaming approach, the data changes within a batch are dispatched into multiple distinct streams based on a predefined hashing key. Consequently, the data changes that are dispatched in different streams may be applied at the destination site in a different order from which they were performed at the source site. Applying data changes out of order may violate one or more causal consistency constraints among conflicting data changes. Examples of causal consistency constraints include primary key constraints, unique constraints and foreign key constraints.