Database replication refers to the electronic copying of data from a database in one computing system to a database in another computing system. The data can then be accessed from each of the computing systems in parallel, possibly at different geographic locations, and data processing can continue despite outages or disasters, natural or otherwise, at individual systems and locations.
Thus, replication of data is often used to improve availability of the data to database software in case of system and communication failures, as well as more serious disasters, such as earthquakes or intentional attacks. However, replication of data typically requires ensuring that all copies of the replicated data are kept consistent and up-to-date, except, possibly, for some small delay or replication latency. Most data replication techniques execute operations or data modification statements against one copy of the replicated data, capture the resulting changes, and transmit these changes to the other copies. With database languages (e.g., Structured Query Language (SQL)), however, a very simple data modification statement or transaction may result in a large volume of changes (e.g., to millions of database records). Replication latency then suffers because of the delay involved in transmitting this large volume of changes.
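The gap between the size of a statement and the size of its effects can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hypothetical `accounts` table. The function name `change_volume` is invented for illustration and is not part of any replication product.

```python
import sqlite3


def change_volume(row_count: int) -> tuple[int, int]:
    """Return (bytes in the statement text, number of rows it changed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        ((i, 100) for i in range(row_count)),
    )
    # One short statement that touches every row in the table.
    statement = "UPDATE accounts SET balance = balance + 1"
    cursor = conn.execute(statement)
    rows_changed = cursor.rowcount  # rows modified by the UPDATE
    conn.close()
    return len(statement.encode("utf-8")), rows_changed


if __name__ == "__main__":
    stmt_bytes, rows = change_volume(10_000)
    print(f"statement: {stmt_bytes} bytes, rows changed: {rows}")
```

A change-capture approach would have to ship one change record per modified row, so the transmitted volume grows with the table, while the statement text stays the same size.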
An alternative means of keeping replicated copies of data consistent is to transmit the actual data modification statements or operations, packaged into transactions, to be applied to the database copy or copies. This statement-based approach to replication may reduce the communication overhead of transmitting changes and, where intermediate storage is needed for recording changes, the amount of that storage, thereby reducing replication latency. That is, statement-based data replication consumes less storage and bandwidth and reduces latency, especially for very large data warehouses. On the other hand, statement-based data replication is generally understood to be able to maintain consistency of replicated copies only when both of the following restrictions apply:
Determinism: The same data modification statements in a transaction are interpreted in exactly the same way at all replicated copies. Replicated data modification statements cannot refer to random-number generators, for example, or to non-replicated data that may be absent from, or different at, different replication nodes.
Serializability: The end effect of executing a collection of transactions has to be identical to the effect of executing those same transactions one at a time in some serial order. With statement-based data replication, the transactions must appear to have executed in the same serial order at all copies (this does not imply actual serial execution), in order to ensure that execution of the data modification statements in each transaction has an identical effect on all copies.
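Both restrictions can be observed in a small sketch that replays statements against independent copies. It assumes in-memory SQLite databases standing in for replicas. The table `t`, the helper `replay`, and the three statement constants are hypothetical names chosen for illustration.

```python
import sqlite3

DOUBLE = "UPDATE t SET v = v * 2"
ADD_TEN = "UPDATE t SET v = v + 10"
RANDOMIZE = "UPDATE t SET v = random()"  # non-deterministic SQLite function


def replay(statements):
    """Apply each statement as its own transaction to a fresh copy; return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
    conn.execute("INSERT INTO t (id, v) VALUES (1, 1)")
    for statement in statements:
        with conn:  # commits on success, rolls back on error
            conn.execute(statement)
    rows = conn.execute("SELECT id, v FROM t ORDER BY id").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Same deterministic statements, same serial order: the copies agree.
    print(replay([DOUBLE, ADD_TEN]) == replay([DOUBLE, ADD_TEN]))  # True
    # Same statements, different serial order: the copies diverge.
    print(replay([DOUBLE, ADD_TEN]), replay([ADD_TEN, DOUBLE]))  # [(1, 12)] [(1, 22)]
    # A non-deterministic statement diverges even in the same order.
    print(replay([RANDOMIZE]) == replay([RANDOMIZE]))
```

The last comparison is left without an expected value because it depends on the random generator, although two independent 64-bit draws are very unlikely to coincide.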
These constraints generally need to be enforced by statement-based data replication systems. However, implementations of transaction serializability, in the database systems that provide it, are generally expensive and inhibit transaction concurrency and throughput.
Thus, most database software operates without transaction serializability, using instead weaker but more concurrent transaction isolation levels, such as Read Committed and Snapshot Isolation, and dealing with consistency concerns through corrective action, or compensation, in the application. These weaker isolation levels have not, to date, been usable with statement-based replication because of the generally accepted serializability restriction noted above.
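Why a weaker isolation level conflicts with the serializability restriction can be sketched with a toy simulation. This is not a real database engine: it is a simplified model, written for illustration only, in which two transactions read the same snapshot and write disjoint records. The names `withdraw`, `run_serial`, and `run_snapshot`, and the rule that the combined balance must stay non-negative, are all assumptions.

```python
def withdraw(view, account, amount):
    """Return the writes a transaction makes, given the data it read."""
    # Application rule (assumed): the combined balance must stay non-negative.
    if view["a"] + view["b"] - amount >= 0:
        return {account: view[account] - amount}
    return {}  # rule would be violated, so the transaction writes nothing


def run_serial(initial, transactions):
    """Execute transactions one at a time; each sees its predecessor's writes."""
    state = dict(initial)
    for account, amount in transactions:
        state.update(withdraw(state, account, amount))
    return state


def run_snapshot(initial, transactions):
    """Execute concurrent transactions that all read the same starting snapshot."""
    snapshot = dict(initial)
    state = dict(initial)
    for account, amount in transactions:
        # Write sets are disjoint here, so no write-write conflict aborts either one.
        state.update(withdraw(snapshot, account, amount))
    return state


if __name__ == "__main__":
    start = {"a": 50, "b": 50}
    t1, t2 = ("a", 80), ("b", 80)
    print(run_serial(start, [t1, t2]))    # {'a': -30, 'b': 50}
    print(run_serial(start, [t2, t1]))    # {'a': 50, 'b': -30}
    print(run_snapshot(start, [t1, t2]))  # {'a': -30, 'b': -30}
```

In this model the snapshot outcome matches neither serial order, so a replica that replays the two statements serially would end in a different state from the copy where they ran concurrently.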