The present disclosure relates to the field of relational databases and, more particularly, to data replication.
Replication is a mechanism for copying data between multiple database systems. A variety of replication products exist to cope with the plurality of data replication techniques available and the plurality of requirements of different enterprises. Currently, there exist tools for regularly executing a full backup and/or for executing incremental backups, such as by means of snapshot technologies. Data replication may be executed for backup purposes and for storing data redundantly on multiple machines in a cloud environment in order to provide said data to a plurality of clients more quickly (the same data being stored redundantly on multiple machines increases the available processing power) and/or more reliably (in case one database server fails, another one may take over immediately).
In any case, replication should ensure that a copy of the source data represents a consistent state of the source database. In addition, replication should ensure that no data written to the source data during the copying process is lost and that the copy can be synchronized with the source data later, e.g., by incremental backups.
To ensure consistency between the source data and the copy of the data generated by replication, current replication tools perform disruptive operations on the source data at the moment when replication starts. The current replication tools try to achieve a point in time at which there are no open transactions on the source data by creating a read lock on the complete source data at the moment when the replication process starts. The read lock prohibits performing any write transactions on the source data until the replication process has finished. In the meantime, all transactions that are to perform a write on the source data are queued. Said queuing is disadvantageous because it may cause high latency times for individual write operations on the source data. Thus, current data replication approaches often result in a significant performance reduction of the source database during an ongoing replication process.
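The queuing effect described above may be illustrated by the following minimal sketch. It is a toy model under stated assumptions and does not represent any particular replication product: the class `LockedSource`, its methods, and the function `demo` are hypothetical names introduced for illustration only, a single mutual-exclusion lock stands in for the read lock on the complete source data, and a sleep stands in for the duration of the copying process.

```python
import threading
import time


class LockedSource:
    """Toy source database guarded by one lock (hypothetical model).

    Simplification: a single mutex stands in for the read lock on the
    complete source data, so concurrent readers are not modeled.
    """

    def __init__(self, rows):
        self._rows = dict(rows)
        self._lock = threading.Lock()

    def read(self, key):
        with self._lock:
            return self._rows[key]

    def write(self, key, value):
        """Apply a write and return how long the caller had to wait."""
        start = time.monotonic()
        with self._lock:  # blocks while a replication run holds the lock
            self._rows[key] = value
        return time.monotonic() - start

    def replicate(self, copy_seconds, started):
        """Copy all rows while holding the lock for the whole run."""
        with self._lock:
            started.set()             # signal that the lock is now held
            snapshot = dict(self._rows)
            time.sleep(copy_seconds)  # stands in for a long copying process
        return snapshot


def demo(copy_seconds=0.2):
    source = LockedSource({"a": 1})
    started = threading.Event()
    result = {}

    def run_replication():
        result["snapshot"] = source.replicate(copy_seconds, started)

    replicator = threading.Thread(target=run_replication)
    replicator.start()
    started.wait()  # issue the write only once replication holds the lock
    result["write_latency"] = source.write("a", 2)
    replicator.join()
    result["final"] = source.read("a")
    return result


if __name__ == "__main__":
    outcome = demo()
    print("snapshot:", outcome["snapshot"])
    print("final value:", outcome["final"])
    print("write latency (s):", round(outcome["write_latency"], 3))
```

In this sketch the copy reflects the consistent state that existed when replication started, and the write is not lost, but the write completes only after the lock is released, so its latency grows with the duration of the copying process.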