Database synchronization may be carried out utilizing various tools and methods. Numerous log-based replication tools replicate data from one database to one or more replicas. This is accomplished by reading committed transactions from the transaction log of the database in which the update is made, and performing the same updates in all of the replicas in the network. Depending upon the vendor and configuration options, the updates are either always made at a primary site and then propagated to the replicas, or the updates are made at any site and propagated to all other sites. To achieve high availability, these systems often employ a primary/standby-primary scheme, in which the standby primary is a replica that becomes the primary in the event of the failure of the original primary.
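By way of illustration only, the log-based replication described above may be sketched as follows. This is a minimal sketch and not the implementation of any particular vendor's tool; the names LogRecord, Replica, replicate, and the lsn (log sequence number) field are hypothetical and are introduced solely for this example. The sketch assumes a single primary whose log is read in order, and it assumes that a failure of the primary is handled elsewhere.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class LogRecord:
    """One entry in the primary's transaction log (hypothetical layout)."""
    lsn: int          # log sequence number, strictly increasing
    key: str
    value: str
    committed: bool   # only committed updates are replicated


@dataclass
class Replica:
    """An in-memory stand-in for a replica database."""
    name: str
    data: Dict[str, str] = field(default_factory=dict)
    last_applied: int = 0  # highest lsn already applied to this replica

    def apply(self, record: LogRecord) -> None:
        # Skip records already applied so that re-reading the log is harmless.
        if record.lsn <= self.last_applied:
            return
        self.data[record.key] = record.value
        self.last_applied = record.lsn


def replicate(log: Iterable[LogRecord], replicas: List[Replica]) -> None:
    """Read committed records from the log and perform them on every replica."""
    for record in log:
        if not record.committed:
            continue  # uncommitted work never reaches the replicas
        for replica in replicas:
            replica.apply(record)


# Usage: two replicas follow a primary whose log holds one uncommitted update.
log = [
    LogRecord(lsn=1, key="a", value="1", committed=True),
    LogRecord(lsn=2, key="b", value="2", committed=False),
    LogRecord(lsn=3, key="a", value="3", committed=True),
]
replicas = [Replica("replica-1"), Replica("replica-2")]
replicate(log, replicas)
```

In this sketch each replica records the last log position it applied, so that the same log may be read again, for example during a re-synchronization, without performing an update twice.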
Unfortunately, a number of events can, and often do, occur which prevent conventional database synchronization from being highly available. For example, errors may arise during synchronization when the communication pathways in use corrupt or lose data being passed between replicas of the database. Most often, the number of such errors and losses is proportional to the speed with which synchronization is executed. Errors can often be rectified by reducing the speed with which the synchronization is executed, or by executing a re-synchronization or repair tool, etc.
In some cases, the data that has not synchronized properly is of only trivial importance. In such cases, the benefits of running database synchronization at accelerated speeds would outweigh the cost of any data loss. Unfortunately, there is currently no way of quantifying the cost of data loss and making any type of decision based thereon.
There is therefore a need for a technique of quantifying the loss of data during database synchronization for the purpose of increasing the speed of the synchronization up to a maximum threshold.