FIG. 1 illustrates a first database at a first physical site 100. The database includes original transactions for the first database 102 at site 100. The database also includes replicated transactions for a second database 104. The replicated transactions are for the purpose of disaster recovery.
FIG. 1 also illustrates a second database at a second physical site 106. This database includes original transactions for the second database 108 at site 106. The database also includes replicated transactions for the first database 110 at site 100.
A replication coordinator 112 operates between the first database at site 100 and the second database at site 106. The replication coordinator 112 collects Change Data Capture (CDC) events for the original transactions for the first database 102 and generates write commands to form the replicated transactions 110. Similarly, the replication coordinator 112 collects CDC events for the original transactions for the second database 108 and generates write commands to form the replicated transactions 104.
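The coordinator's role may be sketched as follows. This is a minimal illustrative model, not an implementation from the disclosure: the names `CdcEvent`, `ReplicationCoordinator`, and the in-memory dictionary standing in for a replica database are all assumptions introduced for clarity.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class CdcEvent:
    """A hypothetical Change Data Capture event emitted by a source database."""
    table: str
    key: str
    value: str

class ReplicationCoordinator:
    """Collects CDC events from one site and replays them as write
    commands against the peer site, forming the replicated transactions."""

    def __init__(self, replica: Dict[str, Dict[str, str]]):
        # The replica is modeled as nested dicts: table -> key -> value.
        self.replica = replica

    def apply(self, events: List[CdcEvent]) -> None:
        for ev in events:
            # Generate a write command for each captured change;
            # later events for the same key overwrite earlier ones.
            self.replica.setdefault(ev.table, {})[ev.key] = ev.value

# The second site holds replicated transactions for the first database.
site_b_replica: Dict[str, Dict[str, str]] = {}
coordinator = ReplicationCoordinator(site_b_replica)
coordinator.apply([
    CdcEvent("accounts", "42", "balance=100"),
    CdcEvent("accounts", "42", "balance=75"),
])
print(site_b_replica["accounts"]["42"])  # last write wins: balance=75
```

Note that this sketch applies events synchronously and in order; a deployed coordinator would transport events over a network, which is precisely where the data-loss risks discussed below arise.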
Prior art systems of the type shown in FIG. 1 operate on the assumption that what is supposed to happen will happen. There are no mechanisms to track data “in flight”. That is, the system of FIG. 1 is typically implemented across a network, with the first database at site 100, the second database at site 106, and the replication coordinator 112 potentially being on machines across the globe. As data moves over networks traversing great distances, any number of mishaps may occur that result in the loss of data. Data loss is most often addressed via manual audits of the database or via packaged solutions executed in batch mode. Both of these approaches introduce significant cost and large latencies. In particular, manual audits are very expensive. Batch mode operations, usually performed after hours or on weekends, can only spot problems hours or days after they occur. Therefore, they cannot spot intermediate inconsistencies. This causes problems that can compound as subsequent transactions update inconsistent states of database tables. Depending upon the nature of the data, data loss can have catastrophic consequences. In many industries it is essential that data inconsistencies be identified immediately to avoid severe operational and financial impacts.
In view of the foregoing, there is a need for a non-intrusive real-time verification mechanism that continuously and incrementally monitors replication solutions to ensure full replication of data between databases.