Many of today's computing applications require extreme reliability. The requirement for average failure intervals measured in centuries is not uncommon. The solution of choice in today's art to achieve such reliability is to build systems with redundant components so that the functions of a failed component can be assumed by a surviving component.
In many cases, redundancy is provided at the system level. Two or more functioning systems are provided as nodes in an application network. If one node fails, its services can be provided by another node.
A major issue with these systems is that two or more databases must be provided across the application network so that, if a database is lost, the network still has access to the current application data. These databases must be kept in synchronism. That is, when a change is made to one database, it must be reflected in all database copies across the network (the database copies need not be exact replicas of each other; for instance, the data may be in different formats, or the data may be transformed or filtered differently by different database copies).
Database synchronization is often provided by data replication. When a change is made to one database, that change is sent to the other database copies in the network.
An important issue with data replication is replication latency. There is a time delay between the time that a change is applied to one database and the time that it is subsequently replicated to the other databases. Not only does this time delay give a different view of the application state to users at different nodes at any specific point in time, but more importantly it can lead to data loss and data collisions. That is, if a node fails, any changes in its replication pipeline may be lost. Furthermore, it is possible for two users at different nodes to update the same data at the same time, resulting in a data collision. The longer the replication latency, the greater the amount of data that may be lost following a failure and the more likely it is that data collisions will occur.
Statement Caching is a method that can significantly reduce the replication latency of a data replication engine, thus greatly reducing inconsistent views, data loss, and data collisions in multi-nodal computing systems. It does this by intelligently caching database change statements (the statement text portion as described below) so that each statement text need only be sent once. Thereafter, only the statement data required by a database change statement need be sent.
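The idea of sending each statement text once and only the statement data thereafter can be illustrated with a minimal Python sketch. It is not taken from any particular replication engine: the names `Sender`, `Receiver`, `encode`, and `decode`, the dictionary message layout, and the use of sequential integers as statement identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """Hypothetical source-side cache: statement text -> short numeric id."""
    ids: dict = field(default_factory=dict)

    def encode(self, text, data):
        """Build the message to transmit for one database change statement."""
        if text in self.ids:
            # Text was sent before: transmit only its id and the data.
            return {"id": self.ids[text], "data": data}
        stmt_id = len(self.ids)  # assumed id scheme: next sequential integer
        self.ids[text] = stmt_id
        # First occurrence: transmit the text together with its new id.
        return {"id": stmt_id, "text": text, "data": data}


@dataclass
class Receiver:
    """Hypothetical target-side cache: numeric id -> statement text."""
    texts: dict = field(default_factory=dict)

    def decode(self, msg):
        """Recover (statement text, statement data) from a message."""
        if "text" in msg:
            self.texts[msg["id"]] = msg["text"]
        return self.texts[msg["id"]], msg["data"]


sender, receiver = Sender(), Receiver()
sql = "UPDATE accounts SET balance = ? WHERE id = ?"
first = sender.encode(sql, (100, 1))   # carries the statement text
second = sender.encode(sql, (250, 2))  # carries only the id and the data
```

In this sketch the second message omits the statement text entirely, yet the receiver can still reconstruct the full statement from its own cache. The saving grows with the ratio of statement text length to statement data length.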
In prior art replication engines, the statement caching information was not persistent: the caches started out empty each time the replication engine began replicating (e.g., after a restart). This caused unnecessary data traffic, because the caches needed to be repopulated (with information that had already been sent) to the point where they were when the last replication engine shutdown occurred. There is a need in the art to add persistent storage of the statement text and/or statement data to the statement caching architecture to make the replicator more efficient and to lessen replication latency. The present invention fulfills such a need.
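The effect of a persistent cache on a restart can be sketched in Python as follows. This is an illustrative assumption, not the mechanism of the invention: the class name `PersistentStatementCache`, the JSON file format, the write-on-every-insert policy, and the helper `simulate_restart` are all hypothetical, and only one side of the replication link is shown. A real engine would also need to keep the source and target caches consistent with each other.

```python
import json
import os
import tempfile


class PersistentStatementCache:
    """Hypothetical statement-text cache that survives a restart via a JSON file."""

    def __init__(self, path):
        self.path = path
        self.ids = {}
        if os.path.exists(path):
            # Reload the mapping saved before the last shutdown.
            with open(path, "r", encoding="utf-8") as f:
                self.ids = json.load(f)

    def lookup_or_add(self, text):
        """Return (statement id, is_new); is_new means the text must be sent."""
        if text in self.ids:
            return self.ids[text], False
        stmt_id = len(self.ids)
        self.ids[text] = stmt_id
        self._save()
        return stmt_id, True

    def _save(self):
        # Write to a temporary file, then rename, so a crash mid-write
        # does not leave a truncated cache file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.ids, f)
        os.replace(tmp, self.path)


def simulate_restart():
    """Populate a cache, then reload it from disk as a restarted engine would."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "stmt_cache.json")
        sql = "INSERT INTO orders VALUES (?, ?)"
        before = PersistentStatementCache(path).lookup_or_add(sql)
        after = PersistentStatementCache(path).lookup_or_add(sql)  # after restart
        return before, after
```

With a non-persistent cache, the lookup after the restart would report the statement as new and its text would be retransmitted. Here the reloaded cache already holds the text, so only the statement data would need to be sent.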