1. Field of the Invention
The present invention relates to databases, and more particularly to database replication technology.
2. Background Art
Data replication is the process of maintaining multiple up-to-date copies of a database object in a distributed database system. Data replication can improve performance and, in some cases, provide higher data security, since multiple locations exist for accessing and modifying the replicated data. For example, if multiple copies of a data object are maintained, an application can access the logically “closest” copy of the data object to improve access times and minimize network traffic. In addition, data replication provides greater fault tolerance in the event of a server failure, since multiple copies of the data object effectively remain online in the distributed system if a failure occurs.
Different solutions exist to obtain data from a source of modifications, for example a primary database, and provide the data to a replicate or target database. In some cases, data may be replicated at different intervals by obtaining a “snap-shot” of a source of data or a “snap-shot” of modifications to source data that is to be replicated. In asynchronous replication, the replicate or target database is updated only after source data at the primary database has been modified. Therefore, replication on the target database occurs after a delay, known as latency. An asynchronous replication solution can use different methods to transfer replication information. These methods include log based replication and statement based replication.
Log based replication involves storing the result of a data manipulation language (DML) statement in a transaction log. A process may then read the transaction log to extract information associated with the result and send that information to a replicate or target database. Statement replication transfers the DML statement itself to a replicate or target database so that data in the primary database and the replicate database remains synchronized.
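The two approaches can be contrasted with a minimal sketch. It is illustrative only and assumes an in-memory SQLite database, a hypothetical accounts table, and a Python list standing in for the transaction log; none of these names come from the described system.

```python
import sqlite3

ROWS = [(1, 50), (2, 80), (3, 200)]
WHERE = "balance < 100"  # hypothetical predicate used by the example statement
STATEMENT = f"UPDATE accounts SET balance = balance + 10 WHERE {WHERE}"

def make_db(rows):
    """Create an in-memory database holding the hypothetical accounts table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    db.executemany("INSERT INTO accounts VALUES (?, ?)", rows)
    return db

def snapshot(db):
    return db.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

def run_and_log(primary, log):
    """Execute the statement on the primary and log one record per changed row."""
    ids = [r[0] for r in primary.execute(f"SELECT id FROM accounts WHERE {WHERE}")]
    primary.execute(STATEMENT)
    for row_id in ids:
        (balance,) = primary.execute(
            "SELECT balance FROM accounts WHERE id = ?", (row_id,)
        ).fetchone()
        log.append((row_id, balance))  # the result of the statement, not its text

def apply_log(replica, log):
    """Log based replication: replay each logged row change on the replica."""
    for row_id, balance in log:
        replica.execute("UPDATE accounts SET balance = ? WHERE id = ?", (balance, row_id))

primary = make_db(ROWS)
log_replica = make_db(ROWS)
stmt_replica = make_db(ROWS)

log = []
run_and_log(primary, log)
apply_log(log_replica, log)      # one operation per changed row
stmt_replica.execute(STATEMENT)  # statement replication: one statement, any row count
```

In this sketch both replicas end up identical to the primary because each holds a full copy of the data. The log based path sends one record per changed row, while the statement path sends a single statement regardless of how many rows it touches.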
The results of a statement executed in the source database and in the replicate databases can differ depending on the replication architecture. For example, if data on a replicate database is a subset of data on the primary database, the same statement may affect a different set of data when it is replicated from the primary database to the replicate database. In such cases, SQL DML replication will result in data at the primary and the replicate databases being out of synchronization.
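The subset case described above can be reproduced with a small sketch. It assumes an in-memory SQLite database, a hypothetical items table, and a replicate database that holds only the rows whose region is "east"; the table, column names, and statement are illustrative and do not come from the described system.

```python
import sqlite3

ROWS = [(1, "east", 10), (2, "west", 90), (3, "east", 30)]

# The subquery is evaluated against whatever rows each database holds.
STATEMENT = "DELETE FROM items WHERE price < (SELECT AVG(price) FROM items)"

def make_db(rows):
    """Create an in-memory database holding the hypothetical items table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, region TEXT, price INTEGER)")
    db.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    return db

def east_rows(db):
    return db.execute(
        "SELECT id, price FROM items WHERE region = 'east' ORDER BY id"
    ).fetchall()

primary = make_db(ROWS)                                  # full data set
replica = make_db([r for r in ROWS if r[1] == "east"])   # subset of the primary

# Replicate the statement text unchanged to both databases.
primary.execute(STATEMENT)
replica.execute(STATEMENT)

primary_east = east_rows(primary)
replica_east = east_rows(replica)
```

On the primary the average price is computed over all three rows, so both "east" rows fall below it and are deleted. On the replica the average is computed over the two "east" rows only, so the row with id 3 survives, and the two databases no longer agree on the replicated subset.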
Therefore, what is needed is a system, method and computer program product that replicates SQL DML statements in a manner that preserves consistency between data in a primary database and one or more replicate databases, and that overcomes performance issues associated with result (row change) based replication.