Data is copied from a source database to a target database for many purposes, e.g., for backup, for load balancing, or for maintaining multiple copies of the same data on different engines so that the organization of the data can be optimized for different kinds of queries. One deployment strategy is to have a source database that handles all update queries and one or more target databases that receive full or incremental updates from the source database and are used in read-only mode to provide data to users and applications.
Multiple approaches exist for copying data, in particular data changes, from a source database to another relational database (referred to herein as the “target database”).
One approach is directed to propagating a source data set to a plurality of target databases. However, the description of such an approach does not disclose the exact physical basis of the copying process.
Another approach is the so-called “snapshot” technique: an object (e.g., a full table) is read from the source database and loaded (stored) into the target database. The snapshot must represent a consistent point in time of the source object. However, as soon as the source object is altered in the source database after the snapshot has been taken, the source object and the target object are inconsistent. To synchronize the objects in the source database and the target database, it is common practice to incrementally apply changes that affect the source object to the target object using a replication mechanism.
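For illustration only, the snapshot technique followed by incremental change application can be sketched as below. This is a minimal sketch under stated assumptions, not a description of any particular replication product: two in-memory SQLite databases stand in for the source and target databases, and the table `items`, the functions `snapshot`, `apply_changes` and `contents`, and the list `change_log` are hypothetical names introduced for this example. A real replication mechanism would typically derive the changes from the source database's own log instead of a hand-written list.

```python
import sqlite3

# Hypothetical example table; not taken from any specific system.
SCHEMA = "CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)"


def snapshot(source, target):
    """Full copy: read every row of the source table and load it into the target."""
    rows = source.execute("SELECT id, value FROM items").fetchall()
    with target:  # single transaction, so the target is replaced atomically
        target.execute("DELETE FROM items")
        target.executemany("INSERT INTO items (id, value) VALUES (?, ?)", rows)
    return len(rows)


def apply_changes(target, changes):
    """Incremental update: replay logged source changes on the target, in order."""
    with target:
        for op, row_id, value in changes:
            if op == "delete":
                target.execute("DELETE FROM items WHERE id = ?", (row_id,))
            else:  # "insert" or "update"
                target.execute(
                    "INSERT OR REPLACE INTO items (id, value) VALUES (?, ?)",
                    (row_id, value),
                )


def contents(db):
    """Return the table contents in a deterministic order for comparison."""
    return db.execute("SELECT id, value FROM items ORDER BY id").fetchall()


source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute(SCHEMA)

source.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b")])
source.commit()

# Step 1: take the snapshot; source and target now agree.
copied = snapshot(source, target)
in_sync_after_snapshot = contents(source) == contents(target)

# Step 2: the source is altered after the snapshot; the same changes are logged.
change_log = [("update", 2, "b2"), ("insert", 3, "c"), ("delete", 1, None)]
source.execute("UPDATE items SET value = 'b2' WHERE id = 2")
source.execute("INSERT INTO items VALUES (3, 'c')")
source.execute("DELETE FROM items WHERE id = 1")
source.commit()
in_sync_before_apply = contents(source) == contents(target)

# Step 3: applying the logged changes brings the target back in sync.
apply_changes(target, change_log)
in_sync_after_apply = contents(source) == contents(target)
```

In this sketch the target matches the source directly after the snapshot, diverges once the source is altered, and matches again only after the logged changes have been applied. Calling `snapshot` a second time instead of `apply_changes` would also restore consistency, but it would transfer every row of the table again, including rows that have not changed.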
However, executing incremental updates is not always an option, e.g., because the computational overhead of the incremental updates is considered too costly. If incremental updates are not an option, the entire source object must be copied to the target database again based on a new snapshot. Such a reload may likewise be an expensive and time-consuming operation and typically involves the transfer of large amounts of data, much of which may already have been replicated to the target database.
Thus, many existing approaches for copying data from a source database to a target database face several problems: the computational overhead of supporting incremental data replication, the large amount of data that must be transferred when performing full snapshot replication operations, and the difficulty of maintaining the consistency of the replicated data changes while keeping the source database and the target database in sync.