The present invention relates to the field of data replication.
“Database Replication” is specified as the application of database deltas (i.e., the results of transactions being performed against a database) from one database in a pair to the other one. Transaction I/O (e.g., inserts, updates, and deletes) applied to one database are applied to the other database, and vice-versa (for the case of bidirectional database replication). Both databases are “live” and are receiving transactions from applications and/or end users. U.S. Pat. No. 6,122,630 (Strickler et al.), which is incorporated by reference herein, discloses a bidirectional database replication scheme for controlling transaction ping-ponging.
“Loading” a target database consists of taking a copy or subset of the source database information, optionally transforming or filtering it, and saving it into a target database. The goal of the loading process is for the source and target databases, or the subset of the data in the databases, to be synchronized (i.e., to match based on the range and transformation/filter logic) at the end of the operation.
“Offline” loading is defined as loading a target database when the source and target are offline, i.e., not accessible to applications. Many loading techniques are available for this form of loading, such as:
(1) Database backup and restore (using disk, tape, etc).
(2) Direct data file duplication and copying (e.g., using a disk utility to copy the file structure and/or contents from the source to the target, such as FUP DUP on a Compaq NSK system).
(3) FTP, etc.
Sometimes, these same offline techniques can be used if the source and/or target database are accessible to applications for read-only purposes. In some cases, these techniques (such as a FUP COPY SHARE option on a Compaq NSK system) can even be used if the source or target is open for update by applications, although the efficiency of the operation typically degrades.
Approaches to offline loading typically leverage the non-concurrent aspects of using “bulk” methods for the extract and/or apply phases, and the target is usually unavailable to applications during the sequence. Additionally, for those approaches that allow the source and/or target to be accessible to applications during the operation, the offline approaches do not resolve and include any updates that are occurring during the operation into the target.
“On-line” loading is defined as the ability to perform loading of a target database when either the source or target database, or both, are being actively accessed by applications (on-line transaction processing (OLTP), batch, etc.). Loading can be of the entire target, a subset of a target, or checking and resynchronizing a key range in the target.
There are several types of on-line loading—loading when the source or target is offline (i.e., inaccessible to applications), loading when the source and/or target is on-line for read-only application access, loading when the source and/or target is on-line for full update application access, or any combination of the above with some form of concurrent application access to either or both databases.
Shadowbase™ (a commercially available product made by ITI, Inc., Paoli, Pa.) allows for on-line (that is, application level) access to the source and target databases when both the application and replication are active. Older schemes, including ITI's TMF Auditor and Migrator products, and U.S. Pat. No. 5,884,328 (Mosher, Jr.) disclose schemes for loading an active source to an inactive target. Loading an inactive target typically allows for a significantly different approach using more bulk-oriented methods (the classic concurrency vs. speed tradeoff).
Some desirable and important goals of on-line loading are summarized below:
(1) Design solutions should transcend the platforms, environments, and databases, i.e., solutions should work on both the NSK platform and on the ‘open’ sides (e.g., Unix, NT, VMS, etc), and any combination thereof (NSK→NSK, NSK→Open, Open→NSK, Open→Open, etc).
(2) Design solutions should work on both “audited” and “non-audited” databases. Audited databases support transactional integrity. That is, a set of I/O operations is either fully committed, or fully backed out, based on the success or failure of the transaction. Audited databases typically provide a transaction “redo” log that can often be read by the replication engine for transaction I/O order. The collection of the I/O's in a non-audited database must be managed and made available to the replication engine (see discussion below).
(3) Design solutions should transcend the database types, whether legacy (e.g., Compaq Enscribe), relational (e.g., SQL), hierarchical (e.g., VSAM/ISAM), network, or other. Throughout this disclosure, file/record/field and table/row/column are used interchangeably.
(4) The design solutions should allow for on-line loading, e.g., via Shadowbase technology, properly sequenced with OLTP transactions against the file. That is, at every point in time, the target should be properly synchronized with the source for the data ranges that have been already loaded. Some of the schemes described in the present invention may violate this constraint for brief periods during loading (i.e., “referential integrity” may be briefly violated). More specifically, when using certain designs, for example those that do not lock the source data (e.g., BROWSE access that reads “dirty” data), and those that only briefly lock the source data intra-record (e.g., STABLE access that only holds a lock for the time the read takes to complete), violating referential integrity is a transient problem in the target only until all transactions that were active when the block that was loaded have terminated, or until the replication stream catches up to the point where the intra-record lock occurred. During the period of referential integrity violation, the replication engine may receive errors when it replays the inserts, updates, and deletes into the target database. During these times, the replication engine will typically automatically enter a “fuzzy replication” mode, where it allows mapping the failed operations into ones that will succeed. This includes mapping a failed insert into an update due to a “record already exists” error, mapping a failed update into an insert due to a “record not found” error, and mapping a failed delete into a no-operation due to a “record not found” error.
(5) Generation of audit/log file data should be minimized for the data being loaded. That is, the data being loaded should not require intermediate storage (e.g., in an audit trail or transaction log file). Instead, the data should flow directly to the target table, possibly via some intermediate replication engine processes or functions.
(6) The design solutions should not require “audit compression” or a similar feature to be turned off or disabled (i.e., do not require that audit data compression be turned off for audit trail data). Audit compression is a technique for saving just those field or record components that actually changed. Audit compression is thus a desirable feature to keep on because it minimizes the amount of data saved in the audit trail.
(7) The design solutions should allow loading of the target to be done in parallel, only loosely coupled with the on-line updates (i.e., multiple loading threads can be run in parallel, and these are only recombined with on-line updates when necessary).
(8) For those replication environments that implement filtering and/or data transformations to the data being replicated, the design solutions should only require one set of user exits/mapping configuration, meaning the loading path uses the same user exits/maps as the on-line path. Depending on the approach, these same user exits/maps may be bound into multiple replication components.
(9) The data should be able to be loaded the ‘correct’ way the first time, meaning as an insert when it should be an insert (rather than an update that fails and is mapped into an insert). This should decrease loading overhead somewhat.
(10) The design solutions should be extremely fast and efficient, approaching ‘raw’ or bulk loading rates when the source and/or target is offline, or when the target database supports a bulk-style insert operation even when the target is on-line. This is especially true for schemes that use APPEND-type bulk insert operations.
(11) The design solutions should allow the user to select the data range(s) to be loaded, and should be fully restartable.
Accordingly, there is an unmet need for a synchronization scheme in a database replication system that does not suffer from the negative aspects discussed above, and which is fast, relatively simple to implement, and efficiently uses communication medium. The present invention fulfills such a need.