In today's global economy, many critical computing systems operated by enterprises must be available continuously. They must be up and running 24 hours per day, 365 days per year. In order to achieve such availability, redundancy is required. An enterprise must protect itself from the failure of a critical system by having another operational system that it can quickly bring into service should its primary system fail or, even worse, should its data center be destroyed or disabled by some disaster. This redundant system can be a passive standby system, or it can be another node in an active/active network, in which all nodes are actively engaged in a common application. An active/active system is a network of independent processing nodes, each having access to a common replicated database. All nodes can cooperate in a common application, and users can be serviced by multiple nodes.
In order to be effective, the redundant system must have a current copy of the application database. The copy must be complete, accurate, and consistent. In order to initially create a redundant database copy, a database loading facility is typically used to copy the contents of a currently active operational database to the target database copy.
For large databases, the creation of a backup database copy can take hours or even days. During this time, it is often important that the portion of the target database that has been loaded can be used for active processing. In order for it to be useful, the partially-loaded target database must be consistent. Consistency requires that all user-defined data constraints be satisfied (if defined on the data) and that every child row has a parent row (this latter condition is known as referential integrity, and the rules that express it are known as referential integrity constraints). Consistency may also require that every row in the database be uniquely identified by a primary key (this is often useful for enforcing referential integrity constraints). A child row is a row that has a “foreign key” that points to another row, the parent. That parent row must exist. In some databases, there are no referential integrity constraints defined. In this case, there are no child/parent relationships and no foreign key relationships to be checked or maintained during a load. In still other databases, there may be child/parent and foreign key relationships, but the database itself does not directly enforce these relationships (NonStop® SQL/MP, commercially available from Hewlett-Packard (HP®), is one example). In these databases, it is preferable to maintain these relationships during the load sequence in order for the target database to be maximally useful while the load occurs.
There are many methods in today's art for loading a target database from an active source database. However, these methods provide neither referential integrity nor the broader attribute of consistency at the target database while the load is taking place. For instance, a partially-loaded target database that does not satisfy these attributes may contain the detail lines (the children) of an invoice (the parent) that does not yet exist on the target database (a referential integrity violation). Therefore, a query that requires the invoice header information for a detail line will fail if the query is made against the target database. Alternatively, a user-defined data constraint that requires that an invoice total in the invoice header be the sum of the amounts in each of the invoice's detail lines cannot be reconciled against the detail lines if those rows do not all exist on the target database (a consistency violation).
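The two violations described above can be made concrete with a minimal sketch. The `Invoice` and `DetailLine` types and the functions `orphan_lines` and `total_mismatches` are hypothetical names introduced only for illustration; they are not part of any particular loader or database product, and amounts are assumed to be integer cents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """Hypothetical parent row: the invoice header."""
    invoice_id: int
    total: int  # assumed to be stored in integer cents


@dataclass(frozen=True)
class DetailLine:
    """Hypothetical child row: invoice_id is the foreign key to Invoice."""
    invoice_id: int
    line_no: int
    amount: int


def orphan_lines(invoices, lines):
    """Return detail lines whose parent invoice is absent
    (referential integrity violations)."""
    parent_ids = {inv.invoice_id for inv in invoices}
    return [ln for ln in lines if ln.invoice_id not in parent_ids]


def total_mismatches(invoices, lines):
    """Return ids of invoices whose header total differs from the sum of
    the detail lines present (user-defined constraint violations)."""
    sums = {}
    for ln in lines:
        sums[ln.invoice_id] = sums.get(ln.invoice_id, 0) + ln.amount
    return [inv.invoice_id for inv in invoices
            if sums.get(inv.invoice_id, 0) != inv.total]


# A partially-loaded target: invoice 1 is missing one of its detail lines,
# and a detail line for invoice 2 arrived before its header.
target_invoices = [Invoice(1, 300)]
target_lines = [DetailLine(1, 1, 100), DetailLine(2, 1, 50)]
```

For this sample target, `orphan_lines` reports the detail line of invoice 2, and `total_mismatches` reports invoice 1, because only 100 of its 300 cents of detail lines have been loaded.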
If a partially-loaded database has a parent row for every child row and furthermore has all of the child rows associated with each parent row, and if all data constraints are satisfied for the data that has already been loaded, then the portion of the database that has been loaded is said to be complete. That portion of such a target database is fully usable (useful) in an application, and it accurately reflects the source database for the portion that has been loaded. In the above example, if all loaded detail lines have an invoice, if all loaded invoices have all of their detail lines, and if all data constraints are satisfied for the data that has been loaded, the partial database is complete and is typically usable by an application.
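The completeness condition can be sketched as a check of the loaded portion against the source. The function `is_complete` and its data layout are assumptions made for illustration: invoices are a dict mapping invoice id to header total, and detail lines are sets of `(invoice_id, line_no, amount)` tuples.

```python
def is_complete(source_lines, target_invoices, target_lines):
    """Return True if the loaded portion of the target is complete.

    source_lines / target_lines: sets of (invoice_id, line_no, amount).
    target_invoices: dict mapping invoice_id -> header total.
    """
    # Every loaded child must have a loaded parent (referential integrity).
    if any(inv_id not in target_invoices for inv_id, _, _ in target_lines):
        return False

    for inv_id, total in target_invoices.items():
        expected = {ln for ln in source_lines if ln[0] == inv_id}
        loaded = {ln for ln in target_lines if ln[0] == inv_id}
        # Every loaded parent must have all of its children.
        if loaded != expected:
            return False
        # The user-defined constraint must hold for the loaded data.
        if sum(amount for _, _, amount in loaded) != total:
            return False
    return True


source_lines = {(1, 1, 100), (1, 2, 200), (2, 1, 50)}

# Invoice 1 is fully loaded; invoice 2 has not been loaded at all.
complete_target = is_complete(
    source_lines, {1: 300}, {(1, 1, 100), (1, 2, 200)})

# Invoice 1 is loaded but one of its detail lines is still missing.
incomplete_target = is_complete(source_lines, {1: 300}, {(1, 1, 100)})
```

Note that the first target counts as complete even though invoice 2 is absent: completeness is judged only for the portion that has been loaded.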
Furthermore, the target database can have consistency checking enabled during the load. This avoids the problem of having to turn off target consistency checking before the load begins and then finding that it cannot be enabled following the load because of consistency violations.
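As a rough illustration of loading with checking left enabled, the sketch below uses Python's standard `sqlite3` module with foreign key enforcement turned on. SQLite stands in for the target database purely for demonstration; the table layout and the helper `load_with_checks` are hypothetical and assume the SQLite build includes foreign key support, which is the usual default.

```python
import sqlite3


def load_with_checks(rows_in_order):
    """Load rows into a fresh in-memory target with foreign key checking
    enabled. Return True if the whole load commits, False if any row
    violates a constraint (in which case the load is rolled back)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.execute(
        "CREATE TABLE invoice ("
        " invoice_id INTEGER PRIMARY KEY,"
        " total INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE detail_line ("
        " invoice_id INTEGER NOT NULL REFERENCES invoice(invoice_id),"
        " line_no INTEGER NOT NULL,"
        " amount INTEGER NOT NULL,"
        " PRIMARY KEY (invoice_id, line_no))")
    try:
        with conn:  # commits on success, rolls back on error
            for table, values in rows_in_order:
                # Table names come from this sketch's own fixed set,
                # not from untrusted input.
                placeholders = ",".join("?" * len(values))
                conn.execute(
                    f"INSERT INTO {table} VALUES ({placeholders})", values)
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


parent_first = [("invoice", (1, 300)),
                ("detail_line", (1, 1, 100)),
                ("detail_line", (1, 2, 200))]
child_first = [("detail_line", (1, 1, 100)),
               ("invoice", (1, 300)),
               ("detail_line", (1, 2, 200))]
```

With checking enabled, `load_with_checks(parent_first)` succeeds, while `load_with_checks(child_first)` is rejected at the first detail line because its invoice does not yet exist. The order in which rows reach the target therefore determines whether checking can remain on throughout the load.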
What is needed is a method of database loading which satisfies multiple, optional levels of database correctness. The first level includes referential integrity. The second level includes consistency, which includes referential integrity (if present). At the highest level, completeness is included, which also includes referential integrity (if present) and consistency.