Database information is typically both created and examined. Some database users create data, while others merely examine it. For example, an automatic teller machine ("ATM") typically adds or modifies database data, while a trend analyst seeking marketing profiles or fraud evidence generally examines data created by others.
Databases are often replicated to reduce contention for access to a primary database or to provide stand-alone work systems and spaces. Replicated databases provide work fields that allow users and clients to create or inspect data without limiting access by others to the primary database. For clients interested in only specific aspects of the primary database, replicas of particular regions or fragments can be provided to avoid absorbing excess resources. Replicated databases also provide a backup in the event of media failure.
Database consistency becomes an important issue when multiple replicas exist simultaneously with a primary database. The replicas must be updated to predictably reflect changes entered in the primary database.
A variety of techniques have been employed to maintain database consistency. Any update technique should demonstrate reasonable throughput and consume limited resources. Amending replica databases to reflect changes entered in a primary database should not unduly limit either replica or primary database access.
Copy replication has been employed to ensure predictable database consistency. Although resource consumptive, copy replication is useful when extensive record changes have been made. In copy replication, copies of the primary database are mapped into database replicas associated with particular users or clients that seek access to the information of the primary database. A copy of the primary is created and is typically applied to the replica database through the local database management system controlling the local replica database. In accordance with a predetermined schedule, database access is locked and the replica is overwritten with the data of the updated primary. The replica database thus reflects changes entered by other clients and users on a predictable basis. Copy replication typically uses system resources inefficiently and adversely impacts database availability, consuming significant amounts of processing resources and time. It requires a significant flow of messages, and unchanged records are overwritten along with changed records.
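The overwrite step of copy replication can be illustrated with a minimal sketch. The in-memory dictionaries, the function name copy_replicate, and the sample records are assumptions introduced for illustration only; locking and the local database management system are omitted.

```python
def copy_replicate(primary, replica):
    """Overwrite the replica with a full copy of the primary.

    Returns (records_written, records_that_differed).
    """
    # Count records that actually differ before the overwrite.
    differed = sum(1 for key in primary if replica.get(key) != primary[key])
    differed += sum(1 for key in replica if key not in primary)
    # Every record is rewritten, whether or not it changed.
    replica.clear()
    replica.update(primary)
    return len(primary), differed


# Hypothetical data: only record 3 differs between primary and replica.
primary = {1: "alice", 2: "bob", 3: "carol", 4: "dave"}
replica = {1: "alice", 2: "bob", 3: "carl", 4: "dave"}
written, differed = copy_replicate(primary, replica)
print(written, differed)
```

In this sketch four records are written although only one differed, which reflects the overwriting of unchanged records described above.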
Update replication is another known technique employed to manage database consistency. Update replication is more flexible than copy replication and typically updates only records whose data has changed. It can also be structured to update only records of interest to particular database users or clients.
In one known type of update replication, a log skimmer monitors a transaction record log for data changes. When a change of interest is identified, the change is imposed on the target database through the local database manager. When a client is interested in only periodic, rather than immediate updates, applicable changes are queued in structured query language ("SQL") statements corresponding to a sequence of transactions. The channel between the queue and the application process that specifies the changes is blocked until indicated by the update schedule. When scheduled, the block is removed and the application program sequentially specifies a transaction and corresponding subject record to the local database manager which, in response, brings the applicable data page of the target database out of storage for processing in a buffer according to the specified transaction.
Once updated, the processed page is overwritten into the appropriate area of the target database. This process continues down through the pending queue. If numerous changes are queued, the update process therefore consumes significant resources.
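The queued form of update replication described above can be sketched roughly as follows. The page size, the page_of mapping, the tuple layout of a queued change, and the sample data are all hypothetical; a real system would queue SQL statements and rely on the local database manager for page handling.

```python
from collections import deque

PAGE_SIZE = 2  # hypothetical: number of records stored per page


def page_of(record_id):
    """Map a record to its page (an assumed layout, for illustration)."""
    return record_id // PAGE_SIZE


def apply_queue(queue, target_pages):
    """Apply queued changes one at a time, in arrival order."""
    page_reads = page_writes = 0
    while queue:
        op, record_id, value = queue.popleft()
        page_no = page_of(record_id)
        # Bring the applicable page out of storage into a buffer.
        buffer = dict(target_pages.get(page_no, {}))
        page_reads += 1
        if op == "delete":
            buffer.pop(record_id, None)
        else:  # "insert" or "update"
            buffer[record_id] = value
        # Overwrite the processed page into the target database.
        target_pages[page_no] = buffer
        page_writes += 1
    return page_reads, page_writes


target = {0: {0: "a", 1: "b"}, 1: {2: "c", 3: "d"}}
queue = deque([
    ("update", 0, "a1"),
    ("update", 2, "c1"),
    ("update", 1, "b1"),
    ("delete", 3, None),
])
reads, writes = apply_queue(queue, target)
print(reads, writes, target)
```

Each queued change in this sketch costs one page read and one page write, so the work grows directly with the length of the queue.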
Although more precise than copy replication, update replication is a complicated system and employs extra processes not required for copy replication. Update replication also serializes changes and therefore suffers from wasted handling overhead and transactions.
The serialized transactions corresponding to queued changes to a particular record do not efficiently express the cumulative impact of the individual transactions on the particular record. For example, numerous queued changes to a particular record could, at the conclusion of the sequence of changes, leave the record unchanged or even deleted. Nevertheless, update replication executes the sequence of operations corresponding to the queued changes despite an unchanged or deleted end result. Consequently, when a record is ultimately unchanged or deleted by a series of changes, the intervening corresponding sequence of individual transactions has wasted system resources.
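The wasted effort can be seen in a small sketch in which a record value is represented by an integer and a deleted record by None. The function name apply_serially and the two sample sequences are assumptions made for illustration.

```python
def apply_serially(record, changes):
    """Apply each queued change in order.

    Returns (final_value, operations_executed); None means deleted.
    """
    executed = 0
    for op, value in changes:
        record = None if op == "delete" else value
        executed += 1
    return record, executed


original = 100
# Hypothetical sequence that ends where it started.
unchanged_seq = [("update", 150), ("update", 75), ("update", 100)]
# Hypothetical sequence that ends with the record deleted.
deleted_seq = [("update", 150), ("update", 75), ("delete", None)]

final_a, ops_a = apply_serially(original, unchanged_seq)
final_b, ops_b = apply_serially(original, deleted_seq)
print(final_a, ops_a, final_b, ops_b)
```

In the first sequence three operations are executed and the record ends with its original value; in the second, the two updates are executed only for the record to be deleted.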
Multiple queued changes to a particular record are not organized in correspondence with the physical storage organization of the database, further wasting processing effort. A particular page may be brought out of storage and overwritten in the target multiple times during an update because the changes are not organized according to physical storage location.
The multiple queued transactions implicit in update replication consume significant resources. A "transaction" is a unit of work performed by an application program. In one transaction interface technique, as an example of the processing undertaken in database page handling, a transaction log is maintained in nonvolatile storage. Every completed transaction is recorded as a log record including UNDO and REDO components. The UNDO component records a database record before it is changed by a transaction and the REDO component of the log record is a copy of the record after the change has been imposed. Each record update results in the writing of an UNDO and REDO record in the transaction log. Assuming completion of the transaction which updates a particular record, the database manager will copy the requisite page into the work buffer where the recovery process will use the transaction log REDO record to update the record of interest. The local database manager will then write the updated page to the target database. The creation of both REDO and UNDO records allows backward recovery to restore records to their prior state if a transaction is aborted. If a transaction associated with a REDO record aborts, the record data from the UNDO record is logged as a REDO record and is applied to the database page to back-out the original update. The log record can also contain a COMMIT field indicating the successful conclusion of the associated transaction. Consequently, the database manager is significantly burdened if required to implement a lengthy queue of transactions implicit in the queued update replication process.
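The UNDO and REDO handling described above can be sketched as follows. The LogRecord fields, the function names, and the use of None to mean "no prior record" are assumptions for illustration; the sketch also writes the log record at update time and keeps the log in memory, which simplifies the nonvolatile logging described in the text.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    txn_id: int
    record_id: int
    undo: object          # record value before the change
    redo: object          # record value after the change
    commit: bool = False  # set when the transaction concludes successfully


def update(db, log, txn_id, record_id, new_value):
    """Log the before and after images, then apply the change."""
    log.append(LogRecord(txn_id, record_id,
                         undo=db.get(record_id), redo=new_value))
    db[record_id] = new_value


def commit(log, txn_id):
    """Mark every log record of the transaction as committed."""
    for rec in log:
        if rec.txn_id == txn_id:
            rec.commit = True


def abort(db, log, txn_id):
    """Back out an aborted transaction, newest change first."""
    pending = [r for r in reversed(log)
               if r.txn_id == txn_id and not r.commit]
    for rec in pending:
        # The UNDO image is logged as a new REDO record and applied.
        log.append(LogRecord(txn_id, rec.record_id,
                             undo=rec.redo, redo=rec.undo))
        if rec.undo is None:
            db.pop(rec.record_id, None)
        else:
            db[rec.record_id] = rec.undo


db = {1: "old"}
log = []
update(db, log, txn_id=7, record_id=1, new_value="new")
abort(db, log, txn_id=7)
print(db, len(log))
```

Even this single aborted update produces two log records and two changes to the record, which suggests how the logging burden scales with a long queue of transactions.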
Consequently, what is needed is an efficient and robust update system that reduces update replication demands on system resources. The update method should also be flexible enough to update specific records and readily adaptable to a variety of organizational or system configurations without undue administrative or maintenance effort.