1. Field of the Invention
The present invention relates generally to data recovery in data processing systems employing a database. More particularly, the present invention relates to a method and apparatus for providing processing system availability during database management system restart recovery without loss of data integrity.
2. Description of the Related Art
Data processing systems typically manage large amounts of customer data and data generated by users within the data processing system. For many businesses, any loss in the ability to access data severely impacts the success of the business. Indeed, it is deemed catastrophic if the data is unavailable for a prolonged period of time. Thus, when system failures occur, the system must restart rapidly to minimize data outage.
Much business data is stored in databases, under the management of a database management system (DBMS). When a DBMS is restarted after a system failure, part of the DBMS's restart function is to attend to work that was interrupted by the failure. This function is typically called restart recovery. It involves ensuring that committed work is not lost as a result of the failure, and ensuring that uncommitted work that survives the failure is undone.
In DBMSs that employ write-ahead logging (WAL), committed transactions are typically recovered by performing REDO (forward) processing of the recovery log during restart recovery; uncommitted transactions are backed out by performing UNDO (backward) processing of the same log. A further discussion of write-ahead logging, UNDO, REDO and other commonly known recovery methods may be found in U.S. Pat. No. 5,333,303 issued Jul. 26, 1994, entitled, "METHOD FOR PROVIDING DATA AVAILABILITY IN A TRANSACTION-ORIENTED SYSTEM DURING RESTART AFTER A FAILURE," assigned to the assignee of the current invention and incorporated herein, and in the following references:
1. C. J. Date, "AN INTRODUCTION TO DATABASE SYSTEM", Vol. 1, Fourth Edition, Addison Wesley Publishing Company, Copyright 1986, Chapter 18;
2. H. F. Korth et al., "DATABASE SYSTEM CONCEPTS", McGraw-Hill Book Company, Copyright 1986, Chapter 10;
3. C. Mohan et al., "ARIES: A TRANSACTION RECOVERY METHOD SUPPORTING FINE-GRANULARITY LOCKING AND PARTIAL ROLLBACKS USING WRITE-AHEAD LOGGING", IBM Research Report RJ 6649 (63960), Jan. 23, 1989, Revised Nov. 2, 1990;
4. C. Mohan, "COMMIT_LSN: A NOVEL AND SIMPLE METHOD FOR REDUCING LOCKING AND LATCHING IN TRANSACTION PROCESSING SYSTEM", Proceedings of the Sixteenth VLDB Conference, August 1990; and
5. C. Mohan et al., "TRANSACTION MANAGEMENT IN THE R* DISTRIBUTED DATABASE MANAGEMENT SYSTEM", ACM Transactions on Database Systems, Vol. 11, No. 4, December 1986, pp. 378-396.
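By way of illustration only, the forward REDO pass and backward UNDO pass described above may be sketched as follows. This is a simplified sketch and not the method of any of the cited references: the `LogRecord` layout, the `restart_recovery` function, the use of a dictionary as the database, and the use of `None` to denote an absent value are all assumptions made for the example. For simplicity, the sketch reapplies every logged update during REDO and then backs out the updates of transactions that have no commit record.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class LogRecord:
    """Hypothetical log record layout, assumed for this sketch only."""
    txn: str             # transaction identifier
    kind: str            # "update" or "commit"
    key: str = ""        # data item touched by an update
    before: Any = None   # value before the update (None = item absent)
    after: Any = None    # value after the update


def restart_recovery(log: List[LogRecord], db: Dict[str, Any]) -> Dict[str, Any]:
    """Recover `db` in place from `log` and return it."""
    # A transaction is committed if its commit record reached the log.
    committed = {r.txn for r in log if r.kind == "commit"}

    # REDO: scan the log forward and reapply every logged update, so
    # that no committed work is lost.
    for r in log:
        if r.kind == "update":
            db[r.key] = r.after

    # UNDO: scan the log backward and restore the before-value of every
    # update made by a transaction that never committed.
    for r in reversed(log):
        if r.kind == "update" and r.txn not in committed:
            if r.before is None:
                db.pop(r.key, None)   # item did not exist before the update
            else:
                db[r.key] = r.before
    return db


# T1 committed, but its update had not reached the database at failure.
# T2 did not commit, but its update to "b" had reached the database.
log = [
    LogRecord("T1", "update", "a", before=None, after=1),
    LogRecord("T2", "update", "b", before=10, after=20),
    LogRecord("T1", "commit"),
    LogRecord("T2", "update", "c", before=None, after=5),
]
db = {"b": 20}
recovered = restart_recovery(log, db)
print(recovered)
```

In this example, the committed update of T1 is restored by the REDO pass, and both updates of the uncommitted T2 are backed out by the UNDO pass.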
DBMSs that do not use WAL typically employ force-at-commit protocols. In these systems, the result of work completed is held in cache and the resultant data is not updated until the work has been fully committed. With force-at-commit systems, it is not necessary to perform REDO processing; UNDO processing is sufficient.
In both WAL and non-WAL systems, a DBMS's restart recovery processing can take a considerable amount of time. This is especially true if one or more long-running transactions were active at the time of failure, because the operations performed by those transactions must be backed out, and prior art DBMSs traditionally do not start to process new transactions until all data recovery processing has been completed. Hence the data outage can be prolonged.
What is needed is a method and apparatus for performing a DBMS restart recovery with partially delayed UNDO processing that provides processing system availability. The method would allow a portion of the restart recovery to be postponed until after the DBMS involved begins accepting new work (transactions). Further, the method and apparatus should ensure data consistency during recovery, where required, to preserve data integrity.