As pointed out by C. J. Date, "An Introduction to Database Systems", Vol. 2, Addison-Wesley Publishing Co., copyright 1983, in Chapter 1 thereof, the purpose of a data base system is to carry out transactions. In this regard, a transaction is a unit of work. It consists of the execution of an application-specified sequence of operations beginning with a special BEGIN TRANSACTION operation and ending with either a COMMIT operation or a ROLLBACK operation. COMMIT is used to signal successful termination of the unit of work, while ROLLBACK is used to signal unsuccessful termination of the unit of work because of some exceptional condition. In transaction-oriented systems, a transaction such as transferring money from one account to another is a single atomic operation. It either succeeds or fails. If it fails, then nothing should have changed; that is, the effect should be as if it were never initiated.
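By way of illustration only, the money-transfer example can be sketched as follows. The sketch uses Python's standard sqlite3 module as a stand-in for a transaction-oriented data base system; the `accounts` table, the account identifiers, and the helper names `make_db`, `transfer`, and `balances` are assumptions introduced for this example and do not come from the cited reference.

```python
import sqlite3


def make_db():
    """Create an in-memory data base with two hypothetical accounts."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # transaction boundaries below are exactly the ones issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single unit of work."""
    cur = conn.cursor()
    cur.execute("BEGIN")  # BEGIN TRANSACTION
    try:
        cur.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        cur.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        (balance,) = cur.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Exceptional condition: the unit of work cannot succeed.
            raise ValueError("insufficient funds")
        cur.execute("COMMIT")  # successful termination
        return True
    except Exception:
        cur.execute("ROLLBACK")  # unsuccessful termination; nothing changes
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A transfer that would overdraw the source account is rolled back, and the balances are left as if the transfer had never been initiated.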
Transaction-oriented systems usually include a Recovery Manager. A Recovery Manager is a subsystem component specializing in maintaining the atomic nature of transactions and reestablishing system operation. In order to reestablish the information state with 100 percent fidelity, all events are logged. The total log consists of a currently active online portion on direct access storage, plus an arbitrary number of earlier portions in archival store.
There may be many events which cause a system to stop and thus require a subsequent system restart. While the contents of main memory and volatile buffers are lost, the data base on nonvolatile media is usually not damaged. Transactions that were in progress at the time of the failure must be rolled back since they were not completed. In order to identify which transactions to roll back, a search of the entire log from the beginning would have to be made. Such transactions would be identified as those having a BEGIN TRANSACTION record but no termination record, such as a COMMIT or other primitive.
To avoid this, prior art utilizes checkpointing. This means that the contents of volatile memory representing transactions in process are copied out to the active log. Indeed, information constituting the checkpoint itself is recorded and written to the log data set, and its address is duly noted in a RESTART file in nonvolatile storage. Each checkpoint record contains a list of all transactions active at the time of the checkpoint. Thus, at system restart time, the Recovery Manager can obtain the address of the most recent checkpoint record from the RESTART file. It then locates that checkpoint record in the log and proceeds to search forward through the log from that point to the end. As a result of this process, the Recovery Manager is able to determine both the transactions that need to be undone to effectuate ROLLBACK and the transactions that need to be redone to effectuate COMMIT in order to restore the data base to a correct state.
To implement this, the Recovery Manager starts with two lists, an UNDO list and a REDO list. The UNDO list initially contains all transactions listed in the checkpoint record. In contrast, the REDO list is initially empty. The Recovery Manager searches forward through the log starting from the checkpoint record. If it finds a BEGIN TRANSACTION record for a given transaction, it adds that transaction to the UNDO list. If it finds a COMMIT record for a given transaction, it moves that transaction from the UNDO list to the REDO list. When the Recovery Manager reaches the end of the log, the UNDO list and the REDO list identify, respectively, those transactions that must be undone and those which must be redone. The Recovery Manager then goes forward through the log a second time, redoing the transactions in the REDO list. Lastly, it works backward through the log, undoing the transactions in the UNDO list. No new work can be accepted by the system until this process is complete.
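The construction of the UNDO and REDO lists described above can be sketched as follows. This is a minimal illustration and not the Recovery Manager of any particular system: the function name `classify_transactions` and the representation of log records as `(operation, transaction)` pairs are assumptions made for the example, and record types other than BEGIN and COMMIT are simply ignored.

```python
def classify_transactions(checkpoint_active, log_after_checkpoint):
    """Scan forward from the checkpoint record and build the two lists.

    checkpoint_active     -- transactions listed in the checkpoint record
    log_after_checkpoint  -- (operation, transaction) pairs in log order
                             (a hypothetical record format)
    """
    undo = list(checkpoint_active)  # UNDO list starts with the checkpoint list
    redo = []                       # REDO list starts empty

    for operation, txn in log_after_checkpoint:
        if operation == "BEGIN":
            # A transaction begun after the checkpoint is presumed incomplete.
            undo.append(txn)
        elif operation == "COMMIT" and txn in undo:
            # A committed transaction moves from the UNDO list to the REDO list.
            undo.remove(txn)
            redo.append(txn)

    return undo, redo
```

For instance, if the checkpoint record lists T2 and T3, and the log after the checkpoint contains BEGIN T4, COMMIT T2, BEGIN T5, and COMMIT T4 in that order, the function returns T3 and T5 in the UNDO list and T2 and T4 in the REDO list.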
Writing a change to the data base and writing the log record representing that change are two distinct operations. There is a possibility of failure occurring in the interval between the two. To enhance safety, the log record is always written first. This is termed a "write-ahead log protocol". That is, a transaction is not allowed to write a record to the physical data base until at least the UNDO portion of the corresponding log record has been written to the physical log, and a transaction is likewise not allowed to complete the COMMIT processing until both the REDO and UNDO portions of all log records for the transaction have been written to the physical log.
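The ordering imposed by this protocol can be sketched with in-memory stand-ins for the physical log and the physical data base. The class `WalStore`, its method names, and the tuple layout of the log records are assumptions made for illustration; a real system would force the log records to nonvolatile storage, which this sketch does not attempt.

```python
_MISSING = object()  # marks "key did not exist" in an UNDO image


class WalStore:
    """Toy store in which every data base write is preceded by its log record."""

    def __init__(self):
        self.log = []  # stand-in for the physical log
        self.db = {}   # stand-in for the physical data base

    def write(self, txn, key, new_value):
        old_value = self.db.get(key, _MISSING)
        # Log first: the record carries the UNDO image (old value) and the
        # REDO image (new value) before the data base itself is touched.
        self.log.append(("UPDATE", txn, key, old_value, new_value))
        self.db[key] = new_value

    def commit(self, txn):
        # All UPDATE records for txn are already in the log at this point,
        # so the COMMIT record can be appended.
        self.log.append(("COMMIT", txn))

    def undo(self, txn):
        """Work backward through the log, restoring the UNDO images of txn."""
        for record in reversed(self.log):
            if record[0] == "UPDATE" and record[1] == txn:
                _, _, key, old_value, _ = record
                if old_value is _MISSING:
                    self.db.pop(key, None)
                else:
                    self.db[key] = old_value
```

Because the UNDO image reaches the log before the data base is changed, a change belonging to an uncommitted transaction can always be reversed from the log alone.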
In a transaction-oriented data base system of the type described above, all changes to the data base are written to a log in support of recovery in the event of interruption. As mentioned, each transaction utilizes BEGIN, COMMIT, or ROLLBACK primitives in order to bound the transaction. In this regard, REDOs ensure transaction return to the most recent COMMIT point. In contrast, UNDOs ensure return to the transaction BEGIN point. Illustrative of transaction system log writing and utilization operations are:
(a) Gawlick et al, U.S. Pat. No. 4,507,751, "Method and Apparatus for Logging Journal Data Using a Log Write Ahead Data Set", issued Mar. 26, 1985;
(b) Paradine et al, U.S. Pat. No. 4,159,517, "Journal Back-up Storage Control for a Data Processing System", issued June 26, 1979; and
(c) Baker et al, U.S. Pat. No. 4,498,145, "Method and Apparatus for Assuring Atomicity of Multirow Update Operations in a Database System", issued Feb. 5, 1985.
The Gawlick, Paradine, and Baker patents respectively describe (a) the writing to log before record updating, (b) buffer dumping to a log, and (c) writing to a hard and soft log concurrently to assure multirow atomic updating. Significantly, the patented methods all relate to data movements, including those of loading which are paced or determined by the COMMIT points of transactions.
In transaction-oriented systems of the relational data base type, it is the LOAD software utility which moves sequential data sets to a relational tablespace.
In the event of an interruption, it is also the LOAD utility which must either restart from the beginning in order to avoid the overhead of log writing, or alternatively, restart from the last COMMIT point (i.e., the end of the last transaction) and incur said log writing.