This invention relates to a control method for a distributed data base and, particularly, to a distributed data base system of the composite subsystem type suitable for a joint operation with another plurality of data bases, and also to a method of fault recovery for a composite subsystem type online system.
In the conventional recovery control method for a distributed data base, as described in the proceedings of the 33rd (latter term of 1986) annual convention of the Information Processing Society of Japan, pp. 907-908, a system down condition at one distributed site is not treated as a system down condition of the other sites. A system down condition of a slave site is detected and isolated by the master site, while a system down condition of the master site is detected by a slave site, which indicates the abnormality of the master site to the other slave sites. In consideration of resumption of operation, the process of updating the data base uses a 2-phase protocol, which is described in the publication "Principles of Database Systems", 1980, pp. 340-356, by Jeffrey D. Ullman, Computer Science Press. This allows recovery without the occurrence of an inconsistency in the data even if a distributed site has gone down during an updating process carried out in synchronism with the data bases of other sites.
However, a down condition in the distributed data base access section within a distributed data base is not separated from a down condition of a local data base, and therefore a down condition in one site results in a down condition for both distributed access and local access.
The above-mentioned prior art does not treat separately the distributed data base access section within a distributed site and the local data base access section within a site, and therefore a down condition in a site always results in a distributed data base access down condition and a local data base access down condition, which creates a reliability problem.
In the conventional fault recovery method for an independent online system, as described for example in JP-A-54-114145, the system has an audit file (journal file) and a check point file (and a before look file in some cases) to sample journal and check point information in preparation for faults, and at the time of occurrence of a fault, the system attempts recovery using the journal and check point information corresponding to the fault. The online system to which this recovery method is applicable is confined to one having one data communication section and one data base section.
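The recovery from a check point file and a journal file described above can be illustrated by the following minimal sketch. It is not the method of JP-A-54-114145. The record layout (`seq`, `kind`, `txn`, `key`, `value`), the function name `recover`, and the assumption that the check point holds only the results of completed transactions are all hypothetical and are introduced solely for illustration.

```python
def recover(checkpoint, checkpoint_seq, journal):
    """Rebuild the data base state from a check point and the later journal.

    Assumption: the check point reflects every journal record up to and
    including checkpoint_seq, and only committed updates are reapplied.
    """
    # Transactions whose commit record reached the journal before the fault.
    committed = {e["txn"] for e in journal if e["kind"] == "commit"}
    state = dict(checkpoint)  # start from the check point contents
    for e in sorted(journal, key=lambda rec: rec["seq"]):
        if (e["kind"] == "update"
                and e["seq"] > checkpoint_seq
                and e["txn"] in committed):
            state[e["key"]] = e["value"]  # redo the committed update
    return state


# Hypothetical journal: t1 and t2 committed, t3 was in execution at the fault.
journal = [
    {"seq": 1, "kind": "update", "txn": "t1", "key": "a", "value": 1},
    {"seq": 2, "kind": "commit", "txn": "t1"},
    {"seq": 3, "kind": "update", "txn": "t2", "key": "a", "value": 5},
    {"seq": 4, "kind": "commit", "txn": "t2"},
    {"seq": 5, "kind": "update", "txn": "t3", "key": "b", "value": 9},
]
restored = recover({"a": 1, "b": 2}, checkpoint_seq=2, journal=journal)
```

In this sketch the update of the uncommitted transaction `t3` is not reapplied, so `restored` holds the value written by `t2` for `a` and the check point value for `b`.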
The above-mentioned prior art does not take into consideration fault recovery for a composite subsystem type online system. In such a system configuration, each subsystem needs to have its own journal and to attempt fault recovery independently. However, when a job process (transaction) spanning several subsystems arises, a recovery process in synchronism with the other subsystems cannot take place, despite the need for synchronous information for recovery among the subsystems. The recovery of a transaction may be attempted in synchronism after the faulty subsystem has started up. However, in case a subsystem does not start up promptly after occurrence of a fault, the remaining subsystems will have a transaction which is left unrecovered, and therefore the journal information necessary for the recovery of that transaction needs to exist continuously. If the online operation is resumed in this situation, the journal necessary for the recovery of the faulty transaction is buried in journals produced later, and the system is compelled to look for the journal information buried in the mass of journals after the faulty subsystem has recovered. On this account, at the time of occurrence of a fault in one subsystem, it is necessary to halt all subsystems and, after starting up all the subsystems, recover all transactions before resuming the online operation.
To cope with this problem, the journals of all subsystems may be unified so that the synchronous information for information updating remains usable even if some subsystems do not start, and information for subsystems other than the faulty one can then be recovered. In this case, however, the journal needed by the faulty subsystem is buried in the unified journal, which again results in the problem of looking for a necessary journal in the mass of journals.
Furthermore, in the conventional online system of the composite subsystem type, each subsystem individually controls the state of access to the data base controlled by it. In case one transaction has updated data in a plurality of data base systems, the 2-phase committing method is used to guarantee the consistency of updating of the data base systems by the transaction. The 2-phase committing method is described in the publication "Principles of Database Systems", pp. 351-356, by Jeffrey D. Ullman, COMPUTER SCIENCE PRESS, 1980.
In the 2-phase committing method, a subsystem which has received a transaction reports the commencement of the transaction to all data bases before the transaction makes access to any data base. Upon receiving the report, each data base appends an identifier to the transaction for distinction among transactions in its own system, and returns it to the data communication system. In terminating the transaction, the data communication system specifies the identifier of each transaction and directs each data base to perform commit preparation for the transaction, as a first phase. The commit preparation is a preprocessing to guarantee the completion of the transaction, and it generally corresponds to the journal output process.
After receiving the commit preparation end reports from all directed data bases, the subsystem issues a commit instruction. If even a single data base has failed in commit preparation, the subsystem indicates the failure of the transaction to all data bases. Upon receiving the failure of the transaction, each data base reads the journal of the transaction produced in the commit preparation process, thereby implementing the data base restoration process.
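The sequence described above (commencement report, commit preparation as a first phase, then either a commit instruction or a failure indication) can be sketched as follows. This is a simplified illustration under stated assumptions, not an implementation taken from the cited publication. The class `Database`, the flag `fail_prepare`, and the function `run_transaction` are hypothetical names, and journal output and restoration are reduced to state labels.

```python
class Database:
    """Hypothetical participant in the 2-phase committing method."""

    def __init__(self, name, fail_prepare=False):
        self.name = name
        self.fail_prepare = fail_prepare  # simulates a commit preparation failure
        self.next_id = 0
        self.state = {}  # local transaction identifier -> state label

    def begin(self):
        # Append an identifier that distinguishes the transaction locally.
        self.next_id += 1
        tid = f"{self.name}-{self.next_id}"
        self.state[tid] = "active"
        return tid

    def prepare(self, tid):
        # First phase: stands in for the journal output process.
        if self.fail_prepare:
            return False
        self.state[tid] = "prepared"
        return True

    def commit(self, tid):
        self.state[tid] = "committed"

    def restore(self, tid):
        # Stands in for restoration from the journal of the transaction.
        self.state[tid] = "restored"


def run_transaction(databases):
    # Report commencement to all data bases and collect their identifiers.
    ids = {db: db.begin() for db in databases}
    # (Accesses to the data bases by the transaction would take place here.)
    # First phase: direct every data base to perform commit preparation.
    reports = [db.prepare(ids[db]) for db in databases]
    if all(reports):
        # Second phase: issue the commit instruction.
        for db in databases:
            db.commit(ids[db])
        return "committed", ids
    # Even a single failure is indicated to all data bases.
    for db in databases:
        db.restore(ids[db])
    return "failed", ids


a, b = Database("a"), Database("b")
outcome_ok, ids_ok = run_transaction([a, b])

c = Database("c", fail_prepare=True)
outcome_ng, ids_ng = run_transaction([a, c])
```

In the second call the preparation failure in `c` causes data base `a` to restore the transaction as well, although its own commit preparation succeeded.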
In case one of the data bases in the online system has failed, the whole system is brought to an abnormal termination so as to suspend all transactions in execution. After that, fault recovery processes for all transactions which have been in execution are carried out for each subsystem based on the journal.
In case a transaction in execution continues to be processed without suspension at the time of occurrence of a fault, the commit preparation will fail at the end of the transaction in execution, and the restoration process for the transaction will be carried out by all data base systems accessed by the transaction.
In the above-mentioned prior art, if a data base in an online system fails, it is not possible to find the transaction which has accessed the data base, and therefore all transactions in execution are subjected to fault recovery by bringing the whole system to an abnormal termination.
However, in actual business operations, even in an online system including a plurality of data bases as mentioned above, a transaction in most cases makes access to only one data base, and only a small proportion of transactions make access to a plurality of data bases.
In the conventional method, when a data base in an online system has failed, not only the transactions in access to the failing data base but all transactions in execution are involved in the fault, so that even transactions which could proceed with normal processing are subjected to fault recovery. Namely, no positive attempt is made to minimize the range of influence of a fault, and this is a problem in operating the system.
In the case of a method which allows transactions in execution to continue to be processed and detects a transaction in need of fault recovery at the end of each transaction, even a transaction which had already made access to the faulty subsystem at the occurrence of the fault goes on processing. If such a transaction continues to update data base systems other than the faulty one, a great deal of restoration processing becomes necessary at the end of the transaction.
In the conventional check point acquisition process, which is necessary for the fault recovery process, the process enters the wait state at the time of the check point until all transactions in execution are complete, as described for example in JP-A-58-2936. This is because transactions in execution are making access to the table which is the subject of the check point dump. If the acquisition of the check point dump were started during the execution of a transaction, a journal could be acquired before the time of the check point while the corresponding table update takes place after it, in which case the journal from before the time of the check point would be necessary at the time of recovery of the table.
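The wait state described above can be illustrated by a small timing sketch. The function name `checkpoint_acquisition_time` and the representation of a transaction as a `(start, end)` pair are hypothetical, and the sketch assumes that no new transaction is admitted between the check point request and the acquisition; it does not reproduce the method of JP-A-58-2936.

```python
def checkpoint_acquisition_time(request_time, transactions):
    """Return the earliest time at which the check point dump can be acquired.

    transactions: list of (start, end) times; end is None for a transaction
    that is not known to terminate. Returns None when the wait cannot end.
    """
    # Transactions in execution at the time of the check point request.
    in_execution = [
        (start, end)
        for start, end in transactions
        if start <= request_time and (end is None or end > request_time)
    ]
    if any(end is None for _, end in in_execution):
        return None  # the wait state persists while this transaction exists
    # Wait until the last transaction in execution has completed.
    return max([request_time] + [end for _, end in in_execution])


# Request at time 10: the transactions ending at 12 and 30 are in execution.
t_busy = checkpoint_acquisition_time(10, [(1, 5), (8, 12), (9, 30)])
# Request at time 10 with no transaction in execution.
t_idle = checkpoint_acquisition_time(10, [(1, 5)])
# A transaction that never terminates blocks the acquisition.
t_blocked = checkpoint_acquisition_time(10, [(2, None)])
```

Under these assumptions, the acquisition time is governed entirely by the slowest transaction in execution at the time of the request.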
The above-mentioned prior art does not consider a transaction which is in execution in a faulty subsystem, or a transaction which does not terminate for a long period, such as a transaction in execution in another host machine in a distributed data base, both of which have arisen with the advent of composite subsystem type online systems. The check point acquisition and validation cannot take place while such a long term transaction exists, resulting in an abnormally long check point interval. This imposes not only a long fault recovery time, but also the need to maintain all external storages which contain journals in an accessible condition at the time of recovery.