1. Field of the Invention
This invention relates generally to a fault-tolerant multiprocessor arrangement and more particularly to a fault-tolerant multiprocessor with a memory for storing the previous data state.
2. Discussion of Background
Transient or permanent hardware and software faults in a fault-tolerant multiprocessor arrangement must be detected and eliminated as quickly as possible. For the purpose of fault detection, two processors forming one data processor are frequently operated in parallel with the same program and their results are monitored for correspondence. As soon as a fault occurs, program execution is interrupted (fail-stop function). To ensure that the program is executed even after a fault, particularly in the case of process computer applications, a back-up task is activated on a hot-standby processor which is provided with the same I/O channels.
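The duplex comparison with fail-stop behaviour described above can be sketched as follows. This is only an illustrative sketch: the names `FailStop`, `run_duplex`, `healthy` and `faulty_on_three` are hypothetical, and the two processors are modelled as plain Python callables rather than as hardware.

```python
class FailStop(Exception):
    """Hypothetical exception: the two processors produced different results."""


def run_duplex(program, inputs, processor_a, processor_b):
    """Run the same program on two processors and compare every result."""
    results = []
    for step, value in enumerate(inputs):
        out_a = processor_a(program, value)
        out_b = processor_b(program, value)
        if out_a != out_b:
            # Fail-stop: execution is interrupted as soon as the results differ.
            raise FailStop(step, out_a, out_b)
        results.append(out_a)
    return results


def healthy(program, value):
    """Assumed fault-free processor."""
    return program(value)


def faulty_on_three(program, value):
    """Assumed faulty processor: flips the lowest result bit for input 3."""
    result = program(value)
    return result ^ 1 if value == 3 else result


def square(x):
    return x * x
```

In this sketch the comparison detects only that the two results differ, not which processor is at fault, which is why execution stops rather than continuing with one of the two results.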
Since it is a very complex task to guarantee the integrity of each individual (atomistic) operation during the activation of the standby processor and since more time is frequently required for fault detection than for the execution of an individual operation, execution of the program is usually resumed by the standby processor at one of several earlier points specially provided for in the program (recovery point). This point has been reached without faults by the processor originally executing the program (rollback technique). By resuming at this point, some of the operations will be repeated.
Execution of the program is preferably resumed at the recovery point last reached without faults. However, this is only possible if the time of occurrence of the fault and the time of its detection lie within the same interval between the same recovery points, for example recovery points RP.sub.i and RP.sub.i+1. Otherwise, execution of the program must be resumed at the recovery point RP.sub.i-1 or an even earlier recovery point (multiple-step rollback technique).
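The choice of recovery point can be illustrated with a small sketch. It assumes, purely for illustration, that the times at which the recovery points were reached and the time of occurrence of the fault are known as numbers; in practice the time of occurrence can usually only be bounded. The function name `rollback_target` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def rollback_target(recovery_times, fault_time, detect_time):
    """Return (index of the recovery point to resume at, multiple_step).

    recovery_times is an ascending list of the times at which recovery
    points RP_0, RP_1, ... were reached. A recovery point counts as
    fault-free only if it was reached strictly before the fault occurred.
    Assumes detect_time >= fault_time.
    """
    # Last recovery point reached before the fault was detected.
    last_reached = bisect_right(recovery_times, detect_time) - 1
    # Last recovery point reached before the fault occurred.
    last_fault_free = bisect_left(recovery_times, fault_time) - 1
    if last_fault_free < 0:
        raise ValueError("no recovery point was reached before the fault")
    # Multiple-step rollback is needed if a recovery point was passed
    # between the occurrence of the fault and its detection.
    return last_fault_free, last_fault_free < last_reached


times = [0, 10, 20, 30]
same_interval = rollback_target(times, fault_time=22, detect_time=27)
later_interval = rollback_target(times, fault_time=18, detect_time=27)
```

With these assumed times, a fault occurring at 22 and detected at 27 allows resumption at the recovery point last reached, whereas a fault occurring at 18 and detected at 27 forces a rollback past it.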
To resume program execution at a recovery point, the standby processor needs, in addition to the program, a copy of the data state which existed at this recovery point in the main memory of the processor originally executing the program. The program, which does not change with time, can be made available to the standby processor at program start-up. In contrast, a copy of the continuously changing data state must be created at each recovery point and must be stored in a manner accessible to the standby processor until the respectively next recovery point is reached.
How quickly the standby processor resumes program execution after an error, and how quickly it regains the state of program execution that had been reached when the error occurred, obviously depends on the mechanism by which these copies are created, the mechanism by which the standby processor accesses them, and the amount of code between the individual recovery points in the program.
The data state copies are created in a so-called state save unit (SSU). This unit is arranged to be electrically isolated and spatially separated from the processor executing the program so that it will not also be affected by an error in this processor. As has already been mentioned, the standby processor should be able to rapidly access the saved copies of the data state, which is why the state save unit is usually arranged close to the standby processor.
The simplest way of creating the data state copies at the recovery points consists of the processor executing the program transferring the total contents of the memory allocated to it into the state save unit after each recovery point has been reached. However, program execution must be interrupted during the time required for this operation.
To save time, the amount of data to be transferred into the state save unit can be reduced by transferring only the changes D.sub.i, i+1, produced by the write accesses in the interval which has just elapsed, for example between recovery points RP.sub.i and RP.sub.i+1, to the state save unit. This is possible because the data state S.sub.i+1 at the recovery point RP.sub.i+1 differs from the preceding data state S.sub.i at recovery point RP.sub.i only by the changes D.sub.i, i+1:

S.sub.i+1 = S.sub.i + D.sub.i, i+1.
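The relationship between two successive data states and the changes made in between can be sketched as follows. The sketch assumes, for illustration only, that a data state is a mapping from address to word and that the changes D.sub.i, i+1 are the final values written per address during the interval; the names `record_writes` and `apply_changes` are hypothetical.

```python
def record_writes(writes):
    """Collapse a sequence of (address, value) write accesses into changes.

    Only the last value written to each address within the interval matters.
    """
    changes = {}
    for address, value in writes:
        changes[address] = value
    return changes


def apply_changes(state, changes):
    """Return the new data state: the old state overlaid with the changes."""
    new_state = dict(state)
    new_state.update(changes)
    return new_state


# Assumed example data: data state at RP_i and the write accesses up to RP_i+1.
s_i = {0x00: 11, 0x04: 22, 0x08: 33, 0x0C: 44}
writes = [(0x04, 50), (0x0C, 60), (0x04, 51)]

d = record_writes(writes)        # changes for the interval
s_next = apply_changes(s_i, d)   # data state at RP_i+1
```

In this example only two words have to be transferred instead of the four words of the complete data state.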
If an error occurs while the data or changes D.sub.i, i+1 are being transferred into the state save unit using this procedure, the problem arises that the no longer current "old" data state S.sub.i is already partially overwritten by the data or changes of the more current "new" data state S.sub.i+1, but the "new" data state S.sub.i+1 has not yet been completely recorded. There is then no valid data state available in the state save unit.
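The following sketch illustrates this problem for a single, non-duplicated save area that is updated in place. The interruption is simulated with a hypothetical `fail_after` parameter; all names and values are assumptions chosen for illustration.

```python
class TransferFault(Exception):
    """Hypothetical exception: an error occurred during the transfer."""


def transfer_in_place(saved, changes, fail_after=None):
    """Write the changes directly into the single saved copy.

    If fail_after is given, the transfer is interrupted after that many
    words have been written, simulating an error during the transfer.
    """
    for count, (address, value) in enumerate(changes.items()):
        if fail_after is not None and count == fail_after:
            raise TransferFault(f"interrupted after {count} word(s)")
        saved[address] = value


old_state = {0: "a0", 1: "b0", 2: "c0"}   # data state at RP_i
changes = {0: "a1", 2: "c1"}              # changes up to RP_i+1
new_state = {0: "a1", 1: "b0", 2: "c1"}   # data state at RP_i+1

saved = dict(old_state)
try:
    transfer_in_place(saved, changes, fail_after=1)
except TransferFault:
    pass

# The save area now holds a mixture that is neither the old nor the new state.
torn = saved != old_state and saved != new_state
```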
One possibility of avoiding this problem consists in duplicating the state save unit. Such a duplication is already known from Ferridun, A. M.; Shien, K. G., A fault tolerant multiprocessor with rollback recovery capability, Proc. 2nd Intern. Conf. on Distr. Comp. Systems, pages 283-289, 4/81.
While one half of the duplicated state save unit is in each case available for receiving the new data state, for example S.sub.i+1, the "old" data state S.sub.i is stored in the other half. The function of the two halves of the duplicated state save unit (updating, storing) alternates at each recovery point.
Since the saved data state, for example S.sub.i, in one half of the duplicated state save unit is not influenced by filing the new data state S.sub.i+1 in its other half, filing of the new data state S.sub.i+1 can take place during program execution in the interval between recovery points RP.sub.i and RP.sub.i+1 so that, as a rule, no further interruption of program execution is required.
To file the data state S.sub.i+1, the data state S.sub.i can be copied within the state save unit into the respective other half, which is to be updated, while the current changes D.sub.i, i+1 are recorded in that half at the same time. Alternatively, instead of the entire data state S.sub.i, only the changes D.sub.i-1, i made in the preceding interval between recovery points RP.sub.i-1 and RP.sub.i need to be transferred during the current interval from the half containing data state S.sub.i into the half to be updated, because it holds true that:

S.sub.i+1 = S.sub.i-1 + D.sub.i-1, i + D.sub.i, i+1.
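That both ways of bringing the half to be updated up to date lead to the same data state can be checked with a small sketch. As an assumption for illustration, data states are modelled as mappings from address to word and changes as mappings holding the last value written per address; the helper name `apply_changes` and the example values are hypothetical.

```python
def apply_changes(state, changes):
    """Return a copy of state overlaid with the given changes."""
    new_state = dict(state)
    new_state.update(changes)
    return new_state


s_prev = {0: 1, 1: 2, 2: 3, 3: 4}   # S_(i-1), still held by the half to be updated
d_prev = {1: 20, 2: 30}             # changes between RP_(i-1) and RP_i
d_cur = {2: 300, 3: 400}            # changes between RP_i and RP_(i+1)

s_i = apply_changes(s_prev, d_prev)  # held by the storing half

# First possibility: copy the entire state S_i, then record current changes.
via_full_copy = apply_changes(dict(s_i), d_cur)

# Second possibility: transfer only the changes of the preceding interval
# into the half that still holds S_(i-1), then record current changes.
via_changes = apply_changes(apply_changes(s_prev, d_prev), d_cur)

words_copied_full = len(s_i)      # words moved by the first possibility
words_copied_delta = len(d_prev)  # words moved by the second possibility
```

Note that the order matters in this sketch: the changes of the preceding interval must not overwrite current changes to the same address (address 2 in the example).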
For this purpose, however, the memory words or addresses of the state save unit which were modified in the preceding interval must be flagged. This can be done by allocating a separate bit to each memory word of the state save unit. Another bit can be used in a known manner for identifying those memory words of the state save unit half to be updated in which current changes D.sub.i, i+1 have already been made in the respectively current interval. This allows the transfer of changes D.sub.i-1, i within the state save unit and the recording of current changes D.sub.i, i+1 to be nested together, since overwriting of the current changes D.sub.i, i+1 with changes D.sub.i-1, i from the preceding interval can be avoided by checking the other bit. The two bits mentioned change meaning at every recovery point.
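A minimal sketch of such a flagged, duplicated state save unit is given below. It is not taken from the cited publication; it assumes one flag bit per word and per half, so that the flags of the storing half mark the words modified in the preceding interval and the flags of the half being updated mark the words already changed in the current interval. All names (`DuplicatedSSU`, `record_write`, `transfer_word`, and so on) are hypothetical.

```python
class DuplicatedSSU:
    """Two halves with one flag bit per word and half (illustrative model)."""

    def __init__(self, initial_words):
        size = len(initial_words)
        self.halves = [list(initial_words), list(initial_words)]
        # written[h][a]: word a was written while half h was being updated.
        self.written = [[False] * size, [False] * size]
        self.updating = 0  # index of the half currently being updated

    @property
    def storing(self):
        return 1 - self.updating

    def record_write(self, address, value):
        """Record a current change in the half being updated."""
        self.halves[self.updating][address] = value
        self.written[self.updating][address] = True

    def transfer_word(self, address):
        """Transfer one word changed in the preceding interval, if flagged."""
        u, s = self.updating, self.storing
        if self.written[s][address]:
            # Checking the other bit avoids overwriting a current change.
            if not self.written[u][address]:
                self.halves[u][address] = self.halves[s][address]
            self.written[s][address] = False

    def transfer_pending(self):
        return any(self.written[self.storing])

    def recovery_point(self):
        """Swap the roles of the halves; the flag bits change meaning."""
        if self.transfer_pending():
            raise RuntimeError("transfer of preceding changes not finished")
        self.updating = self.storing

    def saved_state(self):
        return list(self.halves[self.storing])


ssu = DuplicatedSSU([0, 0, 0, 0])
ssu.record_write(0, 1)
ssu.record_write(1, 2)
ssu.recovery_point()                  # saved state is now [1, 2, 0, 0]
state_after_first = ssu.saved_state()
ssu.record_write(1, 9)                # current change arrives first
for address in range(4):
    ssu.transfer_word(address)        # nested with the recording of changes
ssu.record_write(3, 4)
ssu.recovery_point()
state_after_second = ssu.saved_state()
```

Because the flags are tied to the halves in this sketch, swapping the roles of the halves at a recovery point automatically swaps the meaning of the two bits, and the flags of the new half to be updated are already cleared by the completed transfer.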
The method described can be used for creating the save copies in the state save unit without any effect on the running of the program if the transfer of all changes D.sub.i-1, i from the preceding interval in the state save unit can be concluded before the next recovery point in each case is reached in program execution. However, this means that the minimum distance between two recovery points is determined by the transfer time.
However, the recovery points cannot easily be provided at arbitrary points in the program and at arbitrary distances from one another. Problems which would arise, for example, during resumption of program execution by the standby processor due to the repetition of output operations to the peripherals or the repetition of inter-process communication can be avoided only by providing recovery points in each case immediately following such operations.
However, this requirement establishes an upper limit for the mutual distance between recovery points. The mutual distance between recovery points is therefore primarily determined by the intensity of the I/O operations required. In particular applications, it can be shorter than the time required for data transfer in the state save unit.
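The conflict between the transfer time and the distance between recovery points can be made concrete with a small calculation. The sketch assumes a constant transfer time per word and uses invented figures; the function name `stall_times` and all numbers are hypothetical.

```python
def stall_times(interval_lengths_us, changed_words, us_per_word):
    """Return, per interval, how long program execution would be held up.

    During interval k the changes made in interval k-1 are transferred
    within the state save unit. If that transfer takes longer than
    interval k itself, the next recovery point has to wait for the rest.
    """
    stalls = []
    for k, length in enumerate(interval_lengths_us):
        pending_words = changed_words[k - 1] if k > 0 else 0
        needed = pending_words * us_per_word
        stalls.append(max(0, needed - length))
    return stalls


# Assumed figures: three intervals, the second kept short by I/O operations.
intervals_us = [100, 40, 100]
changed = [30, 10, 5]
stalls = stall_times(intervals_us, changed, us_per_word=2)
```

With these assumed figures the 30 words changed in the first interval need 60 microseconds for the transfer, but the second interval lasts only 40 microseconds, so execution would be delayed by 20 microseconds at the following recovery point.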