The invention is based on a method and a storage device for saving the computer status during an interrupt. The invention also relates to a use of the memory management unit and of the virtual memory of a computer when saving the computer status.
In principle, three types of interrupt are encountered in computers, with increasing degrees of difficulty in treatment:
The classic program interrupt which is initiated with an interrupt request by an external event and which is serviced by the processor after the current instruction has been completed. The processor hardware ensures that the program counter and the status word are the first to be saved in the main memory so that the processor, after servicing the interrupt, can restore the preinterrupt status.
The instruction interrupt which occurs when it is impossible to finish processing the current instruction, for example in a system having a virtual memory, because of an ineffective access in the physical memory or, in a system with error detection, because of a bus error.
This instruction interrupt forces the processor to jump out of the instruction and to service a program which fetches the missing storage area from the mass memory or repeats the unsuccessful bus access. This is supposed to be followed by a continuation of the interrupted instruction.
Since the processor is interrupted in the middle of an instruction, it must save sufficient information from its internal status to be able to continue the interrupted instruction after the data are made available or after correction. This requires considerably more effort than the classic program interrupt. Such a facility is provided by the MC 68010 microprocessor described in the German journal Elektronik, Volume 22, 1983, pages 75-78.
A system failure is considered to be the most difficult case of an interrupt. Once the failure has been discovered, the processor status can no longer be saved as in the above-mentioned cases, since it is uncertain whether the processor has survived the failure intact. Neither the duration of the failure nor its effects on the computer status are known. The contents of the main memory are considered to be suspect even if all error detection codes of the memory cells are correct, since there is uncertainty concerning the actions carried out by the already damaged processor. In the known microprocessors, no preventive measures are taken against such a failure.
One prior art approach to handling interrupts is described in an article by Kubiak et al under the title "PENELOPE: A RECOVERY MECHANISM FOR TRANSIENT HARDWARE FAILURE AND SOFTWARE ERRORS", Proceedings of the 12th International Conference on Fault Tolerant Computing FTCS-12, 1982, pages 127-132. In this approach, a device is inserted between a processor and its main memory which enables a previous operating state of both the memory and the processor to be restored in the case of a failure. This previous state is defined by a recovery point.
This device is connected to the processor bus and traces the bus accesses by means of which the processor changes the main memory status (write cycles). Before each modification of a variable in the main memory, this device, called "save stack", addresses the variable concerned, reads the previous content and saves it in a stack.
At the beginning of a program section, that is at a recovery point, the save stack is empty. If a failure should occur before the next recovery point is reached, it should be possible to restore the status which the computer had at this recovery point. After a failure is detected, the processor is reinitialized and the contents of the save stack are written back into the main memory in the reverse order. This gives all variables the value they had at the last recovery point when the save stack was empty, even if this value has in the meantime been modified several times.
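The save stack principle described above can be illustrated by a minimal software sketch. It is not taken from the Kubiak et al article; the class name SaveStack, its method names and the modeling of the main memory as a Python list are assumptions made purely for illustration, whereas the device described is hardware connected to the processor bus.

```python
class SaveStack:
    """Toy model of a save stack that observes write cycles (hypothetical names)."""

    def __init__(self, memory):
        self.memory = memory   # main memory, modeled here as a list of words
        self.stack = []        # (address, previous content) pairs

    def recovery_point(self):
        # At a recovery point the save stack is empty.
        self.stack.clear()

    def write(self, address, value):
        # Saving before updating: read the previous content and push it
        # onto the stack before the variable is modified.
        self.stack.append((address, self.memory[address]))
        self.memory[address] = value

    def recover(self):
        # Write the saved contents back in reverse order, so that the
        # oldest saved value of each variable is the one that remains.
        while self.stack:
            address, previous = self.stack.pop()
            self.memory[address] = previous


memory = [10, 20, 30, 40]
save_stack = SaveStack(memory)
save_stack.recovery_point()

save_stack.write(1, 21)   # first modification of the variable at address 1
save_stack.write(1, 22)   # second modification of the same variable
save_stack.write(3, 99)
modified = list(memory)   # memory contents at the moment of the failure

save_stack.recover()      # a failure is assumed to have been detected here
restored = list(memory)   # memory contents after writing back
```

In this sketch the variable at address 1 is modified twice, yet writing back in reverse order returns it to the value it had at the recovery point, because the oldest saved content is written last.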
The Kubiak et al publication is based on an earlier paper by P. A. Lee, N. Ghani, K. Heron, "A recovery cache for the PDP-11", Proceedings of FTCS 9, Madison, 1979, pages 3-8. Other, earlier inventors have used a cache memory for the same method, without departing from the principle called "Saving before updating".
The "Saving before updating" method has three decisive disadvantages:
1. Protection exists only against processor malfunctions which have no ambiguous effects on the main memory. Ambiguous effects arise, for example, from marginal or false control signals which result in a modification of the address or of the data without this information being supplied to the save stack.
2. Errors or losses of the main memory itself cannot be corrected. After the failure, it is assumed that all storage locations not addressed for writing since the last recovery point are actually still intact.
3. The save stack can be used only once for recovery. Should a failure occur during the writing back, the previous contents of the main memory are no longer recoverable.