This invention is directed generally to the handling of errors or faults in a computer system operating under the control of a software program. More specifically, this invention addresses how different error recovery actions are selected, especially when multiple concurrent faults are detected.
Error handling methods in a microprocessor environment normally utilize the interrupt vectors supported by the microprocessor. Upon receipt of an interrupt vector associated with detection of a fault, the microprocessor interrupts the currently executing program and executes an alternative program corresponding to the interrupt vector. Faults can be detected by known software and hardware detection techniques. In a typical error handling method, the type and source of the fault determine which error handling process is executed, by selecting the address associated with the desired error handling routine.
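The vector-based selection described above can be illustrated with a minimal sketch. It is not taken from any particular microprocessor: the fault types, fault sources, handler names, and the `VECTOR_TABLE` mapping are all hypothetical, and a dictionary lookup stands in for the hardware's selection of a handler address.

```python
from typing import Callable, Dict, Tuple

# Hypothetical key: (fault type, fault source). The source text names
# no specific fault types, so these values are illustrative only.
FaultKey = Tuple[str, str]


def handle_memory_parity() -> str:
    return "memory parity recovery"


def handle_io_timeout() -> str:
    return "I/O timeout recovery"


def handle_default() -> str:
    return "default recovery"


# Stand-in for an interrupt vector table: each entry plays the role of
# the address of an error handling routine (assumed layout).
VECTOR_TABLE: Dict[FaultKey, Callable[[], str]] = {
    ("parity", "memory"): handle_memory_parity,
    ("timeout", "io"): handle_io_timeout,
}


def dispatch(fault_type: str, source: str) -> str:
    """Select and run the routine associated with the detected fault."""
    handler = VECTOR_TABLE.get((fault_type, source), handle_default)
    return handler()
```

In this sketch a call such as `dispatch("parity", "memory")` runs the routine registered for that fault, and any unregistered combination falls through to the assumed default routine.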
Error handling techniques are important in large, complex systems that operate under software control, since hardware devices and different programs are utilized concurrently. Error handling and recovery techniques become critical in systems where uninterruptible service must be provided, such as in a telecommunications switch environment. In a known recovery technique used in complex systems, error recovery routines and a selection routine that controls which recovery routine to execute have been combined into a sequentially coded, integral error handling system. Such a technique performs best when faults occur sequentially, so that only a single fault has to be dealt with at a time. However, in complex systems concurrent faults occur, and error handling priorities must be assigned to resolve the order in which the errors are addressed. When a change is needed in the order in which concurrent errors are to be handled, the priorities must be amended. The sequentially coded, integral error handling system must then be carefully reviewed and tested to ensure that the changes to the order of handling errors have not introduced an error condition in the error handling system itself. In addition, unneeded error handling routines are often executed where handling one concurrent error would be sufficient to eliminate a fault that gave rise to the other concurrent errors.
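The drawbacks of the known technique can be illustrated with a minimal sketch of what a sequentially coded, integral handler might look like. The error names and recovery actions are hypothetical and chosen only for illustration; the point is that the priority order exists solely as the order of the statements, and that every matching routine runs.

```python
from typing import List, Set


def integral_error_handler(active_errors: Set[str]) -> List[str]:
    """Selection logic and recovery actions interleaved in one routine.

    Returns the recovery actions executed, in order. All error and
    action names are assumptions made for this sketch.
    """
    executed: List[str] = []

    # Priority is implied only by statement order. Changing the order
    # means editing and retesting this entire routine.
    if "power_fault" in active_errors:
        executed.append("reset_power_unit")

    # Suppose a power fault also raises the two errors below. Their
    # routines still run, even if resetting the power unit alone would
    # have eliminated the underlying fault.
    if "link_down" in active_errors:
        executed.append("restart_link")

    if "checksum_error" in active_errors:
        executed.append("retransmit_data")

    return executed
```

With all three hypothetical errors active at once, this sketch executes all three recovery actions in the fixed order, which mirrors both problems noted above: the order cannot be changed without modifying the handler itself, and routines for secondary errors are executed unnecessarily.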
There exists a need for an improved error handling system that allows changes to be made in the order of handling concurrent errors with a minimum of testing. A need also exists for an error handling system which allows an error to be ignored where the handling of another concurrent error is sufficient to address the fault associated with the ignored error.