This invention relates generally to error processing in fault tolerant computing systems, and more specifically to methods and apparatus for processing errors in such systems which have dual processors operating in synchronism.
Processing errors in a data processing system involves three steps. The first is the detection of the error. The second is the recovery from the error. The third is recording information about the error. Another concern in a redundant processing environment is returning the system to full redundancy after repair.
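The three steps above can be illustrated with a minimal sketch. It is not the apparatus described here. It assumes, purely for illustration, that an error is detected by comparing the outputs of two processors executing the same step, and that recovery consists of retrying the step. The names `ErrorRecord`, `process_step`, `run_a`, `run_b`, and `max_retries` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ErrorRecord:
    """Information recorded about one detected error (hypothetical layout)."""
    step: int
    output_a: int
    output_b: int
    recovered: bool


def process_step(
    step: int,
    run_a: Callable[[], int],
    run_b: Callable[[], int],
    log: List[ErrorRecord],
    max_retries: int = 1,
) -> Optional[int]:
    """Run one step on both processors; detect, recover from, and record errors."""
    a, b = run_a(), run_b()

    # Step 1: detection. Assumed here to be a comparison of the two outputs.
    if a == b:
        return a

    # Step 2: recovery. Assumed here to be a bounded retry of the same step.
    result: Optional[int] = None
    for _ in range(max_retries):
        retry_a, retry_b = run_a(), run_b()
        if retry_a == retry_b:
            result = retry_a
            break

    # Step 3: recording. The original mismatch is logged whether or not
    # the retry succeeded.
    log.append(ErrorRecord(step, a, b, recovered=result is not None))
    return result
```

In this sketch a transient fault on one processor is retried and logged as recovered, while a persistent mismatch returns `None` and is logged as unrecovered, leaving the caller to decide how to proceed.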
In a fault tolerant computing system, errors are more costly than in a standard non-fault tolerant computing system. This is because fault tolerant computer systems are typically employed in environments where the cost of any downtime is high in terms of either money or safety. Therefore, error processing is an extremely important operation for such systems.
As important as error processing is to a fault tolerant system, it is desirable that such processing not delay the execution of normal data processing operations unnecessarily. Thus a balance must be struck between the desire for efficient processing and the need for effective error processing.
Certain conventional fault tolerant computer systems suspend all operations upon the detection of an error in order to execute software error recovery procedures. Software error recovery procedures, however, can be complex and involved. Usually such procedures take considerable time and force the system to interrupt a potentially crucial task.
Accordingly, it is desirable to construct a system which spends a minimal amount of time executing the software operations necessary for error handling. Minimizing that time reduces the effect of software error handling on normal data processing operations.
It is also desirable to handle as many errors as possible in hardware, so that error recovery is transparent to the software processes whose data processing instructions the computer system is executing.