This invention relates to a method and system for providing a fault tolerant multiprocessor computer system.
There are computer system applications where it is important that data processing not be interrupted. Examples of such applications are found in the financial industry, in critical industrial facilities such as nuclear plants and, in general, in those situations where failure of the computer system will cause serious disruption.
Fault tolerant computer systems have been built with varying degrees of redundancy, providing duplicate systems or system components so that data processing can continue in the event of some failure. Several approaches to achieving a fault tolerant computer system may be used. In one approach, multiple multiprocessors, each with its own memory, conduct independent tasks. In another approach, multiple multiprocessors share a common memory and conduct independent tasks. Another approach is to use two or more microprocessors, each having its own memory and conducting identical tasks in unison. Yet another approach would be the use of two or more multiprocessors sharing a common memory and conducting identical tasks in unison.
Fault tolerant computer systems using a mixture of the four approaches are also possible. In one prior art fault tolerant system, four central processing units are used, with two operating in duplex fashion on a first board and two operating in duplex fashion on a second board. A comparator is used to detect whether the outputs from each board are equal or unequal. If the comparator detects an inequality, the defective board is automatically shut down and the output is thereafter provided by the other board.
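By way of illustration only, the duplexed-board arrangement described above may be modeled in software as in the following minimal sketch. The description does not state where the comparator is located, so the sketch assumes one comparator per board that checks the two central processing units on that board against each other. The names Board, system_output, good_cpu and faulty_cpu are hypothetical and are not taken from any actual prior art system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical model: a CPU is a function mapping an input word to an output word.
Cpu = Callable[[int], int]


@dataclass
class Board:
    """Two CPUs operating in duplex, checked by an assumed per-board comparator."""
    name: str
    cpu_a: Cpu
    cpu_b: Cpu
    online: bool = True

    def step(self, word: int) -> Optional[int]:
        """Return the board output, or None if the board is or becomes shut down."""
        if not self.online:
            return None
        out_a, out_b = self.cpu_a(word), self.cpu_b(word)
        if out_a != out_b:
            # The comparator detected an inequality: shut this board down.
            self.online = False
            return None
        return out_a


def system_output(boards: List[Board], word: int) -> int:
    """Step every board, then use the output of the first board still online."""
    results = [board.step(word) for board in boards]
    for result in results:
        if result is not None:
            return result
    raise RuntimeError("all boards have shut down")


def good_cpu(word: int) -> int:
    return word + 1


def faulty_cpu(word: int) -> int:
    # Simulated fault: produces a wrong result once the input reaches 3.
    return word + 1 if word < 3 else 0


first = Board("first", good_cpu, faulty_cpu)
second = Board("second", good_cpu, good_cpu)
outputs = [system_output([first, second], w) for w in range(5)]
```

In this sketch the first board is shut down when its two processors disagree on the fourth input, and the second board supplies the output from that point on, so the sequence of system outputs is uninterrupted.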
Prior art fault tolerant computer systems, however, while offering various degrees of fault tolerance, do not meet the objectives or provide the advantages resulting from the present invention.