1. Field of the Invention
The invention is related to Multiple Computer Systems, and in particular to Fault-Tolerant Multiple Computer Systems not having multiple Computers performing each system function.
2. Prior Art
The earliest attempts to produce Fault-Tolerant Control Systems provided redundant computers in which each computer simultaneously executed every task required for the control operation. Voting circuits monitoring the outputs of the multiple computers determined the "correct" system output, the "correct" system output being the output produced by the majority of the computers. When a faulty computer produced an output which differed from the "voted" output, the differing output was discarded and did not affect the "voted" or "correct" output of the control system. In this type of Fault-Tolerant System, the failure of a computer may or may not have been detected, and the faulty computer may or may not have been turned "off".
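The voting scheme described above can be illustrated with a minimal sketch. It assumes each redundant computer reports one comparable output value per cycle. The function names `majority_vote` and `dissenters` are hypothetical and are not taken from any of the systems discussed.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the output produced by a strict majority of computers.

    Returns None when no single value is produced by more than half
    of the computers (an assumption; the text does not say what a
    voting circuit does in that case).
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def dissenters(outputs: Sequence[Hashable], voted: Hashable) -> List[int]:
    """Indices of computers whose output differs from the voted output."""
    return [i for i, out in enumerate(outputs) if out != voted]


# Three redundant computers; the third one is faulty.
outputs = [7, 7, 9]
voted = majority_vote(outputs)
faulty = dissenters(outputs, voted)
```

In this sketch the differing output is simply ignored by the vote. Whether the indices returned by `dissenters` are acted upon corresponds to the observation that the faulty computer may or may not be detected and turned "off".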
This method, though highly successful, is expensive since it requires multiple equivalent computers, each simultaneously performing the same function. These systems require relatively powerful computers, since each computer has to perform every task required for the operation of the system.
As an alternative, a master-slave concept was introduced in which the operation of several computers was coordinated through a master control. The master designated which tasks were to be executed by the individual computers. This reduced the execution time of the control operation since the good computers were no longer required to execute each and every task. When a fault was detected in the operation of one of the computers, that computer was disconnected and the master redistributed the tasks among the good or operative computers. The master-slave concept is dependent upon the continued operation of the master: if the master fails, the system fails. This situation may be rectified by using redundant masters; however, the increased cost of redundant masters limits the applicability of these types of systems to situations where the user is willing to pay for the added reliability, such as space exploration, nuclear energy facilities, or any other situation where failure of the system would endanger lives.
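The master-slave arrangement can be sketched as follows. This is an illustration only: the round-robin assignment policy, the class name `Master`, and the method `report_fault` are assumptions made for the example, since the text does not specify how a master distributed tasks.

```python
from typing import Dict, List, Sequence


def assign_tasks(tasks: Sequence[str], computers: Sequence[str]) -> Dict[str, List[str]]:
    """Distribute tasks over the operative computers (round-robin, assumed)."""
    if not computers:
        raise RuntimeError("no operative computers remain")
    assignment: Dict[str, List[str]] = {c: [] for c in computers}
    for i, task in enumerate(tasks):
        assignment[computers[i % len(computers)]].append(task)
    return assignment


class Master:
    """Single master that designates which computer executes which task."""

    def __init__(self, tasks: Sequence[str], computers: Sequence[str]) -> None:
        self.tasks = list(tasks)
        self.operative = list(computers)
        self.assignment = assign_tasks(self.tasks, self.operative)

    def report_fault(self, computer: str) -> None:
        """Disconnect a faulty computer and redistribute all tasks."""
        if computer in self.operative:
            self.operative.remove(computer)
            self.assignment = assign_tasks(self.tasks, self.operative)


master = Master(["t1", "t2", "t3", "t4"], ["A", "B", "C"])
before = dict(master.assignment)
master.report_fault("B")
after = dict(master.assignment)
```

Each task runs on exactly one computer, which is the source of the reduced execution time. The `Master` object is also the single point of failure noted above: if it stops, no computer receives an assignment.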
Recent efforts to improve upon master-slave and redundant execution Fault-Tolerant Multiple Computer Systems are exemplified in the October, 1978 Proceedings of the IEEE, Volume 66, No. 10, which is dedicated to fault-tolerant control systems. Of particular interest are the papers entitled "Pluribus: An Operational Fault-Tolerant Multiprocessor" by D. Katsuki et al., pp. 1146-1159 and "SIFT: The Design and Analysis of A Fault Tolerant Computer for Aircraft Control" by J. H. Wensley et al., pp. 1240-1255. The Pluribus and SIFT control systems are believed to represent the present state of the art. The SIFT system uses redundant execution of each system task, and of the master control functions. The Pluribus system has a single "master" copy of most current information, which can be lost when a fault occurs. Such loss of current information can cause interruption of system operation for several seconds or minutes.