This invention relates to a method and apparatus for detecting the absence of selected synchronism of two or more digital logic devices, when each is performing correct substantive operation.
The invention is useful in digital logic equipment and systems to detect when two or more operating elements, which normally operate with selected synchronism, lose that synchronism even though each is otherwise performing correct logic operation. The performance of correct logic operation by a digital logic device, even with a loss of selected synchronism relative to another digital logic device, is herein termed correct substantive operation.
One application of the invention is in a fault-tolerant computer system where, for example, two central processor units normally operate identically and in lock-step synchronism. If either processor unit fails, the other continues to operate and keeps the system running without interruption.
It is known to check the operation of each such CPU device in a system of this kind, and to disable one if it fails. U.S. Pat. No. 4,453,215 describes a digital logic system of this kind, and Stratus Computer of Marlboro, Mass., USA manufactures such fault tolerant computer equipment.
Further teachings regarding digital computer systems that employ redundant structure for increasing reliability include U.S. Pat. No. 4,428,044 of Liron and U.S. Pat. No. 4,228,496 of Katzman. See also, Rennels "Architecture for Fault-Tolerant Spacecraft Computers," Proceedings of the I.E.E.E., Vol. 66, No. 10, pp. 1255-1268 (1978).
In addition to a failure in logic operations, two or more digital processor or other logic devices, which normally operate with a common clock for a designated synchronism, are subject to loss of the designated synchronism even while continuing correct logic operation. This fault condition can go undetected, so that each device continues to operate, even though it is producing output information out of step with the other. The system then is likely to produce faulty data, and eventually to fail completely.
The loss of synchronism between the two processor devices can occur, for example, due to spurious signals that actuate one device but not the other. Two nominally identical devices also can fail because they in fact operate with different speeds due, for example, to design flaws and to variations in the components they employ.
Hence, prior fault-tolerant computer systems have been subject to failure by loss of specified synchronism, even when all devices are performing substantively correct operation.
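By way of illustration only, the fault condition described above can be modeled in a few lines of Python. This is a simplified sketch and not the apparatus of the invention: the function names, the squaring "logic operation," and the representation of a spurious signal as an extra clock edge within one tick are all assumptions chosen for the example.

```python
def run_device(edges_per_tick):
    """Model one logic device (hypothetical, for illustration).

    edges_per_tick[i] is the number of clock edges the device sees during
    system tick i: 1 in normal operation, 2 when a spurious signal
    actuates the device an extra time.
    Returns the (step, value) pair the device presents at the end of each tick.
    """
    step = 0
    trace = []
    for edges in edges_per_tick:
        step += edges                      # every edge advances the device one step
        trace.append((step, step * step))  # assumed logic operation: square the step
    return trace


def substantively_correct(trace):
    """Per-device check: each output is the right value for the device's own step."""
    return all(value == step * step for step, value in trace)


def first_loss_of_sync(trace_a, trace_b):
    """Cross-device check: first tick at which the two outputs differ, else None."""
    for tick, (out_a, out_b) in enumerate(zip(trace_a, trace_b)):
        if out_a != out_b:
            return tick
    return None


device_a = run_device([1, 1, 1, 1, 1])  # normal operation
device_b = run_device([1, 1, 2, 1, 1])  # one spurious edge during tick 2
```

In this model both `substantively_correct(device_a)` and `substantively_correct(device_b)` hold, so a check applied to each device separately reports no fault, while `first_loss_of_sync(device_a, device_b)` identifies tick 2 as the point at which the devices fall out of step.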
It accordingly is an object of this invention to provide a method and apparatus for providing an improved level of fault tolerance in digital logic systems.
A particular object is to provide a method and apparatus for detecting a loss of prescribed synchronism in the operation of two or more digital logic devices, even when each is providing substantively correct operation.
Other objects of the invention will in part be obvious and will in part be set forth hereinafter.