Digital computer systems are used in a number of applications in which virtually continuous, error-free operation is important to the operation of businesses or other entities using the systems. For example, in banking, computer systems are used to maintain account information and update account balances, and it is important for a bank to be able to provide accurate and up-to-date account information virtually instantaneously. Similarly, computers are used to monitor and control airplane traffic, and around crowded airports and along major air corridors it is vital that the computers be configured so that the air traffic control systems are continuously available. Computers are also used to control switching systems for the public telephone system, and it is similarly important that the computers be configured so that the telephone systems are continuously available.
It is generally possible to build computer systems which have extremely reliable components to accomplish tasks such as these and numerous others, and to provide preventive maintenance in such a way and with such frequency that failures are extremely improbable. However, such high-reliability computer systems would be extremely expensive to build and maintain. Accordingly, "fault-tolerant" computer systems have been developed, which are generally designed with the expectation that one or more elements of the system may fail at some point in its operation, but that if an element does fail, other elements are available to detect the failure and ensure that the system will continue to give proper results. Such fault-tolerant computer systems will generally be much less expensive to build and maintain, since they may be constructed of components which individually are of lower reliability, and thus of lower cost, than those of high-reliability computer systems, and since their maintenance costs would also be lower. Fault-tolerant computer systems generally include redundant components which operate in parallel, and when a fault is detected in one component the other components are available to continue operation. A number of schemes may be used to detect a fault, such as fault detection circuitry which can detect certain types of faults. In addition, if a fault-tolerant system includes at least, for example, three processing components operating in parallel, the system can compare the outputs of the three components. If the outputs of two of the processing components agree but the output of the third processing component differs from that of the other two, the system can infer with a high degree of confidence that the third processing component is faulty and its output should be ignored, and that the outputs from the two processing components which agree with each other are correct and should be used.
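The output comparison described above can be illustrated with a minimal software sketch. It is offered only as an illustration, under the assumption that each processing component's output can be represented as a comparable value; the function name `majority_vote` and its return convention are hypothetical and do not come from any particular system, and an actual fault-tolerant system would typically perform this comparison in hardware.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def majority_vote(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Compare the outputs of redundant processing components.

    Hypothetical helper: returns (agreed_value, suspect_indices), where
    suspect_indices lists the components whose output differs from the
    majority. If no strict majority exists, agreed_value is None and
    every component is treated as suspect.
    """
    if len(outputs) < 3:
        raise ValueError("at least three outputs are needed to outvote a faulty one")

    # Find the most common output and how many components produced it.
    value, count = Counter(outputs).most_common(1)[0]

    # Require a strict majority before trusting the result.
    if count <= len(outputs) // 2:
        return None, list(range(len(outputs)))

    # Components that disagree with the majority are inferred to be faulty.
    suspects = [i for i, out in enumerate(outputs) if out != value]
    return value, suspects


# Two components agree and the third differs, so the third is flagged.
agreed, suspects = majority_vote([42, 42, 7])
```

With three components, a single faulty output is outvoted by the two that agree. If all three outputs differ, the sketch reports no agreed value rather than guessing which one is correct.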
In order to ensure that the processing components of a fault-tolerant computer system are operating in parallel, so that their outputs can be properly compared, it is generally desirable that they all operate on the same program and data, and that they be synchronized by a common clock. Since a single clock may itself fail, which would cause the entire system to stop operating, redundancy may also be provided in the clocking system, with the redundancy being resolved at each of the elements which use the clock. However, certain types of failures in the clocking system can cause undesirable noise, which can in turn cause the processing components of the system to operate improperly.