Digital fault tolerance provisions are needed because of the serious consequences of system failure. The common way of achieving such fault tolerance is through redundancy of systems or subsystems, such as computers and sensors, which must have synchronized time bases. It is essential, however, that the synchronization of the time bases itself be fault tolerant. In addition, the method or device implementing fault tolerant synchronization should incorporate latent fault detection and reporting, so that maintenance and troubleshooting can help prevent system failures.
In the case of a parallel set of redundant subsystems, such as computers, it is important that none strays too far ahead of or behind the others in its processing tasks. Although synchronization is required among the subsystems, such synchronization must be fault tolerant and must include latent fault detection and reporting. Further, series connections of redundant components (for instance, redundant sensors in series with redundant processors) require synchronization in order to communicate information.
A variety of fault tolerant synchronization schemes exist. Some rely on computations and are suitable for synchronizing low frequency clocks, whose period is very long relative to the time required for the computations, so that the skew induced by variations in the synchronization mechanism itself is acceptable. Such schemes may be implemented in software; however, this approach may be incompatible with the architecture of some digital systems, in which computers may be present but are dedicated to various other tasks. In such situations, adding one or more general purpose computers solely to perform the synchronization task may be very inefficient compared with using dedicated clock circuitry.
There are approaches that provide fault-tolerant clocks with minimal dedicated circuitry. Most of these designs are only one-fail operable, that is, they remain operable after a single failure but not after a second. The exception is one approach that provides an algorithm (not a circuit) achieving N-fail operability with 3N+1 clock modules.
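The 3N+1 relationship can be illustrated with a short sketch. The function names below (`modules_required`, `faults_tolerated`, `fault_tolerant_midpoint`) are hypothetical and are not taken from any of the designs referred to above. The midpoint vote is a generic software technique, assumed here only as an example of how an algorithm might discard up to N faulty clock readings; it is not presented as the algorithm of the N-fail operable approach.

```python
from typing import Sequence


def modules_required(n_faults: int) -> int:
    """Clock modules needed to stay operable after n_faults faults (3N+1 rule)."""
    if n_faults < 0:
        raise ValueError("n_faults must be non-negative")
    return 3 * n_faults + 1


def faults_tolerated(n_modules: int) -> int:
    """Largest N such that 3N+1 does not exceed n_modules."""
    if n_modules < 1:
        raise ValueError("n_modules must be at least 1")
    return (n_modules - 1) // 3


def fault_tolerant_midpoint(readings: Sequence[float], n_faults: int) -> float:
    """Assumed voting rule: drop the n_faults lowest and n_faults highest
    readings, then return the midpoint of the remaining extremes."""
    if len(readings) < modules_required(n_faults):
        raise ValueError("too few readings for the requested fault tolerance")
    ordered = sorted(readings)
    kept = ordered[n_faults:len(ordered) - n_faults]
    return (kept[0] + kept[-1]) / 2.0
```

Under these assumptions, a one-fail operable clock needs four modules, and a reading from a single wildly wrong module (for example, 500.0 among values near 10.0) is discarded before the midpoint is taken.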
No known design includes latent fault detection. Without latent fault detection, a fault tolerant clock will continue to operate in the presence of one or more latent faults, but eventually it will be vulnerable to failure from a single additional fault. That is, the clock eventually ceases to be fault tolerant, and this condition goes undetected. This is unacceptable for a fault tolerant clock.
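The erosion of fault tolerance by undetected faults can be made concrete with a small sketch. It assumes a simplified model, not stated in the text, in which a clock built from 3N+1 modules tolerates N faults and each latent fault consumes one unit of that margin. The function name `remaining_fault_margin` is hypothetical.

```python
def remaining_fault_margin(n_modules: int, latent_faults: int) -> int:
    """Additional faults the clock can still absorb, under the assumed
    model that a clock of n_modules tolerates (n_modules - 1) // 3 faults
    and every latent fault uses up one unit of that tolerance."""
    if n_modules < 1 or latent_faults < 0:
        raise ValueError("need n_modules >= 1 and latent_faults >= 0")
    design_tolerance = (n_modules - 1) // 3
    return max(design_tolerance - latent_faults, 0)


# A seven-module clock (N = 2) with undetected faults accumulating over time.
for latent in range(3):
    margin = remaining_fault_margin(7, latent)
    status = "fault tolerant" if margin > 0 else "one added fault causes failure"
    print(f"latent faults: {latent}, remaining margin: {margin} ({status})")
```

In this model the clock keeps running normally at every step, which is why, without detection and reporting, nothing signals that the margin has reached zero.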