This invention relates to a tightly synchronized fault tolerant clock.
It is desirable that airborne electronic (avionic) equipment should have a very high level of reliability. This may be achieved by use of a fault tolerant architecture. One technique for achieving fault tolerance is redundancy. Redundancy at component level, i.e. interdependent multiple clock channels within a circuit, rather than at system level, is needed to obtain the required reliability. In a fault tolerant computing system that comprises multiple processors operating in lockstep and uses redundant clocks, the clocks must be synchronized in order for the computing system to be able to effectively compare data and mask out faults. A fault tolerant clock must be extremely reliable to meet the reliability requirement for its host fault tolerant computer. To maintain the synchronization and reliability of the clock, it is important that the design be simple and require a minimal number of components.
Typically, to tolerate a single fault, a fault tolerant clock has three or more clock channels, each comprising an oscillator having a feedback path that contains a majority voter. The majority voter receives the outputs of all channels and provides a clock output signal that reflects the state of the majority of the channel outputs. A particularly troublesome failure in a fault tolerant clock of this type is the malicious or Byzantine fault, which occurs when one channel is faulty in such a way that it broadcasts different signals to, or is perceived differently by, the other channels, thus causing inconsistency in voter outputs. In a loosely synchronized system, a substantial amount of additional circuitry is needed to eliminate this special fault. One method is quadrature mode redundancy, employing four channels with outputs from three channels feeding back to the voter of the fourth channel. This increase in circuitry and complexity can reduce the overall reliability of the system.
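The voting behavior described above can be illustrated with a minimal sketch. It models a three-channel arrangement in software only; the function names `majority` and `voter_outputs`, and the particular signal values, are assumptions chosen for illustration and are not taken from any of the cited systems.

```python
def majority(a: int, b: int, c: int) -> int:
    """Return the binary value held by at least two of the three inputs."""
    return (a & b) | (b & c) | (a & c)


def voter_outputs(views):
    """views[i] is the triple of channel outputs as perceived by voter i."""
    return [majority(*view) for view in views]


# Ordinary fault: channel 2 is stuck at 0 and every voter sees the same thing.
# The two healthy channels outvote it, so all voters agree.
consistent_fault = [(1, 1, 0), (1, 1, 0), (1, 1, 0)]

# Malicious (Byzantine) fault, hypothetical values: the healthy channels 0 and 1
# momentarily differ because of skew, and faulty channel 2 is perceived as 1 by
# voters 0 and 2 but as 0 by voter 1. The voters no longer agree.
byzantine_fault = [(1, 0, 1), (1, 0, 0), (1, 0, 1)]

print(voter_outputs(consistent_fault))
print(voter_outputs(byzantine_fault))
```

In this sketch the inconsistency appears only while the two healthy channels disagree, which is one way to see why the width of the skew window matters for this class of fault.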
In general, the prior art in fault tolerant clocking systems exhibits the properties of being loosely synchronized to approximately 25% to 50% of the clock period and only operational at frequencies that are currently considered relatively low, e.g. less than 10 MHz. Such systems also utilize extensive hardware and in some cases involve software. The system taught in U.S. Pat. No. 4,644,498 employs a triple modular redundant architecture. Each channel of this system utilizes a majority voter network in the feedback loop to establish and maintain the synchronization of the oscillator. As with most systems employing a majority voter network to both establish and maintain synchronism, the result is a loosely synchronized system. As a result of this loose synchronization, the system will have erroneous outputs for certain fault inputs, i.e. duty cycle changes or glitches. Also, a bridging fault to two or more inputs of any voter will fail the whole system. The operating frequency of this system is only 5 MHz.
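To give a sense of scale, the figures above can be converted into absolute time. The following is a small arithmetic sketch; the helper name `skew_window_ns` is hypothetical, and the calculation assumes only that the stated 25% to 50% applies to one full clock period.

```python
def skew_window_ns(freq_hz: float, fraction: float) -> float:
    """Channel-to-channel skew, in nanoseconds, for a given fraction of the period."""
    period_ns = 1e9 / freq_hz
    return period_ns * fraction


for freq_hz in (5e6, 10e6):
    low = skew_window_ns(freq_hz, 0.25)
    high = skew_window_ns(freq_hz, 0.50)
    print(f"{freq_hz / 1e6:.0f} MHz: skew of about {low:.0f} to {high:.0f} ns")
```

At 10 MHz the period is 100 ns, so loose synchronization of this kind corresponds to roughly 25 to 50 ns of skew between channels; at 5 MHz the same fractions correspond to roughly 50 to 100 ns.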
In U.S. Pat. No. 4,239,982 a fault tolerant clocking system is taught that requires 2M+2 channels to tolerate faults in M channels. This system also employs complex phase locking circuitry, which leads to low reliability.
U.S. Pat. No. 3,769,607 shows a switched oscillator clock pulse generator. This system uses an extensive amount of hardware in its master selector circuit for determining the master oscillator. A substantial amount of hardware in the error circuitry lies between the oscillator and output pulse. This downstream circuitry reduces the synchronization between the individual channels. This system also utilizes a microprocessor for adjusting the algorithm that controls the error indicators in the flip flops and for selecting the master oscillator. A single point failure may occur when the master clock itself is in error.
U.S. Pat. No. 3,278,852 shows a redundant clock pulse source utilizing an architecture in which switching and decision making elements are alternated. This architecture also relies on a majority voter to establish and maintain synchronization. This results in loose synchronization, which may cause errors for some inputs. For example, if any two inputs of any decision element are shorted together, all outputs will be faulty. This is an example of a single point failure. This system cannot tolerate a malicious fault.
U.S. Pat. No. 3,599,111 shows a system comprising three oscillator circuits driven by a common sync-pulse generating oscillator. Although a backup oscillator is provided, this single oscillator provides a common point failure. In addition, when the first oscillator fails, a transient occurs due to the switching from the oscillator to the backup oscillator. There is no feedback to the crystal oscillator, and synchronization is provided by a majority voter network, resulting in loose synchronization.
U.S. Pat. No. 3,900,741 teaches a fault tolerant clock apparatus utilizing a controlled minority of clock elements. This system requires 3M+1 channels to tolerate M faults. No phase locking circuitry exists, which limits the system to use at relatively low frequencies. A complicated quorum logic network is used instead of a simple majority voter network.
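The channel counts quoted for U.S. Pat. No. 4,239,982 (2M+2) and U.S. Pat. No. 3,900,741 (3M+1) can be compared with a short sketch. The function names are hypothetical, and the code simply evaluates the two expressions as stated in the text; it does not model either system.

```python
def channels_2m_plus_2(m: int) -> int:
    """Channels needed to tolerate faults in m channels under the 2M+2 rule."""
    return 2 * m + 2


def channels_3m_plus_1(m: int) -> int:
    """Channels needed to tolerate m faults under the 3M+1 rule."""
    return 3 * m + 1


print("M  2M+2  3M+1")
for m in range(1, 4):
    print(f"{m}  {channels_2m_plus_2(m):4d}  {channels_3m_plus_1(m):4d}")
```

For a single fault (M = 1) both rules call for four channels, and the hardware cost grows with each additional fault to be tolerated (six versus seven channels for M = 2, eight versus ten for M = 3).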
The current state of technology in fault tolerant clocking is a system that utilizes extensive hardware, provides only loose synchronization, is not effectively operable at high frequencies, and does not tolerate multiple failures within a single chip, considered herein as a single point failure. The component count, and especially the number of components "downstream" from the output of the oscillator, is very important in maintaining tight synchronization between the three channels. When dependent on a majority voter for synchronization, such a system provides only loose synchronization, which renders the system sensitive to glitches and malicious faults. Such a conventional system also does not tolerate multi-pin failures within a single channel, e.g. if all of the inputs to a voter were shorted to ground.