This invention relates generally to the synchronization of computers in a multiply redundant processing system, and more particularly to methods and apparatus for frame synchronizing such computers.
Fault-tolerant processing systems are employed wherever critical processes are controlled, to ensure continued correct system operation in the presence of hardware or component failures. A common way of achieving fault-tolerant processing is to employ replicated processing systems operating in parallel on the same task and to vote on their outcomes so that failures can be masked. Such systems are referred to as multiply redundant processing systems, and typically employ three or more computer systems operating in parallel. In order for the voting to be meaningful, it is necessary that each processing system operate on the same or similar data. If bounds are prescribed for variations in the data, then identical data need not be used in all processors. Nevertheless, in order for the data to satisfy the prescribed bounds criteria, the processing systems must be synchronized in some fashion. If three processors, for example, are to read and subsequently operate on a changing input variable, they must all read the input variable at substantially the same time in order to arrive at identical or at least similar results. The systems may be synchronized in either an instruction-synchronous manner or in a frame-synchronous manner.
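By way of illustration only, voting under a prescribed bound may be sketched as follows. The sketch assumes mid-value selection over numeric outputs, which is one common voting rule and is not stated in the foregoing to be the rule used by any particular system. The names `vote` and `tolerance` are hypothetical.

```python
from statistics import median


def vote(values, tolerance):
    """Mid-value selection with a bounds check (illustrative assumption).

    values    -- one numeric output per replicated processing system
    tolerance -- prescribed bound on deviation from the voted value

    Returns (voted_value, suspects), where suspects holds the indices of
    replicas whose output deviates from the voted value by more than
    the tolerance.
    """
    if len(values) < 3:
        raise ValueError("voting needs at least three replicas")
    # The median of three or more values is unaffected by one wild value.
    voted = median(values)
    suspects = [i for i, v in enumerate(values) if abs(v - voted) > tolerance]
    return voted, suspects
```

For example, `vote([10.0, 10.2, 57.3], 0.5)` selects the middle value and flags only the third replica, so a single faulty output is masked. The bound is meaningful only if the replicas sampled their inputs at nearly the same time, which is why synchronization is required.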
Instruction synchronism is a widely used synchronization technique. One way of achieving such synchronization is to employ phase locked loops for causing the real-time clocks of the processing systems to operate in lockstep synchronism. This poses problems during initialization and when dealing with asynchronous inputs. Moreover, lockstep synchronism schemes are often prey to single-point failures, pose difficulties in detecting latent faults, and are difficult to extend to more than three or four replicated processing systems. In addition, there exist certain failure modes which can cause the entire system to fail. Frame synchronization techniques differ in that the processing systems are synchronized only periodically, at some predetermined frame interval, and are permitted to run asynchronously between synchronizations. Frame synchronization simplifies cold start initialization and the implementation of warm starts, which are necessary to enable systems to be taken off-line for maintenance and subsequently brought back on-line and synchronized with the other systems.
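The general idea of frame synchronization may be illustrated by the following minimal simulation. It is a sketch under stated assumptions and does not represent the method of the present invention: each replica is assumed to run on its own oscillator with a fixed fractional drift, and at each frame boundary all replicas are assumed to adopt the median clock value. The names `simulate`, `drift_rates` and `frame_ticks` are hypothetical.

```python
from statistics import median


def simulate(drift_rates, frame_ticks, frames):
    """Return the worst clock skew observed just before each frame boundary.

    drift_rates -- fractional oscillator error of each replica (assumed fixed)
    frame_ticks -- nominal length of one frame in clock ticks
    frames      -- number of frames to simulate
    """
    clocks = [0.0] * len(drift_rates)
    worst_skews = []
    for _ in range(frames):
        # Between boundaries each replica runs freely on its own clock.
        clocks = [c + frame_ticks * (1.0 + d)
                  for c, d in zip(clocks, drift_rates)]
        worst_skews.append(max(clocks) - min(clocks))
        # At the boundary every replica adopts the median clock value
        # (an illustrative agreement rule, assumed for this sketch).
        agreed = median(clocks)
        clocks = [agreed] * len(clocks)
    return worst_skews
```

Under these assumptions the skew is reset at every boundary, so it remains close to the frame length multiplied by the spread of the drift rates instead of accumulating from frame to frame. The replicas need not agree instruction by instruction within a frame.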
Known frame synchronization methods and apparatus have several disadvantages, including their relative complexity and their susceptibility to certain types of faults, such as "stuck-at" faults. It is desirable to provide frame synchronization methods and apparatus which avoid such difficulties, and it is to this end that the present invention is directed.