The present invention generally relates to the synchronized operation of circuits. More particularly, the invention is directed to the detection of synchronization errors in lock step operated circuits having multiple defined states.
Electronic systems, and in particular computer systems, have evolved to the level where multiple processors operate concurrently on wide and high speed data and address paths. Such systems routinely require that multiple circuits, typically residing on individual integrated circuit chips, maintain a lock step level of synchronized operation through multiple states in clocked succession. Though such systems are designed to maintain synchronization, it is difficult to detect when the circuits diverge. The loss in the lock step ordered operation may be a consequence of device defects, instruction or data errors, or noise induced aberrations in the signals being transmitted within the circuits.
Contemporary wide data path designs frequently partition the data flow into multiple parallel paths using integrated circuit components with matching circuits operable in lock step synchronism with one another. The control structure which coordinates the activities occurring within each of the individual circuits is partitioned analogously to the data flow. In that context, each control partition is given equivalent information for sequencing through the lock step defined states in clocked synchronism with the others. Typically, the control partitions respond to the lock step synchronism signals by executing locally defined finite state machine steps. If all the circuits operate properly, each circuit with its associated finite state machine control partition sequences through the same or analogous states as the peer circuits. Though this distributed form of processing data flow provides a means by which the computer system processor and data path count may be increased with relative ease, the architecture is particularly vulnerable to partition errors.
If an error occurs either in the transmission of sequence related signals or as a consequence of a circuit defect, then the control state machine of one circuit may be out of sequence with the peer circuits. Reliable system operation requires that such errors be detected, that the detection be accomplished quickly, and, if feasible, that the source of the error be identified.
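The divergence described above can be illustrated with a minimal sketch. The four-state machine, its state names, its signal names, and the helper functions are all hypothetical and chosen only for illustration; the sketch assumes that a corrupted sequencing signal simply leaves the affected partition in its current state.

```python
# Hypothetical 4-state control FSM; states, signals, and transitions are
# illustrative assumptions, not taken from any particular design.
TRANSITIONS = {
    ("IDLE", "start"): "FETCH",
    ("FETCH", "tick"): "TRANSFER",
    ("TRANSFER", "tick"): "DONE",
    ("DONE", "tick"): "IDLE",
}


def step(state, signal):
    """Advance one clock; an unrecognized signal leaves the state unchanged."""
    return TRANSITIONS.get((state, signal), state)


def run_partitions(signals_per_partition, initial="IDLE"):
    """Clock every partition together.

    Returns (cycle, states) at the first cycle where the partitions
    disagree, or (None, final_states) if they stay in lock step.
    """
    states = [initial] * len(signals_per_partition)
    for cycle, signals in enumerate(zip(*signals_per_partition)):
        states = [step(s, sig) for s, sig in zip(states, signals)]
        if len(set(states)) > 1:
            return cycle, states
    return None, states


good = ["start", "tick", "tick", "tick"]
bad = ["start", "tick", "noise", "tick"]  # one corrupted sequencing signal

in_step = run_partitions([good, good])
diverged = run_partitions([good, bad])
```

In this sketch a single corrupted signal on the third clock leaves one partition a state behind its peer, and nothing in the partitions themselves reveals the mismatch unless their states are compared.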
The conventional approach to validating the synchronization of multiple states in lock step operated circuits involves the use of a comparator which receives control state signals from the various circuits, compares the states, and identifies mismatches in the states of the various circuits. To be effective in comparing the multiple states in each of the multiple circuits, such a comparator requires a separate set of lines from each of the circuits, with each set being numerically adequate to represent the multiple states of that circuit. Where the circuits reside on individual chips, such lines consume chip input/output pins, a scarce commodity for such devices.
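The pin cost of such a comparator can be estimated with a short sketch. It assumes binary-encoded state lines, and it flags a mismatch against the most common reported state; both choices, along with the function names, are assumptions made for illustration, since the text above does not specify an encoding or a reference state.

```python
from collections import Counter


def state_lines_per_circuit(num_states):
    """Lines needed to report one of num_states states (binary encoding assumed)."""
    return max(1, (num_states - 1).bit_length())


def comparator_line_count(num_circuits, num_states):
    """Total lines arriving at a central comparator, one set per circuit."""
    return num_circuits * state_lines_per_circuit(num_states)


def find_mismatches(states):
    """Indices of circuits whose state differs from the most common state.

    Using the majority as the reference is an illustrative assumption.
    """
    majority, _ = Counter(states).most_common(1)[0]
    return [i for i, s in enumerate(states) if s != majority]


lines_needed = comparator_line_count(num_circuits=8, num_states=16)
outliers = find_mismatches([3, 3, 5, 3])
```

Under these assumptions, eight circuits with sixteen states each already require 32 dedicated lines into the comparator, which shows how quickly the pin budget grows with the number of circuits.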
If line/pin count is a major concern, it is possible to accomplish the function in a serial format using a scanning or polling scheme. In such practice the comparator information is obtained by scanning or polling each of the circuits individually as to the status of their multiple states. Unfortunately, with this approach the errors, as defined by a comparator output, are detected on average significantly after the state mismatches occur.
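The detection delay of a polling scheme can be sketched under a simple model. The model assumes round-robin polling of one circuit per clock cycle, and assumes that a circuit which has diverged stays detectably out of step until it is next polled; the function names and the model itself are illustrative assumptions, not a description of any particular scanning hardware.

```python
def polling_detection_latency(num_circuits, faulty_circuit, fault_cycle):
    """Cycles from the fault until the faulty circuit is next polled.

    Assumes circuit i is polled on every cycle where
    cycle % num_circuits == i (round-robin, one circuit per clock).
    """
    return (faulty_circuit - fault_cycle) % num_circuits


def average_latency(num_circuits):
    """Mean latency when the fault is equally likely at any polling phase."""
    total = sum(
        polling_detection_latency(num_circuits, 0, t)
        for t in range(num_circuits)
    )
    return total / num_circuits


worst_case = polling_detection_latency(num_circuits=8, faulty_circuit=0, fault_cycle=1)
mean_case = average_latency(8)
```

In this model the mean delay is (N - 1) / 2 clock cycles for N circuits, so the delay grows linearly with the number of circuits polled, whereas a comparator with dedicated lines from every circuit sees a mismatch in the cycle in which it occurs.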
What is needed is a system and method which detects, in a timely manner, errors in multiple control partition states of lock step operated circuits without requiring a large bus from each of the circuits to a common comparator.