This invention relates to devices and methods for stopping a processor system clock in response to errors detected in processor system units. More particularly, the invention concerns the selective stopping of the clocks of a processor partition consisting of a group of units, the integrity of whose operations can be affected by an error occurring in one of the units.
In the prior art, processors having the modular construction represented by FIG. 1 include a backplane board 10 which carries connector mechanisms (not shown) into which logic cards 12-1 through 12-n are inserted. The logic cards are interconnected by backplane wiring on the backplane board 10.
The functional layout of the circuit boards of FIG. 1 is indicated by reference numerals 12-1 and 12-2 of FIG. 2, which correspond to the identically-numbered boards of FIG. 1 and which are generally representative of the remaining boards in FIG. 1. The circuit boards of FIG. 2 include unit modules M1, M2, and M3 on circuit board 12-1, and modules M1', M2', and M3' on the board 12-2. The modules on the board 12-1 are not necessarily equivalent in function to those on the board 12-2; for example, the board 12-1 may have its modules arranged and connected to form a scalar processing configuration, while those of the board 12-2 may be arranged to form a vector processor. The modules on the boards 12-1 and 12-2 are operable only when provided with clock signals from clock generators (CG) 14-1 and 14-2, respectively. Each clock generator derives a local clock from a system oscillator signal produced by the clock oscillator 25 and distributed on signal lines 25-1 and 25-2. The generated clock signals are provided on signal lines 16-1 and 16-2, respectively. Without provision of clock signals, the modules are inoperable.
For detecting module operation errors, error indicators 18-1 and 18-2 are connected to the modules of the boards 12-1 and 12-2, respectively. The error indicators 18-1 and 18-2 are conventional in all respects, and are used to collect and forward error indications from their respective connected modules. The indicators 18-1 and 18-2 each respond to error indications from their respective connected modules by providing two error signals: one to indicate the presence of an error (IND) on signal lines 20-1 and 20-2, respectively, and the other a STOP signal on signal lines 22-1 and 22-2 to a gate circuit corresponding to the NOR gate 23. Any time an error indication is received from one of the board modules, its associated error indicator raises its STOP signal, which deactivates the output of the NOR gate.
The NOR gate output corresponds to the CLOCK GATE signal which is provided to each of the clock generators 14-1 and 14-2. The deactivated CLOCK GATE signal prevents the clock generators from generating and forwarding clock signals to their associated modules. Removal of clock signals from the modules prevents them from operating while a support processor 20 executes an error checking and correcting process in response to the error indication signal received on line 20-1 or 20-2. Assuming successful completion of the error correction process, the support processor resets the STOP and IND signals, causing the CLOCK GATE signal to activate. As a result, the clock generators 14-1 and 14-2 are once again enabled to provide clock signals to operate the modules.
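The gating behavior described above can be illustrated with a brief software model. The following Python sketch is illustrative only and does not form part of the prior-art apparatus; the function names clock_gate and clocks_enabled are hypothetical, and each STOP line is assumed to be modeled as a boolean that is True when raised.

```python
def clock_gate(stop_signals):
    """Model of NOR gate 23: the CLOCK GATE output is active (True)
    only while no error indicator has raised its STOP signal."""
    return not any(stop_signals)


def clocks_enabled(stop_signals, n_generators):
    """Every clock generator receives the same CLOCK GATE signal,
    so all generators are enabled or disabled together."""
    gate = clock_gate(stop_signals)
    return [gate] * n_generators


# No STOP raised: both clock generators run.
print(clocks_enabled([False, False], 2))
# STOP raised by the first board's indicator: both generators stop.
print(clocks_enabled([True, False], 2))
# Support processor resets STOP: both generators resume.
print(clocks_enabled([False, False], 2))
```

In this model a single raised STOP line disables every clock generator, which reflects the all-or-nothing character of the arrangement of FIG. 2.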
The clock stopping apparatus and procedure of FIG. 2 is implemented for all of the cards in the processor of FIG. 1. Thus, whenever an error is detected on one of the cards, the clocks on all of the cards are stopped until the error correcting procedure of the support processor 20 is completed. It will be appreciated that, while an error occurring in the functional operation of one card may propagate to, or cause errors in, other cards, it is not necessarily true that such an error will propagate to all of the other cards. Significant portions of the processor may remain operable even while other portions are functioning erroneously. Further, an error in the operation of one card may affect as few as one other card. For example, an error in a vector processor execution may affect the operation only of an associated scalar processor. However, failure of a memory card may affect a great number of other cards which conduct processes requiring access to the memory card.
A conditioned response to the detection of error in a processor with the modular construction of FIG. 1 would support the continued, although degraded, operation of the processor in the face of detection of error in one of its cards. An appropriate response would permit the interruption of operation only of an error-producing unit, as well as other units whose operations are affected by this unit, without interrupting the operations of other, unaffected units.
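The selective response outlined above may be pictured, purely by way of illustration, as computing the set of units to be stopped from a table of which units affect which. The Python sketch below is a hypothetical model and not a description of any particular apparatus; the function name units_to_stop, the table affects, the unit names, and the assumption that effects propagate transitively are all introduced here for illustration.

```python
def units_to_stop(failed, affects):
    """Return the failing unit together with every unit whose operation
    it can affect, following the (assumed) `affects` table transitively."""
    stopped, pending = set(), [failed]
    while pending:
        unit = pending.pop()
        if unit not in stopped:
            stopped.add(unit)
            pending.extend(affects.get(unit, ()))
    return stopped


# Hypothetical table: a vector processor error affects only its scalar
# processor, while a memory card failure affects many other cards.
affects = {
    "vector": ["scalar"],
    "memory": ["scalar", "vector", "io"],
}

print(sorted(units_to_stop("vector", affects)))
print(sorted(units_to_stop("memory", affects)))
```

Under these assumptions, an error in the vector unit stops only the vector and scalar units and leaves the memory and I/O units running, whereas a memory failure stops every unit that depends on the memory.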