1. Field of the Invention
This invention relates to a method and apparatus for increasing the mean time between failures for computer systems. In particular, a technique for allowing one signal line to be used as a spare for many signal lines is disclosed.
2. Background Information
Spare signal lines are often used to improve the availability of systems (computers, signal processors, etc.) in both military and commercial applications. Some military avionic designs use complete replication of critical signal lines. This approach, while minimizing fault isolation requirements, causes severe connector pin and backplane wiring problems as the number of signal lines is increased to provide adequate bandwidth for new high-performance processors. For example, a 16-bit bus having 29 signal lines requires a total of 58 lines when a full complement of spares is used. A 32-bit bus requires at least 58 signal lines, for a total of 116 lines when using this same approach. Unfortunately, most applications do not have the space available for full replication of signal lines.
An alternate approach provides one spare line for a relatively large selected set of signal lines. For example, one spare line can be provided for 39 lines between a memory and a processor. If one of the 39 signal lines fails, the bit normally assigned to the failed line is switched to the spare line, permitting normal operation to resume. The primary difficulty when using one spare line to effectively replace any one of several failed lines is the performance penalty for re-routing the signal from the failed line. FIG. 1 illustrates the prior art single spare line switching technique applied to 16 signal lines (Lines 1-16). A failed signal line is bypassed by selecting the respective input signal (Sig A-Sig P) and passing it over the spare signal line 20.
Four of the sixteen signal lines are fed into each of four multiplexers 22, 24, 26, 28. Select signals S0 and S1, from a register (not shown) connected to error detection logic, control which one, if any, of the four signals is outputted by each of the four multiplexers 22, 24, 26, 28 when a line failure is detected. The output of each of these multiplexers is inputted to another 4:1 multiplexer 30. Select signals S2 and S3 from the register determine which one of the four inputs is fed through to the spare line 20.
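The two-level selection described above can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 1: the function names `mux4` and `route_to_spare` are hypothetical, lines are numbered from 0 rather than 1, and the mapping of the failed line number onto the select bits (low two bits to S1, S0; high two bits to S3, S2) is an assumption, since the text does not state how the register encodes the failure.

```python
def mux4(inputs, sel):
    """Model of a 4:1 multiplexer: pass the input chosen by a 2-bit select."""
    return inputs[sel]


def route_to_spare(signals, failed_line):
    """Return the value placed on the spare line for a failed line (0..15).

    Assumed encoding: S1,S0 = low two bits, S3,S2 = high two bits.
    """
    if len(signals) != 16 or not 0 <= failed_line < 16:
        raise ValueError("expected 16 signals and a line number in 0..15")
    s1_s0 = failed_line & 0b11         # position within a group of four
    s3_s2 = (failed_line >> 2) & 0b11  # which first-level multiplexer
    # First level: all four multiplexers share S1,S0.
    first_level = [mux4(signals[g * 4:g * 4 + 4], s1_s0) for g in range(4)]
    # Second level: one multiplexer driven by S3,S2 feeds the spare line.
    return mux4(first_level, s3_s2)


signals = [f"Sig {c}" for c in "ABCDEFGHIJKLMNOP"]
print(route_to_spare(signals, 6))  # Sig G
```

Every signal passes through two multiplexer stages before reaching the spare line, which is the source of the added delay on the transmit side.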
At the output of each of the lines 1-16 is a 2:1 multiplexer 32 (only 2 of the 16 are shown). Select signals Sel 0-Sel 15 provided by a 4:16 decoder (not shown) switch the appropriate multiplexer to permit the signal on the spare line 20 to be outputted in place of the failed signal line.
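The receive-side substitution can be modeled in the same spirit. The sketch below is a simplified assumption-laden illustration: `decode_4_to_16` and `receive` are hypothetical names, lines are numbered from 0, and the `enable` input that keeps all selects deasserted when no line has failed is an assumption, since the decoder is not shown in the figure.

```python
def decode_4_to_16(code, enable=True):
    """Model of a 4:16 decoder: one-hot select lines Sel 0..Sel 15.

    The enable input is an assumption; with enable False no select is asserted.
    """
    return [enable and i == code for i in range(16)]


def receive(line_values, spare_value, failed_line=None):
    """Apply one 2:1 multiplexer per line: take the spare where selected."""
    enabled = failed_line is not None
    selects = decode_4_to_16(failed_line if enabled else 0, enable=enabled)
    return [spare_value if sel else value
            for value, sel in zip(line_values, selects)]


# Line 5 is stuck at 0 although a 1 was sent; the spare carries the correct bit.
received = [1] * 16
received[5] = 0
print(receive(received, spare_value=1, failed_line=5))
```

Note that the spare value is an input to all sixteen 2:1 multiplexers, which corresponds to the spare line having to reach every signal line on the receive side.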
As the ratio of signal lines to spare lines increases, the number of levels of multiplexers required to route data intended for the failed line to the spare line increases. The additional delay imposed by multiple levels of multiplexers becomes a part of the critical path and must be accommodated in the machine cycle time. With 4:1 multiplexers, three levels are required to provide a spare for a 58-line bus. In typical VLSI implementations, the multiplexer tree creates wire blockages and long wire lengths that add to the propagation delay. Furthermore, the spare line must be distributed across all the signal lines on the receive side, creating a long line with extra delays.
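The level count follows from the fan-in of the multiplexers: k levels of 4:1 multiplexers can select among at most 4^k lines. A short sketch (the function name `mux_levels` is hypothetical) computes the smallest sufficient k using integer arithmetic:

```python
def mux_levels(num_lines, fan_in=4):
    """Smallest number of multiplexer levels that can select among num_lines."""
    levels, reach = 0, 1
    while reach < num_lines:
        reach *= fan_in  # each added level multiplies the selectable lines
        levels += 1
    return levels


print(mux_levels(16))  # 2, the two-level tree of FIG. 1
print(mux_levels(58))  # 3, since 4**2 = 16 < 58 <= 64 = 4**3
```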
It is therefore desirable to provide a cost-effective technique for replacing a failed line with reduced impact on the interface critical paths and less wiring congestion than prior approaches.