To enhance the reliability and availability of a computer system, including network systems, it is desirable to invoke a spare component dynamically while the system is still running. Dynamic sparing is increasingly important in IBM's current computer systems, which are designed to satisfy customers' demands for zero downtime through fault-tolerant designs having minimal service interruption. A self-healing system is therefore desirable.
In the enduring prior art, the technique still used by IBM is the one long known as hardware Triple-Modular-Redundancy (TMR)/Sparing. It relies on a voting result to recognize and locate the failure of an active logic module, and then reconfigures the system by invoking a sparing action. The technique thus combines masking-type error detection with standby-redundancy-type correction. It was described in the original IBM U.S. Pat. No. 3,665,173, issued May 23, 1972, entitled “Triple Modular Redundancy/Sparing,” invented by Willard Bouricius, William Carter, John Roth and Peter Schneider of IBM, which is incorporated herein by reference.
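By way of illustration only, the voting and sparing behavior described above can be sketched in software as follows. This is a simplified model, not the hardware design of the referenced patent. The names `vote` and `TmrWithSparing`, and the representation of a logic module as a function from an integer input to an integer output, are assumptions introduced for this sketch.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Assumption: a "logic module" is modeled as a plain function int -> int.
Module = Callable[[int], int]


def vote(outputs: List[int]) -> Tuple[int, List[int]]:
    """Return the majority value and the indices of modules that disagree."""
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        # All three outputs differ, so the error cannot be masked.
        raise RuntimeError("no majority among module outputs")
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters


class TmrWithSparing:
    """Hypothetical model: three active modules plus a pool of standby spares."""

    def __init__(self, active: List[Module], spares: List[Module]) -> None:
        if len(active) != 3:
            raise ValueError("TMR requires exactly three active modules")
        self.active = list(active)
        self.spares = list(spares)
        self.replaced = 0  # number of sparing actions invoked so far

    def step(self, x: int) -> int:
        outputs = [module(x) for module in self.active]
        # Masking-type detection: the majority output hides a single fault.
        value, dissenters = vote(outputs)
        # Standby-redundancy correction: swap a spare in for each located fault.
        for i in dissenters:
            if self.spares:
                self.active[i] = self.spares.pop(0)
                self.replaced += 1
        return value
```

In this sketch the system keeps delivering the majority output while the faulty module is replaced, so the caller sees no interruption. If no spare remains, the faulty module stays in place and the voter continues to mask it for as long as the other two modules agree.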