One of the key elements in a vector processing facility is the set of storage array chips that make up the vector registers. These chips can have relatively high failure rates, so it is desirable to be able to recover from an error in one of them. Previous schemes used either simple parity, which does not permit recovery, or traditional error correcting codes (ECC). Although ECC does allow recovery after an error, it is relatively difficult to implement, requires a significant amount of logic, and tends to impact the overall design of the vector processing facility.
The method described in the present application overcomes the above-stated difficulties by taking advantage of the much denser array chips that current array chip technology provides. This extra density may be used to hold a redundant copy of all data stored in the vector registers, so that error recovery is possible based on this redundant data. The method described herein may be used to recover from both transient errors and most types of permanent errors in array chips.
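The redundant-copy idea can be illustrated with a minimal software sketch. It is not the hardware implementation of the present application: the class name `MirroredRegister`, the helper `parity`, and the use of one parity bit per stored word to decide which copy is good are all assumptions made for illustration.

```python
def parity(word: int) -> int:
    """Return 1 if the word has an odd number of 1 bits, else 0."""
    return bin(word).count("1") & 1


class MirroredRegister:
    """Hypothetical model of a vector register kept as two identical copies."""

    def __init__(self, size: int) -> None:
        # Each element is stored as (word, parity bit) in both copies.
        self.primary = [(0, 0)] * size
        self.shadow = [(0, 0)] * size

    def write(self, index: int, word: int) -> None:
        entry = (word, parity(word))
        self.primary[index] = entry
        self.shadow[index] = entry

    def read(self, index: int) -> int:
        word, p = self.primary[index]
        if parity(word) == p:
            return word
        # Primary copy failed its parity check: fall back to the redundant copy.
        word, p = self.shadow[index]
        if parity(word) == p:
            self.primary[index] = (word, p)  # rewrite the failing copy
            return word
        raise RuntimeError("both copies failed the parity check")


reg = MirroredRegister(4)
reg.write(2, 0b1011)
# Simulate a single-bit error in the primary copy only.
reg.primary[2] = (0b1010, parity(0b1011))
recovered = reg.read(2)
```

In this sketch, rewriting the failing copy restores it after a transient error. A permanent fault would fail again on a later read, at which point the redundant copy would still supply the data.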
U.S. Pat. No. 4,326,291 of Marsh et al. discloses an error detection system in which a redundant logic unit is provided alongside a required logic unit and operates simultaneously with it. Both units produce output data which, it is desired, will be the same. The output data from the required logic unit is supplied to a data bus, and the output of the redundant logic unit is supplied to a parity check digit generator. From the data received from the redundant logic unit, the parity check digit generator generates a parity check digit, which is applied to the data bus along with the data from the required logic unit. A parity checking circuit receives the data and the parity check digit from the data bus and determines whether parity is correct. If parity is not correct, the checking circuit produces an alarm to alert the user.
There is also considerable art relating to redundancy in the chips themselves and to means in memory systems for keeping track of the good cells and the bad cells in memory chips. Examples of these are U.S. Pat. Nos. 4,376,300 of Tsang, 4,380,066 of Spencer et al., 4,688,219 and 4,768,193 of Takemae.
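The Marsh et al. arrangement described above, in which the data on the bus comes from one unit and the parity check digit is derived from the other, can be sketched as follows. This is only an illustration under assumptions: the function names `parity_digit` and `bus_check` are hypothetical, the unit outputs are modeled as plain integers, and single-bit even parity is assumed.

```python
def parity_digit(word: int) -> int:
    """Return 1 if the word has an odd number of 1 bits, else 0."""
    return bin(word).count("1") & 1


def bus_check(required_out: int, redundant_out: int) -> bool:
    """Return True if the parity check on the bus passes."""
    data_on_bus = required_out                  # data comes from the required unit
    check_digit = parity_digit(redundant_out)   # digit comes from the redundant unit
    # The checking circuit recomputes parity over the bus data and compares.
    return parity_digit(data_on_bus) == check_digit


units_agree = bus_check(0b1100, 0b1100)
one_bit_differs = bus_check(0b1100, 0b1101)
two_bits_differ = bus_check(0b1100, 0b1111)
```

Under these assumptions the check passes when the two units agree and fails when their outputs differ in an odd number of bits. A difference in an even number of bits leaves the parity unchanged and would go unnoticed, and in every case the scheme only detects the error and raises an alarm; it does not recover the data.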