1. Field of the Invention
This invention relates to multiprocessing systems and, more particularly, to means for interconnecting data processor modules with memory modules, including means for handling faulty units such that the system can continue to operate in the presence of permanent errors.
2. Description of the Prior Art
With the advent of very large-scale integrated circuit (VLSI) technology, there has arisen a corresponding need for new fault-handling methods and apparatus. With the capability of placing a very large number of circuits in increasingly smaller areas, it has become necessary to provide complete fault coverage of the circuits. This means that errors must be confined and isolated to small logic blocks. In order to take the fullest advantage of VLSI technology, it is desirable that integrated circuit chips be identical and that they provide for modular interconnections. In these types of systems, most fault occurrences are independent, and two or more faults do not usually occur simultaneously. However, since latent faults may be present in the system, means must be provided for handling a second fault that occurs in addition to a latent fault. Transient errors are the dominant type of fault occurrence.
It is further desirable that the propagation of errors between levels be minimized to prevent information overload at higher levels in the system structure. The detection and recovery mechanisms should address every level of the system to provide a complete solution to the handling of system failures. This allows the fault-handling apparatus to be distributed, since each level need address only the set of faults that can be generated at that level.
It is therefore a primary object of this invention to provide a modular, distributed-function, fault-handling mechanism for the detection, reporting, and recovery of transient, latent, and permanent errors.