This invention relates to a fault-tolerant computer system.
Traditional approaches to system reliability attempt to prevent faults from occurring through improved design methodologies, strict quality control, and other measures designed to shield system components from external environmental effects (e.g., hardening, radiation shielding). Fault tolerance methodologies, by contrast, assume that system faults will occur and aim to design systems that continue to operate in the presence of such faults. In other words, fault-tolerant systems are designed to tolerate undesired changes in their internal structure or their external environment without system failure. Fault-tolerant systems use a variety of schemes to achieve this goal. Once a fault is detected, various combinations of structural and informational redundancy make it possible to mask it (e.g., through replication of system elements) or to correct it (e.g., by dynamic system reconfiguration or some other recovery process). By combining such fault tolerance techniques with traditional fault prevention techniques, even greater increases in overall system reliability may be realized.
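By way of illustration only, fault masking through replication of system elements may be sketched as a majority vote over the outputs of redundant units. The sketch below assumes three replicas computing the same result. The function name `majority_vote` and the sample replica outputs are hypothetical and are not taken from the system described herein.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas.

    Raises ValueError when no value has a strict majority, in which
    case voting alone cannot mask the fault.
    """
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    # most_common(1) yields the single most frequent value and its count.
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among replica outputs")
    return value


# Three hypothetical replicas; the second has suffered a fault.
replica_outputs = [42, 17, 42]
masked_result = majority_vote(replica_outputs)
```

In this arrangement a single faulty replica is outvoted by the two healthy ones, so the fault is masked without being corrected. When no majority exists, the sketch raises an error, at which point a recovery process such as reconfiguration would be needed.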