Transient faults, also referred to as “soft-errors” or “single-event upsets” (SEUs), are faults that occur intermittently rather than consistently. Generally, these faults are caused by external events such as neutron and alpha particle strikes, or by power supply and interconnect noise. Although these faults do not cause permanent damage, they may result in incorrect program execution by altering signal transfers or stored values.
Protection against soft-errors has generally been limited to high-availability systems and safety-critical applications; however, new trends in microprocessor manufacturing are pushing these faults into the spotlight. Transistors are becoming faster and smaller, with tighter noise margins, making processors more susceptible to soft-errors. Indeed, soft-errors are already changing the way the industry approaches processor design. Major customers have been lost due to server crashes caused by soft-errors, and the fear of cosmic ray strikes led an original equipment manufacturer (OEM) to protect most of the hardware logic of a recent chip design with some form of error detection.
Most modern microprocessors already incorporate mechanisms for detecting soft-errors. Memory elements, particularly caches, are protected using mechanisms such as error-correcting codes (ECC) and parity. The protection is typically focused on memory because the techniques are well understood and do not require expensive extra circuitry. Moreover, caches take up a large part of the chip area in modern microprocessors.
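As an illustration of the parity mechanism mentioned above, the following is a minimal software sketch, not a description of any particular processor's circuitry. It assumes even parity over a single stored word; the function names (`parity_bit`, `store`, `check`, `flip_bit`) are hypothetical and chosen only for this example. A single parity bit can detect an odd number of flipped bits but cannot locate or correct them, which is where ECC goes further.

```python
def parity_bit(word: int) -> int:
    """Return the even-parity bit of a non-negative integer word.

    The bit is 1 when the word holds an odd number of 1s, so that the
    word plus its parity bit always contain an even number of 1s.
    """
    return bin(word).count("1") % 2


def store(word: int):
    """Model a protected memory cell: keep the word with its parity bit."""
    return word, parity_bit(word)


def check(cell) -> bool:
    """Return True when the stored word still matches its parity bit."""
    word, parity = cell
    return parity_bit(word) == parity


def flip_bit(cell, position: int):
    """Model a soft-error: flip one bit of the stored word only."""
    word, parity = cell
    return word ^ (1 << position), parity


cell = store(0b10110010)
upset = flip_bit(cell, 3)              # single-bit upset
double = flip_bit(upset, 0)            # second upset in the same word

print(check(cell))    # unmodified word passes the check
print(check(upset))   # single-bit flip is detected
print(check(double))  # two flips cancel out and go undetected
```

The last case shows the limitation of plain parity: an even number of flips within one word leaves the parity unchanged, so the error escapes detection.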
Recent studies show that in the near future the soft-error rate in combinational logic will be comparable to that of memory elements, and protecting the entire chip, instead of only the memory elements, will be at the top of designers' to-do lists. Several works have investigated redundancy techniques to provide protection and reliability against soft-errors. Hardware-based approaches generally rely on inserting redundant hardware, such as duplicating functional units or even the entire processor.
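The idea behind duplicating a functional unit can be sketched in software as a duplicate-and-compare scheme: the same operands are fed to two copies of a unit and the results are compared before being accepted. This is only an illustrative model under stated assumptions, not an implementation from any of the works referred to above; `run_duplicated`, `adder`, `faulty_adder`, and `MismatchError` are hypothetical names, and the 32-bit adder stands in for an arbitrary functional unit.

```python
class MismatchError(Exception):
    """Raised when the two redundant copies disagree."""


def run_duplicated(unit_a, unit_b, *operands):
    """Run two copies of a functional unit and compare their results.

    A mismatch signals that a fault affected one copy. Duplication alone
    detects the fault but cannot tell which copy is correct.
    """
    result_a = unit_a(*operands)
    result_b = unit_b(*operands)
    if result_a != result_b:
        raise MismatchError(f"copies disagree: {result_a} != {result_b}")
    return result_a


def adder(a: int, b: int) -> int:
    """Model of a 32-bit adder (hypothetical functional unit)."""
    return (a + b) & 0xFFFFFFFF


def faulty_adder(a: int, b: int) -> int:
    """Same adder with one output bit flipped, modeling a transient fault."""
    return adder(a, b) ^ (1 << 5)


print(run_duplicated(adder, adder, 2, 3))  # both copies agree

try:
    run_duplicated(adder, faulty_adder, 2, 3)
except MismatchError as err:
    print("fault detected:", err)
```

Detection by comparison requires at least two copies; recovering the correct value without re-execution would need a third copy and a majority vote, at a correspondingly higher hardware cost.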