This invention relates generally to soft errors in computing storage devices, and more particularly, to circuitry to recover from soft errors in register files.
Soft errors are phenomena seen in electronic devices when an extraneous charge is introduced into the system, causing an incorrect value to be observed on a signal or in a storage element. Some sources of the extraneous charge may include alpha particle emission from radioactive decay in circuit packages, or neutron flux from cosmic rays or environmental radiation. Storage elements such as register files and memory arrays are particularly susceptible to soft errors due to the increased likelihood that this transient disturbance will be captured by the storage element, as compared to an event on a combinatorial circuit node, which will propagate to the next storage element downstream, but is less likely to be captured at just the right moment. Remedial schemes are used in computing devices to detect and correct soft errors. Error detection refers to the act of ascertaining that a disturbing event has occurred, while error correction refers to the process of reproducing the original, uncorrupted data pattern.
Parity bits are error detection codes used to detect corruption, caused by soft error events in static hardware resources, within the group of bits they monitor. A parity bit contains no information about the individual values of the bits it represents; rather, it indicates whether there is an even or odd number of “ones” in the group of bits with which it is associated. If an odd number of bits within the group (including the parity bit itself and the bits it represents) are corrupted, the parity bit will no longer represent the even-ness (or odd-ness) of the group, thereby indicating an error. If an even number of bits are corrupted, the parity is unchanged and the error goes undetected. Moreover, even when an error is detected, the original pattern of ones and zeros cannot be reconstructed from this knowledge alone.
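By way of illustration only, the behavior of a parity bit may be sketched in software as follows. The sketch assumes an even-parity convention and a hypothetical 8-bit register entry; the function names `even_parity_bit` and `parity_error` are introduced here for illustration and do not correspond to any particular circuit described herein.

```python
def even_parity_bit(data_bits):
    """Return the parity bit that makes the total number of ones even."""
    return sum(data_bits) % 2


def parity_error(data_bits, parity_bit):
    """Return True if the stored group (data plus parity) has an odd number of ones."""
    return (sum(data_bits) + parity_bit) % 2 != 0


# Hypothetical 8-bit register entry containing four ones.
word = [1, 0, 1, 1, 0, 0, 1, 0]
parity = even_parity_bit(word)

# A single-bit upset flips one data bit: the parity check fails.
single_flip = word.copy()
single_flip[3] ^= 1

# A second upset in the same entry restores the parity: the error is masked.
double_flip = single_flip.copy()
double_flip[6] ^= 1

print(parity_error(word, parity))         # uncorrupted entry
print(parity_error(single_flip, parity))  # odd number of flipped bits
print(parity_error(double_flip, parity))  # even number of flipped bits
```

In this sketch the check reports an error only for the entry with one flipped bit. The check also yields the same result regardless of which single bit was flipped, which is why the parity bit alone cannot identify the corrupted bit or restore the original value.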