The normal background radiation environment at the surface of the earth has ionizing components that sometimes affect the reliability of semiconductor integrated circuit chips, such as memory chips used in computers. If an intruding particle passes near a p-n junction in the chip, it may induce a soft error, or single-event upset, which can cause signals to change voltage and, accordingly, bits of data to change value. Excess electron-hole pairs may be generated in the wake of the penetrating particle. The electric field in the neighborhood of the p-n junction, if sufficiently strong, separates these electrons and holes before they recombine, and sweeps the excess carriers of the appropriate sign to a nearby device contact. A random signal may be registered if this collected charge exceeds a critical threshold value.
Cosmic particles in the form of neutrons or protons can collide randomly with silicon nuclei in the chip and fragment some of them, producing alpha-particles and other secondary particles, including the recoiling nucleus. These alpha-particles and other secondary particles can travel in all directions with energies that can be quite high (though necessarily less than the energy of the incoming nucleon). Alpha-particle tracks so produced can sometimes extend a hundred microns through the silicon. The track of an ionizing particle may extend from a fraction of a micron to many microns through the chip volume of interest, generating electron-hole pairs in its wake at a rate of one pair per 3.6 eV (electron-volts) of energy lost. A typical track might represent a million pairs of holes and electrons.
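The arithmetic behind these figures can be sketched as follows. This is only an illustrative calculation, not part of any described method: the 3.6 eV per pair figure comes from the text above, the elementary charge is the standard physical constant, and the function names, the `collection_efficiency` parameter and the 10 fC critical charge used in the example are hypothetical values chosen for illustration.

```python
# Illustrative sketch: convert energy deposited along a particle track
# into generated charge, and compare it with a critical charge.

EV_PER_PAIR = 3.6                      # eV lost per electron-hole pair (from the text)
ELECTRON_CHARGE_C = 1.602176634e-19    # elementary charge in coulombs


def pairs_generated(energy_loss_ev: float) -> float:
    """Number of electron-hole pairs created for a given energy loss in eV."""
    return energy_loss_ev / EV_PER_PAIR


def charge_fc(energy_loss_ev: float) -> float:
    """Charge of one carrier sign along the track, in femtocoulombs."""
    return pairs_generated(energy_loss_ev) * ELECTRON_CHARGE_C * 1e15


def causes_upset(energy_loss_ev: float,
                 critical_charge_fc: float,
                 collection_efficiency: float = 1.0) -> bool:
    """True if the collected charge exceeds the critical threshold.

    collection_efficiency is a hypothetical fraction (0..1) of the
    generated charge that actually reaches the device contact.
    """
    collected = charge_fc(energy_loss_ev) * collection_efficiency
    return collected > critical_charge_fc


if __name__ == "__main__":
    energy_ev = 3.6e6  # 3.6 MeV lost along the track
    print(f"pairs:  {pairs_generated(energy_ev):.3g}")
    print(f"charge: {charge_fc(energy_ev):.1f} fC")
    # 10 fC is an assumed critical charge, used only for illustration.
    print("upset: ", causes_upset(energy_ev, critical_charge_fc=10.0))
```

Under these assumptions, a track of a million pairs corresponds to an energy loss of about 3.6 MeV and a charge of roughly 160 fC of each sign, which is why even partial collection can exceed a small critical charge.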
Shielding devices against cosmic rays may be impractical, because stopping a cosmic ray may require tens of meters of concrete. Moreover, cosmic ray neutrons are a major contributor to soft errors. Cosmic ray induced computer crashes have occurred and are expected to increase in frequency as devices (for example, transistors) in chips decrease in size. This problem is projected to become a major limiter of computer reliability in the next decade.
The contribution of random logic in processors to the soft error rate is becoming dominant, and no cost-effective method of mitigation is known. Cache memories are already protected with ECC (error correction code), and logic arrays are relatively easy to protect with parity. Logic latches, however, are very costly to protect, as most methods replicate logic to gain redundancy. Even an order-of-magnitude improvement in reliability may require a significant penalty in performance, power, area and cost.
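As a rough illustration of why parity is a cheap form of protection, the following sketch stores one even-parity bit alongside a data word and checks it on read. It is a software model for explanation only, not a description of any particular hardware; the function names and the 8-bit example word are hypothetical.

```python
# Illustrative sketch: single-bit parity protection of a stored word.

def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") & 1


def store(word: int) -> tuple[int, int]:
    """Return the word together with its parity bit, as it would be stored."""
    return word, parity_bit(word)


def check(word: int, stored_parity: int) -> bool:
    """True if the word still matches the parity bit stored with it."""
    return parity_bit(word) == stored_parity


def flip_bit(word: int, position: int) -> int:
    """Model a single-event upset by inverting one bit of the word."""
    return word ^ (1 << position)


if __name__ == "__main__":
    word, parity = store(0b10110010)
    upset_once = flip_bit(word, 3)
    upset_twice = flip_bit(upset_once, 0)
    print(check(word, parity))         # intact word passes the check
    print(check(upset_once, parity))   # a single flipped bit is detected
    print(check(upset_twice, parity))  # two flipped bits escape detection
```

Parity of this kind only detects an odd number of flipped bits and cannot correct them, which is one reason memories typically rely on ECC instead.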
Various approaches have been suggested to eliminate or reduce the number of soft errors due to cosmic ray interactions in chips. None of these approaches is completely successful, particularly as device size continues to decrease. Another approach is to accept that some soft errors will happen and to design memory and logic circuitry to include redundancy in all calculations. This approach requires more gates, and enough spatial separation between the redundant elements that a single cosmic ray does not cause soft errors in more than one of them. It is not practical for many chips.
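One common form of such redundancy, assumed here purely for illustration since the text does not name a specific scheme, is to keep three copies of a value and take a bitwise majority vote. The sketch below shows both the benefit and the reason spatial separation matters: the vote fails if the same event corrupts two copies.

```python
# Illustrative sketch: triplicated storage with a bitwise majority vote.
# The scheme and names are assumptions for illustration only.

def majority_vote(a: int, b: int, c: int) -> int:
    """Each output bit takes the value held by at least two of the three copies."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    correct = 0b1010
    corrupted = 0b1110  # one bit flipped by an upset

    # One corrupted copy: the vote recovers the correct value.
    print(bin(majority_vote(correct, correct, corrupted)))

    # Two copies corrupted in the same bit, for example by a single
    # particle track crossing both: the vote returns the wrong value.
    print(bin(majority_vote(correct, corrupted, corrupted)))
```

The tripled storage and the added voting logic are a small-scale picture of the gate-count and area penalty described above.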