1. Field of the Invention
The present invention relates generally to data processing, and in particular to a computer implemented method, apparatus, and computer usable program code for preventing soft error accumulation in register arrays.
2. Description of the Related Art
Use of data processing systems has grown rapidly in recent years as computing devices have become more widespread. Users have come to rely on data processing systems in every aspect of business and society. With this reliance, preventing soft errors becomes increasingly important to a system's overall reliability and performance.
A soft error is an error caused by a temporary disruption of an electronic component, such as a register array. A soft error is not due to any permanent physical defect in the memory system and typically involves changes to stored data. Many soft errors are caused by radioactive decay, specifically by alpha particle emission. When an unstable isotope undergoes alpha decay, the isotope emits a positively charged alpha particle. The alpha particle may travel through an electronic component, such as semiconductor memory, and disturb the distribution of electrons in the semiconductor memory. If the disturbance is large enough, a digital signal can change from a 0 to a 1 or vice versa.
Additionally, soft errors are sometimes caused by cosmic rays. Neutrons associated with cosmic rays may produce unstable isotopes by neutron capture, and these isotopes may then decay and cause a soft error.
One standard method for protecting data stored in microprocessor register arrays from soft errors is parity protection or error correction code (ECC) protection. Whenever new data is written into a register array, parity or ECC bits are generated and stored either in the same memory array as the data or in a separate memory array. A register is a circuit that holds values, operations, or input operands for logic or arithmetic operations or for address computations. These operations are typically performed by a processor, and registers are typically located in processors. A register may hold values such as the address of an instruction being executed or data being processed. Examples of registers located in a processor core include general purpose registers, which hold operands for logic and integer computations or address calculations; floating point registers, which hold operands for floating point computations; program counter registers, which point to the locations in memory from which instructions are fetched; condition registers, which hold values used for calculating conditions for branches; and various special purpose registers, such as interrupt vector registers, machine status registers, and link registers. Registers may also be located in any other component of the computer system, such as a cache, memory controller, Input/Output controller, network adapter, or fabric logic.
For parity protected arrays, whenever data is read out of the register file, a parity bit is recalculated from the data and compared against the corresponding parity bit read out from the appropriate parity storage array. In case of a mismatch, an error is reported and the processor either takes an appropriate corrective action or check stops.
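The write and read paths described above can be illustrated with a minimal sketch. It is not taken from any particular processor design: the class name `ParityRegisterFile`, the exception `ParityError`, and the choice of one even-parity bit per register are assumptions made purely for illustration.

```python
class ParityError(Exception):
    """Raised when recalculated parity disagrees with the stored parity bit."""


def parity_bit(word: int) -> int:
    """Return 1 if the word contains an odd number of 1 bits, else 0."""
    return bin(word).count("1") & 1


class ParityRegisterFile:
    """Hypothetical register file with a separate parity storage array."""

    def __init__(self, size: int) -> None:
        self.data = [0] * size    # data array
        self.parity = [0] * size  # separate parity storage array

    def write(self, index: int, value: int) -> None:
        # Parity is generated whenever new data is written.
        self.data[index] = value
        self.parity[index] = parity_bit(value)

    def read(self, index: int) -> int:
        # Parity is recalculated on every read and compared with the stored bit.
        value = self.data[index]
        if parity_bit(value) != self.parity[index]:
            raise ParityError(f"parity mismatch in register {index}")
        return value
```

In this sketch, raising `ParityError` stands in for the error report; what the processor does next (corrective action or check stop) is outside the scope of the example.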
For ECC protected arrays, whenever data is read out of the register file, the register file that stores the ECC bits supplies the ECC bits corresponding to the data item, and the read data undergoes ECC correction.
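As a rough illustration of ECC correction on read, the sketch below uses a Hamming(7,4) code, which is only one possible ECC and is not necessarily the code used in any given register file. The 4-bit data width, the function names `ecc_bits` and `read_corrected`, and the bit layout are all assumptions chosen to keep the example small.

```python
# Hypothetical layout: a 4-bit data item whose bits occupy codeword
# positions 3, 5, 6 and 7 of a Hamming(7,4) code. Positions 1, 2 and 4
# hold the check bits, which are kept in a separate array.
DATA_POSITIONS = (3, 5, 6, 7)


def ecc_bits(data: int) -> int:
    """Return the 3 check bits for a 4-bit data item.

    The result is the XOR of the codeword positions of all set data bits.
    """
    check = 0
    for bit_index, position in enumerate(DATA_POSITIONS):
        if (data >> bit_index) & 1:
            check ^= position
    return check


def read_corrected(data: int, stored_ecc: int) -> int:
    """Return the data item after single-bit error correction."""
    syndrome = ecc_bits(data) ^ stored_ecc
    if syndrome in DATA_POSITIONS:
        # The syndrome names the codeword position of the flipped data bit.
        data ^= 1 << DATA_POSITIONS.index(syndrome)
    # A syndrome of 1, 2 or 4 means a check bit flipped; the data is intact.
    return data
```

With this code, any single flipped bit in the data or in the stored check bits is tolerated. Two flipped bits in the same item are beyond what this particular code can correct.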
A common problem with these error protection mechanisms is that only a limited number of bit flips in any data item can be detected or corrected. For example, in the case of a parity protected data item, any even number of bit flips leaves the parity value unchanged and therefore goes undetected in some systems. In other systems, even if the even number of bit flips is detected, the error may not be correctable. This problem is referred to hereinafter as soft error accumulation. Soft error accumulation often results in data errors, such as computational errors and application failures, and may require a reboot of the data processing system.
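The parity limitation can be checked with a few lines of code. The sketch below assumes a single parity bit per data item; the helper name `parity_detects` and the sample value are hypothetical.

```python
def parity_bit(word: int) -> int:
    """Return 1 if the word contains an odd number of 1 bits, else 0."""
    return bin(word).count("1") & 1


def parity_detects(original: int, flip_mask: int) -> bool:
    """Return True if a parity check notices the flips given by flip_mask.

    Each 1 bit in flip_mask represents one soft error in the stored word.
    """
    corrupted = original ^ flip_mask
    return parity_bit(corrupted) != parity_bit(original)


value = 0b10110010
single_flip_seen = parity_detects(value, 0b00000001)  # one bit flipped
double_flip_seen = parity_detects(value, 0b00000101)  # two bits flipped
```

Here `single_flip_seen` is `True` and `double_flip_seen` is `False`: the second flip cancels the parity change caused by the first, so the corrupted word passes the check.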