Microprocessors rely on instructions and data, which are stored in various types of memory, for operation. The integrity of the instructions and data is imperative for proper operation. An error in a single bit of the instructions or data may lead to operation errors ranging from an errant function to a complete malfunction. Accordingly, efforts are often taken to ensure the integrity of the data by checking for errors before the data is used. A common error detection technique employs a parity bit for every x bits. For example, each byte of data may be associated with a parity bit. The value of the parity bit is calculated based on the values of the bits in the byte and is stored in association with the byte. Assuming single bytes may be written at any given time, each time the byte is written, a new parity value is calculated and stored in the parity bit. Each time the byte is read, the values of the bits in the byte are used to generate a new parity value, which is then compared to the parity value stored in the parity bit. If the new parity value and the parity value stored in the parity bit do not match, at least one of the values of the bits in the byte is in error.
Such parity schemes are relatively easy and efficient to implement, as the extra memory and processing resources required to implement parity checking are relatively low. Unfortunately, basic parity schemes only allow for the detection of errors and do not allow for the correction of errors. As such, parity schemes are generally used for data that is normally stored in multiple locations, such that when an error is detected for data in one location, an error-free version of the data can be obtained from another location.
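By way of illustration only, the per-byte parity scheme described above may be sketched as follows. The sketch assumes even parity and a byte-addressable memory; the names `parity_bit` and `ParityMemory` are hypothetical and are not taken from any particular implementation.

```python
def parity_bit(byte: int) -> int:
    """Even parity (assumed): 1 if the byte holds an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") & 1


class ParityMemory:
    """Hypothetical byte-wide memory storing one parity bit per byte."""

    def __init__(self, size: int) -> None:
        self.data = [0] * size
        self.parity = [0] * size

    def write(self, addr: int, byte: int) -> None:
        # Every write recalculates and stores the parity bit for the byte.
        self.data[addr] = byte & 0xFF
        self.parity[addr] = parity_bit(byte)

    def read(self, addr: int) -> int:
        # Every read regenerates parity and compares it to the stored bit.
        byte = self.data[addr]
        if parity_bit(byte) != self.parity[addr]:
            raise ValueError(f"parity error at address {addr}")
        return byte
```

In this sketch a single flipped bit causes `read` to raise an error, but nothing identifies which bit is wrong, so the byte cannot be repaired from the parity bit alone. An even number of flipped bits within the same byte leaves the parity unchanged and passes the check.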
When errors in data must be corrected, error detection and correction (EDAC) techniques may be employed. EDAC techniques generally employ error correction codes (ECC), such as the Hamming or Hsiao codes described in M. Hsiao, “A Class of Optimal Minimum Odd-weight-column SEC-DED Codes”, IBM J. Res. Dev., 14, pp. 395-401, July 1970, and C. Chen et al., “Error-Correcting Codes for Semiconductor Memory Applications: A State-of-the-Art Review”, IBM J. Res. Dev., 28, 2, pp. 124-134, March 1984, which are incorporated herein by reference in their entireties. In an ECC scheme, values for r check bits are calculated and stored in association with every bundle of z data bits during write operations. For read operations, the check bits generally allow the detection of single- or double-bit errors and the correction of single-bit errors. An exemplary read operation may flow as follows. Initially, the values of the data bits are read and used to calculate new check bit values, which are logically exclusive-ORed with the corresponding values stored in the check bits to generate an r-bit syndrome. If the values of the syndrome are all logic 0s, the data is valid. If any value of the syndrome is a logic 1, the data or the stored check bits are corrupted. For a single-bit error, the pattern of the syndrome identifies the bit to be corrected; a double-bit error can be detected but not corrected.
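By way of illustration only, the exemplary read operation may be sketched as follows for a 32-bit bundle with seven check bits. The sketch assumes an extended Hamming code (six Hamming check bits plus one overall parity bit) rather than the Hsiao construction cited above, and the names `check_bits` and `read_word` are hypothetical.

```python
# Codeword positions 1..38: the six power-of-two positions hold Hamming check
# bits; the remaining 32 positions hold the data bits (assumed layout).
DATA_POSITIONS = [p for p in range(1, 39) if p & (p - 1)]


def popcount(value: int) -> int:
    return bin(value).count("1")


def check_bits(data: int) -> int:
    """Return 7 check bits: 6 Hamming bits (bits 0-5) and overall parity (bit 6)."""
    hamming = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            hamming ^= pos  # each set data bit contributes its position number
    overall = (popcount(data) + popcount(hamming)) & 1
    return (overall << 6) | hamming


def read_word(data: int, stored_check: int) -> tuple[int, str]:
    """Return (data, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    # Recalculate the check bits and XOR with the stored ones: the syndrome.
    syndrome = check_bits(data) ^ stored_check
    if syndrome == 0:
        return data, "ok"
    if popcount(syndrome) % 2 == 0:
        return data, "uncorrectable"  # even-weight syndrome: double-bit error
    position = syndrome & 0x3F
    if position in DATA_POSITIONS:
        # Single data-bit error: flip the bit stored at that codeword position.
        return data ^ (1 << DATA_POSITIONS.index(position)), "corrected"
    if popcount(syndrome) == 1:
        return data, "corrected"  # the flipped bit was a check bit; data intact
    return data, "uncorrectable"
```

With this layout, every single-bit error yields a syndrome containing an odd number of 1s and every double-bit error yields a nonzero syndrome containing an even number of 1s, which is how the sketch separates the correctable case from the detect-only case.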
An example of data and check bit storage is provided in FIG. 1. An N×32 bit array 12 is illustrated where each bundle of 32 bits represents a row of data, which is associated with the required seven check bits. As illustrated, seven check bits for every 32 data bits amounts to roughly 22% additional storage, a significant penalty in the memory required to implement this EDAC scheme. A processing penalty is also imposed, as all of the check bits for a bundle must be calculated each time the bundle is written. Partial, or narrow, writes require essentially the same amount of processing as a complete write. A partial write is one in which only a portion of the bundle is rewritten, such as when only an eight-bit byte is written within the 32-bit bundle. For a partial write, the entire 32-bit bundle is read. The eight-bit byte to be written is inserted into the bundle, and updated check bit values are calculated based on the entire bundle. The updated check bit values are then stored in the associated check bits.
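By way of illustration only, the partial write sequence described above may be sketched as a read-modify-write operation. The sketch assumes the same extended Hamming check bit calculation as a stand-in for whatever ECC is actually used; the names `EccMemory`, `write_bundle`, and `write_byte`, and the numbering of byte lanes from the least significant byte, are hypothetical.

```python
# Assumed layout: 32 data bits occupy the non-power-of-two positions 1..38.
DATA_POSITIONS = [p for p in range(1, 39) if p & (p - 1)]


def check_bits(bundle: int) -> int:
    """Seven check bits for a 32-bit bundle (extended Hamming, assumed)."""
    hamming = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (bundle >> i) & 1:
            hamming ^= pos
    overall = (bin(bundle).count("1") + bin(hamming).count("1")) & 1
    return (overall << 6) | hamming


class EccMemory:
    """Hypothetical N x 32 array with seven check bits stored per row."""

    def __init__(self, rows: int) -> None:
        self.data = [0] * rows
        self.check = [check_bits(0)] * rows

    def write_bundle(self, row: int, bundle: int) -> None:
        # Complete write: check bits are calculated over all 32 data bits.
        self.data[row] = bundle & 0xFFFFFFFF
        self.check[row] = check_bits(self.data[row])

    def write_byte(self, row: int, lane: int, byte: int) -> None:
        # Partial write: read the entire bundle, insert the new byte, then
        # recalculate the check bits over the entire bundle.
        shift = 8 * lane
        bundle = self.data[row]
        bundle = (bundle & ~(0xFF << shift)) | ((byte & 0xFF) << shift)
        self.write_bundle(row, bundle)
```

In this sketch `write_byte` ends by calling `write_bundle`, so writing a single byte costs one extra read plus the same check bit calculation as writing all 32 bits.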
Although conventional EDAC schemes are very effective, the processing and area penalties associated with their use limit their application. As such, usually only larger memories and second level (Level 2) caches in high performance processor applications employ EDAC schemes. These penalties generally preclude the use of EDAC schemes in register files, and limit the application of EDAC schemes in first level (Level 1) caches. Consequently, many processor applications are forced to rely on parity schemes, and if EDAC schemes are employed, they are only used for memories where processor overhead, access speeds, and overall size are of limited importance.
Bit errors in memory may be attributed to any number of causes. One of the more difficult causes to guard against is radiation. Electronic circuits, and in particular memory circuits, are vulnerable to high-energy sub-atomic particles and to electromagnetic radiation. Many high-altitude flight, outer space, military, nuclear, and various commercial applications require that such vulnerability be reduced to an acceptable level. Employing techniques to reduce the vulnerability of these circuits to the effects of radiation is generally referred to as radiation hardening. For the most part, radiation hardening involves employing special circuit designs, special circuit layouts, select materials, or any combination thereof to increase the robustness of the circuit. Radiation effects that may impact the long-term functionality of the circuit are referred to as total ionizing dose (TID) effects and are often mitigated using special semiconductor material and layer organization techniques.
Another type of radiation effect that is becoming more of an issue is referred to as a single event effect (SEE). When a high-energy particle passes through a semiconductor providing the electronic circuit, excess charge may be left in the semiconductor along the path through which the particle passes. If excess charge is left on or near a node that is charged to a level representing a desired logic state, the excess charge may change the level of charge at the node. The change in charge level may result in the node changing from the desired logic state to another logic state.
For example, if a particle of ionizing radiation passes through a circuit node that is charged for a logic 1, excess electrons from the ionizing particle track may collect at the node and discharge the node to a charge level corresponding to a logic 0. The change in charge level may result in only a temporary transient, where the node returns to the charge level for a logic 1 and the overall output of the electronic circuit is not upset. This type of SEE is referred to as a single event transient (SET). If the change in charge level instead changes the logical state of the circuit by affecting a storage node, where the storage node does not return to the proper state, a single event upset (SEU) is said to occur. Generally, SETs and SEUs are temporary, unlike TID effects, which are long-term radiation effects.
EDAC schemes may be employed to detect and correct certain bit errors caused by radiation effects. However, such schemes are limited in the number of bit errors that can be corrected for a given word or portion of a word. Without adequate radiation hardening, a radiation event may result in more bit errors than the EDAC scheme is capable of detecting, let alone correcting. Accordingly, there is a need for an EDAC scheme that is significantly more efficient than conventional EDAC schemes. For environments subjected to radiation, there is also a need for an efficient and effective EDAC scheme that is conducive to radiation hardening. There is a further need for an EDAC scheme that can be efficiently applied in first level caches, register files, and other memories used in high performance applications, especially memories that must support finer granularity writes, e.g., byte-width writes.