1. Field
The disclosed embodiments generally relate to error detection and correction mechanisms in computer memories. More specifically, the disclosed embodiments relate to a technique that facilitates error detection and error correction after a failure of a memory component in a computer system.
2. Related Art
Computer systems routinely employ error-detecting and error-correcting codes to detect and/or correct various data errors which can be caused, for example, by noisy communication channels and unreliable storage media. Some error codes, such as SECDED Hamming codes, can be used to correct single-bit errors and detect double-bit errors. Other codes, which are based on Galois fields, can be used to correct a special class of multi-bit errors caused by a failure of an entire memory component. For example, see U.S. Pat. No. 7,188,296, entitled “ECC for Component Failures Using Galois Fields,” by inventor Robert E. Cypher, filed 30 Oct. 2003 and issued on 6 Mar. 2007 (referred to as “the '296 patent”), which is incorporated herein by reference.
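By way of illustration only, the single-error-correcting, double-error-detecting (SECDED) behavior mentioned above can be sketched with a small extended Hamming code that protects four data bits with four check bits. This is a textbook construction, not the code of the '296 patent or of any disclosed embodiment; the function names `secded_encode` and `secded_decode` and the bit layout are assumptions chosen for the example.

```python
def secded_encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity; 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) & 1             # makes the total number of 1s even
    return code


def secded_decode(code):
    """Return (status, nibble); nibble is None when a double-bit error is detected."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions holding a 1
    parity_odd = sum(code) & 1
    if parity_odd:
        # Odd overall parity: exactly one bit flipped. The syndrome names it
        # (a syndrome of 0 means the overall parity bit itself was flipped).
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome:
        # Even overall parity but nonzero syndrome: two bits flipped.
        return "double-bit error detected", None
    else:
        status = "ok"
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return status, nibble
```

A code of this kind corrects any one flipped bit and flags any two flipped bits, but it cannot correct the many simultaneous bit errors that result when an entire memory component fails, which is the case addressed by the Galois-field-based codes.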
The operations described in the '296 patent are equivalent to operations on a class of polynomials with binary coefficients (referred to as “GF(2) polynomials”) modulo a GF(2) polynomial that is irreducible (referred to as an “irreducible GF(2) polynomial”). However, the requirement that this polynomial be irreducible significantly limits the set of polynomials that can be used as the generating polynomial for a code. This limitation makes it harder to guarantee a minimum Hamming distance for the code, and the lack of a guaranteed minimum Hamming distance can make the code less useful for detecting and correcting errors in memory components.
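By way of illustration only, arithmetic on GF(2) polynomials modulo another GF(2) polynomial can be sketched as follows, with each polynomial stored as an integer whose bit i is the coefficient of x^i. The helper names `gf2_mul`, `gf2_mod`, and `gf2_is_irreducible` are assumptions made for this example and do not appear in the '296 patent; the irreducibility test uses simple trial division and is suitable only for small degrees.

```python
def gf2_mul(a, b):
    """Multiply two GF(2) polynomials (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition of GF(2) polynomials is bitwise XOR
        a <<= 1
        b >>= 1
    return result


def gf2_mod(a, m):
    """Reduce GF(2) polynomial a modulo the nonzero GF(2) polynomial m."""
    m_bits = m.bit_length()
    while a.bit_length() >= m_bits:
        a ^= m << (a.bit_length() - m_bits)   # cancel the leading term of a
    return a


def gf2_is_irreducible(m):
    """Trial division by every polynomial of degree 1 through deg(m) // 2."""
    deg = m.bit_length() - 1
    if deg < 1:
        return False
    for d in range(2, 1 << (deg // 2 + 1)):
        if gf2_mod(m, d) == 0:
            return False
    return True


# Multiplication modulo the irreducible polynomial x^4 + x + 1 (0b10011):
# x * x^3 = x^4, which reduces to x + 1.
product = gf2_mod(gf2_mul(0b0010, 0b1000), 0b10011)

# Of the 16 GF(2) polynomials of degree 4, count how many are irreducible.
irreducible_degree_4 = [m for m in range(16, 32) if gf2_is_irreducible(m)]
```

In this sketch only three of the sixteen degree-4 polynomials (x^4 + x + 1, x^4 + x^3 + 1, and x^4 + x^3 + x^2 + x + 1) pass the irreducibility test, which gives a small-scale view of how restrictive the irreducibility requirement is.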
Hence, what is needed is a method and an apparatus for detecting and correcting errors in a memory component without the shortcomings of existing techniques.