In a digital system, data are usually stored in a digital memory in the form of binary values called bits. Errors can appear in the stored data and can be transient or permanent, as explained below. If these errors are not corrected or masked, they can generate operating errors and finally the failure of the system.
Transient errors are produced by interference with the environment or are due to the intrinsic features of memories produced using nanoscale technology.
Permanent errors are the consequence of defects in the physical structure of the circuits. These defects appear during the production of the circuits and/or because of aging. A high density of hardware defects in a memory system translates into a large number of permanent errors.
In order to guarantee an acceptable level of integrity of the stored data and/or to increase efficiency of production, certain electronic systems use codes, usually denoted by the acronym ECC standing for “Error Correcting Codes” or EDAC standing for “Error Detection And Correction” codes.
In digital memories benefiting from protection of the ECC type, the data are encoded upon being written into the memory. Upon the encoding of data with an ECC code, verification bits, also called redundancy bits, are added to the data bits in order to form code words.
The code words of a linear correcting code are defined using a parity-check matrix H. A binary vector V is a code word if and only if its product with the matrix H is the zero vector.
Upon reading the data present in a memory, each linear code word V is verified by evaluating the matrix-vector product HV, computed modulo 2. The result of this operation is a vector called the syndrome. If the syndrome is the zero vector, the code word is considered correct. A non-zero syndrome indicates the presence of at least one error. If the syndrome makes it possible to identify the positions of the affected bits, the code word is corrected.
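By way of illustration, the syndrome check described above can be sketched in Python for the classical (7,4) Hamming code. The matrix, the function names and the example word are chosen here for illustration only, and the sketch assumes a code in which every single error produces a distinct non-zero syndrome equal to one column of H.

```python
# Parity-check matrix of the (7,4) Hamming code (illustrative choice):
# column j (1-based) is the binary representation of j.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def syndrome(H, v):
    """Matrix-vector product H.v computed modulo 2."""
    return [sum(h * b for h, b in zip(row, v)) % 2 for row in H]

def correct_single_error(H, v):
    """Return v with a single-bit error corrected, if the syndrome locates one."""
    s = syndrome(H, v)
    if not any(s):
        return list(v)              # zero syndrome: word considered correct
    columns = [list(col) for col in zip(*H)]
    if s not in columns:
        raise ValueError("syndrome does not identify a single-bit error")
    fixed = list(v)
    fixed[columns.index(s)] ^= 1    # flip the bit whose column matches the syndrome
    return fixed

codeword = [1, 1, 1, 0, 0, 0, 0]    # satisfies H.v = 0
received = list(codeword)
received[4] ^= 1                    # inject a single error on the fifth bit

print(syndrome(H, received))                 # [1, 0, 1]
print(correct_single_error(H, received))     # [1, 1, 1, 0, 0, 0, 0]
```

The syndrome [1, 0, 1] equals the fifth column of H, which is how the position of the erroneous bit is identified.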
Various linear ECC codes can be employed with various error detecting and correcting capacities. By way of example, Hamming encoding makes it possible to correct a single error, i.e. an error that affects only one bit. This correction capacity is known as SEC, an acronym for the expression “Single Error Correction”.
Another example of an ECC code is the DEC code, an acronym for the expression “Double Error Correction”. A DEC code allows the correction of double errors, i.e. errors affecting two bits in a code word. Of course, the codes of this family are also capable of correcting a single error.
The correction capacity theoretically attainable with commonly used ECC codes is rarely fully exploited. This is due to the fact that the number N of data bits in a code word stored in a memory is usually a power of 2 or, more generally, a multiple of a power of 2. In other words, $N = k \cdot 2^{n}$, in which expression n and k are integers.
By way of illustration, if a SEC code is used, the number c of verification bits must satisfy the following condition: $2^{c} - 1 \geq N + c$  (1)
If the data words to be protected are such that N=32, then the number of verification bits required is c=6.
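Condition (1) can be checked with a short Python sketch that searches for the smallest number of verification bits satisfying it. The function name is hypothetical and the sketch only restates condition (1).

```python
def min_check_bits_sec(n_data):
    """Smallest c such that 2**c - 1 >= n_data + c, i.e. condition (1)."""
    c = 1
    while 2**c - 1 < n_data + c:
        c += 1
    return c

for n_data in (8, 16, 32, 64):
    print(n_data, min_check_bits_sec(n_data))
# 8 4
# 16 5
# 32 6
# 64 7
```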
The number Δ of syndromes left unused by a linear SEC code for correcting single errors can be determined by using the following expression: $\Delta = 2^{c} - N - c - 1$  (2)
In the case of the SEC code chosen as an example, this number of unused syndromes is Δ=25.
This number Δ of unused syndromes means that up to Δ different multiple errors can be corrected in addition to the single errors that can be corrected by the SEC code.
More generally, the number Δ of syndromes left unused by a linear ECC code that corrects a number EC of single or multiple errors can be determined using the following expression: $\Delta = 2^{c} - EC - 1$  (3)
In the case of a DEC code used to protect words comprising N=32 data bits, the use of 12 verification bits is necessary. A code word therefore has a size of 44 bits. As a consequence, a DEC code must be able to correct 44 single errors and (43×44)/2=946 double errors. A total of EC=990 single and double errors are therefore correctable, and expression (3) indicates that the unused correction capacity is $\Delta = 2^{12} - 990 - 1 = 3105$ errors.
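The figures given for the SEC and DEC examples can be reproduced numerically. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that a code correcting up to t errors reserves one syndrome for every error pattern affecting 1 to t bits of the N + c bits of the code word.

```python
from math import comb

def correctable_patterns(word_bits, t):
    """Number EC of error patterns affecting 1 to t bits of a code word."""
    return sum(comb(word_bits, w) for w in range(1, t + 1))

def unused_syndromes(n_data, c, t):
    """Expression (3): 2**c syndromes, minus EC, minus the zero syndrome."""
    return 2**c - correctable_patterns(n_data + c, t) - 1

print(correctable_patterns(44, 2))   # 990  (44 single + 946 double errors)
print(unused_syndromes(32, 6, 1))    # 25   (SEC code, N=32, c=6)
print(unused_syndromes(32, 12, 2))   # 3105 (DEC code, N=32, c=12)
```

With t=1, the count EC reduces to N + c, so expression (3) gives the same result as expression (2).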
For a given number of verification bits, the theoretically attainable correction capacity is therefore not entirely used. The remaining margin can be exploited in order to allow the correction of certain additional errors. In the description, the expression “base code” refers to a linear ECC code that is to be modified in order to improve its correction capacity. This improvement is made possible by implementing a mechanism for correcting so-called additional errors. An additional error is an error that is not corrected by the base code but is corrected once said code has been improved.
Thus, certain double errors can be corrected in the case of SEC codes and certain triple errors can be corrected in the case of DEC codes.
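The principle can be illustrated on a deliberately small example. The Python sketch below uses a hypothetical shortened Hamming code with 3 data bits and 3 verification bits, for which a single syndrome remains unused, and shows that one double error can be assigned to that syndrome. The code, the names and the choice of an adjacent pair are assumptions made for illustration and do not reproduce the method of any particular prior-art document.

```python
from itertools import combinations

# Hypothetical SEC base code: each column of H is stored as an integer
# (its 3 bits read as a binary number). 6 columns = 3 data + 3 check bits.
H_COLUMNS = [1, 2, 3, 4, 5, 6]
C = 3  # number of verification bits

def syndrome(error_positions):
    """Syndrome of an error pattern: XOR of the columns of the affected bits."""
    s = 0
    for p in error_positions:
        s ^= H_COLUMNS[p]
    return s

# Syndromes already used by the base code to correct single errors.
single = {syndrome((p,)): (p,) for p in range(len(H_COLUMNS))}

# Non-zero syndromes left unused by the base code.
unused = set(range(1, 2**C)) - set(single)

# Double errors whose syndrome falls on an unused value.
candidates = [pair for pair in combinations(range(len(H_COLUMNS)), 2)
              if syndrome(pair) in unused]

# The candidates share the same syndrome, so only one can be made
# correctable; here the pair of adjacent positions is chosen.
adjacent = [pair for pair in candidates if pair[1] - pair[0] == 1]
decoder = dict(single)
decoder[syndrome(adjacent[0])] = adjacent[0]

print(unused)      # {7}
print(candidates)  # [(0, 5), (1, 4), (2, 3)]
print(decoder[7])  # (2, 3)
```

In this example Δ = 2^3 − 3 − 3 − 1 = 1, so exactly one additional error pattern can be added to the decoding table.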
SEC codes allowing the correction of single errors as well as the correction of a restricted number of double errors have notably been the subject of several patents. By way of example, the reader is referred to patents U.S. Pat. No. 3,755,779 and U.S. Pat. No. 3,328,759, which describe coding methods in which an additional verification bit is added to the number of verification bits necessary for implementing a traditional SEC code. In these examples, double errors affecting pairs of bits of adjacent positions are corrected. The theoretically attainable correction capacity is however far from being entirely used.
DEC codes allowing the correction of single and double errors as well as the correction of a restricted number of errors affecting more than two bits have been proposed in the article by S. Shamshiri and K.-T. Cheng, “Error-locality-aware linear coding to correct multi-bit upsets in SRAM”, IEEE International Test Conference, 2010. The use of this type of code allows the correction of errors involving more than two bits of adjacent positions in the code word or of errors present in a restricted group of bits of adjacent positions. Here again, the theoretically attainable correction capacity is far from being attained.