The present invention is generally directed to the encoding and decoding of binary data so as to employ error correction coding (ECC) methods for symbol error correction and detection. Apparatus for decoding such codes is also provided and implemented in a fashion which exploits code structure to provide encoders and decoders which minimize circuit cost, particularly with respect to the number of Exclusive-OR gates needed. More particularly, an embodiment employing three check symbols is presented.
The utilization of error correction and detection codes in electronic data processing and transmission systems is becoming more and more important for several reasons. In particular, increased problem complexity and security concerns require ever increasing levels of reliability in data transmission. Furthermore, the increased use of high density, very large scale integrated (VLSI) circuit chips in memory systems has increased the potential for the occurrence of soft errors such as those induced by alpha particle background radiation effects. The increased use of integrated circuit chips has also led to memory organizations in which each output memory word is derived from multiple bit patterns received from a plurality of different circuit chips. Accordingly, it has become more desirable to be able to protect memory system integrity against the occurrence of chip failures, since such failures produce multi-bit errors which are associated with a single chip. It is thus seen that it is desirable to employ error correction coding systems which take advantage of the memory system organization itself, particularly with respect to minimizing the probability of error. For this reason, it is desirable to employ codes which are based upon symbols or bytes of data. (It is noted that, as used herein, the term “byte” does not necessarily refer to an 8-bit block of binary data but rather is more generic.)
As noted, error correcting codes have been applied to semiconductor memory systems to increase reliability, reduce service costs and to maintain data integrity. In particular, single error correcting and double error detecting (SEC-DED) codes have been successfully applied to many computer memory systems. These codes have become an integral part of the memory design for medium and large systems throughout large portions of the computer industry.
However, the error control effectiveness of an error correction code depends on how the memory chips are organized with respect to the code. In the case of single error correcting and double error detecting codes, the one bit per chip organization is the most effective design. In this organization, each bit of a code word is stored in a different chip, thus any type of failure in a chip can corrupt at most one bit of the code word. As long as the errors do not line up in the same code word, multiple errors in the memory are correctable in this scheme.
As the trend in chip design has continued toward higher and higher density, it has become more difficult to design one bit per chip types of memory organizations because of the system granularity problem. For example, the system capacity has to be at least four megabytes if one megabit chips are used to design a memory with a 32 bit data path, since 32 such chips are needed for the data bits alone. Organizing the memory with b bits per chip eases this problem. However, in a b bit per chip memory, a chip failure may result in from 1 to b bit errors, depending upon the type of failure. A failure could be a cell failure, a word line failure, a bit line failure, a partial chip failure or a total chip failure. With a maintenance strategy that allows correctable errors in the memory to accumulate, SEC-DED codes are not particularly suitable for b bit per chip memories. Since multiple bit errors are not correctable by SEC-DED codes, the uncorrectable error rates can be high if the distribution of chip failure types is skewed to those types that result in multiple bit errors.
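The granularity arithmetic in the preceding paragraph can be sketched as follows. This is only an illustrative calculation; the function name `min_capacity_bits` and its parameters are hypothetical, and check-bit chips are ignored. With a 32 bit data path and one megabit chips, a one bit per chip organization needs 32 chips (32 megabits, or 4 megabytes, of data), whereas a 4 bit per chip organization needs only 8.

```python
MEGABIT = 2 ** 20  # bits in a one megabit chip


def min_capacity_bits(data_path_bits, chip_bits, bits_per_chip=1):
    """Smallest data capacity, in bits, of a memory built from identical chips.

    Each chip supplies `bits_per_chip` bits of every word, so the data path
    needs data_path_bits / bits_per_chip chips. Check-bit chips are ignored.
    """
    if data_path_bits % bits_per_chip != 0:
        raise ValueError("data path must be a multiple of bits_per_chip")
    chips = data_path_bits // bits_per_chip
    return chips * chip_bits


one_bit_per_chip = min_capacity_bits(32, MEGABIT, bits_per_chip=1)
four_bits_per_chip = min_capacity_bits(32, MEGABIT, bits_per_chip=4)

# Report the minimum data capacity in bytes for each organization.
print(one_bit_per_chip // 8, four_bits_per_chip // 8)
```

The smaller minimum capacity of the b bit per chip organization comes at the cost described above: a single chip failure can now corrupt up to b bits of one code word.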
A more serious problem is the loss of data integrity due to the miscorrection of some multiple errors. It is therefore seen that for byte organized memory systems, it is desirable to employ single byte error correcting and double byte error detecting (SBC-DBD) codes in b bit per chip memories. With such codes, errors generated by a single chip failure are always correctable and errors generated by a double chip failure are always detectable. Thus, the uncorrectable error rate can be kept low and the data integrity can be maintained. Discussions concerning codes that are suitable for this purpose can be found in the article “Error-Correcting Codes for Byte-Organized Memory Systems”, by Chin-Long Chen in Volume IT-32, No. 2 of the IEEE Transactions on Information Theory (March 1986).
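The miscorrection risk mentioned above can be illustrated with a toy SEC-DED code. The sketch below uses the (8,4) extended Hamming code purely as a stand-in; it is an assumption made for illustration and is not one of the codes discussed in this document. Position 0 holds the overall parity bit and positions 1 to 7 are the usual Hamming positions. A three-bit error, such as one failing multi-bit chip might produce, looks to the decoder like a single-bit error and is "corrected" into a different valid code word.

```python
def syndrome(word):
    """Return (s, p): s is the XOR of the indices of set bits in positions
    1..7, p is the overall parity of all 8 bits. Valid words give (0, 0)."""
    s = 0
    for pos in range(1, 8):
        if word[pos]:
            s ^= pos
    p = 0
    for bit in word:
        p ^= bit
    return s, p


def decode(word):
    """SEC-DED decision rule for the toy (8,4) extended Hamming code."""
    s, p = syndrome(word)
    fixed = list(word)
    if p == 1:
        # Odd overall parity: assume one bit error at position s
        # (s == 0 means the overall parity bit itself).
        fixed[s] ^= 1
        return "corrected", fixed
    if s == 0:
        return "clean", fixed
    # Even parity with nonzero syndrome: double error, detect only.
    return "uncorrectable", fixed


def with_errors(word, positions):
    """Return a copy of word with the given bit positions flipped."""
    out = list(word)
    for pos in positions:
        out[pos] ^= 1
    return out


stored = [0] * 8  # the all-zero word is a valid code word

single = decode(with_errors(stored, [5]))        # corrected properly
double = decode(with_errors(stored, [1, 2]))     # detected, not corrected
triple = decode(with_errors(stored, [1, 2, 4]))  # silently miscorrected
```

In the triple-error case the decoder reports a successful correction, yet the returned word differs from the stored one. This is the loss of data integrity that a byte-oriented SBC-DBD code is intended to avoid when all the erroneous bits lie within one chip.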
In the construction of error correcting codes, contrary criteria are often present. The first criterion relates to the error correcting capabilities of the code itself. The second criterion relates to the ease of encoding and, more particularly, of decoding the information received. More complex codes often require more complicated circuitry for decoding. With electronic circuit chip “real estate” being at a premium in circuit design, it is highly desirable to be able to implement both encoding and decoding circuits utilizing as few circuit gates as possible, with each gate having a minimum number of attached signal lines. Accordingly, one of the objects of the present invention is the construction of minimum cost encoders and decoders without a corresponding sacrifice of correction and detection capabilities.
In particular, one class of codes which is particularly applicable to the present invention employs three check symbols. Such a code can be described as an extended Reed-Solomon code having a parity check matrix of the form:
$$
H = \begin{bmatrix}
I & T^{a} & T^{2a} & T^{3a} & \cdots & I & O & O \\
I & T^{a+1} & T^{2(a+1)} & T^{3(a+1)} & \cdots & O & I & O \\
I & T^{a-1} & T^{2(a-1)} & T^{3(a-1)} & \cdots & O & O & I
\end{bmatrix}
\tag{1}
$$
In Equation 1 above, H is the parity check matrix, a well known concept in error correction coding. I represents the b by b identity matrix and O represents a b by b matrix all of whose elements are zero. In general, T is a matrix with b rows and b columns and represents the companion matrix of a primitive polynomial of degree b over the field GF(2). In one particular embodiment of this form of Reed-Solomon code, the integer a is taken to be zero. Further information concerning such codes is found in the aforementioned article by the present inventor.
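As a minimal sketch of how the matrix of Equation 1 might be assembled, the following builds a companion matrix T over GF(2) and stacks the three block rows. Several choices here are assumptions made only for illustration and do not come from this document: the symbol size b = 4, the primitive polynomial x^4 + x + 1, the number of data symbols k = 5 (the equation leaves the column count open), and all function names. Negative exponents are reduced modulo 2^b - 1, which relies on T having that multiplicative order.

```python
def identity(b):
    return [[int(i == j) for j in range(b)] for i in range(b)]


def zeros(b):
    return [[0] * b for _ in range(b)]


def companion(coeffs):
    """Companion matrix over GF(2) of x^b + c_{b-1} x^{b-1} + ... + c_0,
    with coeffs = [c_0, ..., c_{b-1}]. Ones on the subdiagonal, the
    coefficients in the last column."""
    b = len(coeffs)
    T = zeros(b)
    for i in range(1, b):
        T[i][i - 1] = 1
    for i in range(b):
        T[i][b - 1] = coeffs[i]
    return T


def matmul(A, B):
    """Matrix product over GF(2)."""
    return [[sum(A[i][t] & B[t][j] for t in range(len(B))) % 2
             for j in range(len(B[0]))] for i in range(len(A))]


def power(T, e):
    """T raised to a non-negative integer power e over GF(2)."""
    R = identity(len(T))
    for _ in range(e):
        R = matmul(R, T)
    return R


def hstack(blocks):
    """Place equally tall blocks side by side."""
    return [sum((blk[r] for blk in blocks), []) for r in range(len(blocks[0]))]


def parity_check(T, a, k):
    """Block rows of Equation 1 with k data symbols and 3 check symbols."""
    b = len(T)
    order = 2 ** b - 1  # assumes T comes from a primitive polynomial
    I, O = identity(b), zeros(b)
    check_parts = ([I, O, O], [O, I, O], [O, O, I])
    H = []
    for exponent, checks in zip((a, a + 1, a - 1), check_parts):
        data = [power(T, (j * exponent) % order) for j in range(k)]
        H.extend(hstack(data + checks))
    return H


T = companion([1, 1, 0, 0])      # x^4 + x + 1, assumed example polynomial
H = parity_check(T, a=0, k=5)    # a = 0 as in the embodiment mentioned above
print(len(H), len(H[0]))         # rows and columns of H in bits
```

With a = 0 the first block row consists entirely of identity matrices in the data positions, so the first check symbol is the bitwise Exclusive-OR of the data symbols.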