The present invention is generally directed to symbol level error correction coding and decoding methods and circuits. More particularly, the present invention is directed to a symbol level code that offers error correction and detection advantages in the form of reduced circuit complexity. The present code is also structured so as to provide improved error correction and detection in the face of memory chip and/or bus line failures.
The IBM iSeries server is a computer system designed and manufactured by the assignee of the present invention. There is a need for a memory controller for this system that supports 65 data bits using standard 72 bit DIMMs (Dual In-line Memory Modules). The SDRAM chips that are employed in this design are configured to provide as output, and also to receive for storage, 4 bits per chip. The error correction code (ECC) that is desired for this system should be able to correct all errors generated from single chip failures and to detect all errors generated from double chip failures. A single symbol error correcting and double symbol error detecting (SSC-DSD) code for 4 bits per symbol and 65 data bits requires a minimum of 14 check bits. The ECC word for such a code would therefore require 79 bits (65 data bits plus 14 check bits), which exceeds the 72 bit limitation of the DIMMs to be used. Using two DIMMs with 130 data bits would leave 14 bits for ECC check bits (72×2 - 65×2 = 14). However, SSC-DSD codes with 14 check bits and 130 data bits for 4 bits per symbol do not exist.
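The bit budget described above can be checked with a short calculation. The following Python sketch is illustrative only; the constant and function names are hypothetical, and the minimum of 14 check bits is taken from the text rather than derived.

```python
# Illustrative bit-budget check; all names here are hypothetical.
DIMM_WIDTH_BITS = 72       # width of one standard DIMM
DATA_BITS_PER_DIMM = 65    # data bits the controller must support per DIMM
MIN_CHECK_BITS = 14        # minimum for a 4-bit-symbol SSC-DSD code with
                           # 65 data bits, as stated in the text (not derived)


def spare_bits(num_dimms):
    """Bits left over for ECC check bits when num_dimms DIMMs are ganged."""
    return num_dimms * DIMM_WIDTH_BITS - num_dimms * DATA_BITS_PER_DIMM


# One DIMM: the ECC word would need 65 + 14 = 79 bits, more than 72.
ecc_word_bits = DATA_BITS_PER_DIMM + MIN_CHECK_BITS
fits_in_one_dimm = ecc_word_bits <= DIMM_WIDTH_BITS

# Two DIMMs: 144 bits in total, 130 of them data, 14 left for check bits.
spare_with_two_dimms = spare_bits(2)

print(ecc_word_bits, fits_in_one_dimm, spare_with_two_dimms)
```

The sketch only confirms the arithmetic; whether a code exists for a given number of check bits is a separate question of code construction that it does not address.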
More generally, for a memory array configured using an m-bit-per-chip structure, the ability of an ECC circuit to correct multi-bit errors (symbol errors) generated from chip-kills or from half chip-kills becomes an important factor in the ECC design. For memory reliability and for data integrity, the class of SSC-DSD (single symbol error correcting and double symbol error detecting) codes is very much desired, where a symbol error is an m-bit error pattern that can be generated from a chip failure in an m-bit-per-chip memory configuration.
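To make the notion of a symbol error concrete, the following Python sketch groups a word into m-bit symbols and reports which symbols differ between the stored and the read word. It is a minimal illustration under stated assumptions: the function name is hypothetical, words are modeled as integers, and symbol i is assumed to occupy bits i·m through i·m + m - 1 counting from the least significant bit. The actual mapping of bits to chips depends on the memory layout.

```python
def erroneous_symbols(written, read, word_bits, m=4):
    """Return the indices of the m-bit symbols in which two words differ.

    Assumption: symbol i occupies bits [i*m, i*m + m) of the word, counting
    from the least significant bit.
    """
    error = written ^ read               # 1 wherever a bit was flipped
    mask = (1 << m) - 1                  # selects a single m-bit symbol
    num_symbols = -(-word_bits // m)     # ceiling division
    return [i for i in range(num_symbols) if (error >> (i * m)) & mask]


# Three flipped bits confined to one 4-bit chip form a single symbol error.
single = erroneous_symbols(0, 0b1011 << 8, word_bits=72)

# Adding a flipped bit on another chip makes it a double symbol error.
double = erroneous_symbols(0, (0b1011 << 8) | (1 << 20), word_bits=72)

print(single, double)
```

Under this model an SSC-DSD code is expected to correct the first case, where one index is returned, and to detect the second, where two indices are returned, regardless of how many bits are flipped within each symbol.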