Error correcting codes (ECC) are routinely used for fault tolerance in computer memory subsystems. The most commonly used codes are single error correcting, double error detecting (SEC-DED) codes, which correct all single-bit errors and detect all double-bit errors in a code word.
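As an illustration of the single error correcting, double error detecting behavior described above, the following is a minimal sketch using an extended Hamming (8,4) code. The choice of this particular code, the 4-bit data width, and the function names encode and decode are assumptions made for the example; memory subsystems in practice use wider code words.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming code word (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    c = [0] * 8                      # c[1..7]: Hamming(7,4); c[0]: overall parity
    c[3], c[5], c[6], c[7] = d       # data bits sit at non-power-of-two positions
    c[1] = c[3] ^ c[5] ^ c[7]        # parity over positions with bit 0 set
    c[2] = c[3] ^ c[6] ^ c[7]        # parity over positions with bit 1 set
    c[4] = c[5] ^ c[6] ^ c[7]        # parity over positions with bit 2 set
    for pos in range(1, 8):
        c[0] ^= c[pos]               # overall parity makes the total weight even
    return c


def decode(word):
    """Return (data, status); data is None when a double error is detected."""
    c = list(word)
    syndrome = 0
    overall = 0
    for pos in range(8):
        if c[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
            overall ^= 1
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        c[syndrome] ^= 1             # single error: syndrome is its position
        status = "corrected"
    else:
        return None, "double-error"  # even parity but nonzero syndrome
    data = c[3] | (c[5] << 1) | (c[6] << 2) | (c[7] << 3)
    return data, status
```

Flipping any one bit of an encoded word yields the original data with status "corrected"; flipping any two bits yields the "double-error" status rather than a miscorrection.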
As chip manufacturing trends toward larger chip capacity, more memory subsystems will be configured with b bits per chip. The most appropriate symbol ECCs for such memory are single symbol error correcting, double symbol error detecting (SbEC-DbED) codes, where “b” is the width of the memory device (the number of bits in its output). These codes correct all single symbol errors and detect all double symbol errors in a code word.
A memory designed with an SbEC-DbED code can continue to function when a memory chip fails, regardless of its failure mode. If a second chip later fails and the two failing chips line up in the same ECC word, the SbEC-DbED code provides the error detection needed to protect the data integrity of the memory.
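The reason a chip failure is covered regardless of failure mode can be sketched as follows: every bit a chip contributes to the code word falls in the same b-bit symbol, so any error pattern confined to one chip is a single symbol error. The sketch below only models that guarantee; it does not implement an SbEC-DbED code. The function names, the 16-bit word, and b = 4 are hypothetical choices for illustration.

```python
def failing_symbols(error_bits, b):
    """Return the indices of the b-bit symbols (chips) touched by an error pattern."""
    if b <= 0 or len(error_bits) % b != 0:
        raise ValueError("code word length must be a positive multiple of b")
    return sorted({i // b for i, bit in enumerate(error_bits) if bit})


def classify(error_bits, b):
    """Map an error pattern to what an SbEC-DbED code guarantees about it."""
    count = len(failing_symbols(error_bits, b))
    if count == 0:
        return "no error"
    if count == 1:
        return "correctable"      # single symbol error
    if count == 2:
        return "detectable"       # double symbol error
    return "not guaranteed"       # more than two symbols in error


# A 16-bit word built from four 4-bit chips; chip 2 fails on all of its bits.
one_chip = [0] * 8 + [1] * 4 + [0] * 4
# Chips 0 and 3 each contribute one bad bit to the same word.
two_chips = [1] + [0] * 14 + [1]

print(failing_symbols(one_chip, 4), classify(one_chip, 4))    # [2] correctable
print(failing_symbols(two_chips, 4), classify(two_chips, 4))  # [0, 3] detectable
```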
Existing and imminent memory systems utilize eighteen memory devices. However, present SbEC-DbED error correcting codes require 36 memory devices to provide chip fail correction and detection. Cost therefore increases because of the added expense of 36 memory devices for error correcting purposes, and the codes are inflexible because they do not scale (adapt) to memory systems with eighteen memory devices. Furthermore, the circuits for encoding and decoding are complex, which adds to the cost and design complexity of computer systems that must ensure data integrity.