1. Field of the Invention
The present invention relates to error correction and detection systems, methods, and procedures for digital memories.
2. Description of the Prior Art
The increased vulnerability of high-density memory systems to both hard and soft errors has led to greater interest in the use of error-correcting codes to improve system reliability. See Intel Corporation Memory Design Handbook, January 1981, pp. 4-13 to 4-33. An approach now common in 16-bit word machines is to encode a word using a (22,16,4) single-error-correcting (SEC) double-error-detecting (DED) modified Hamming code, as suggested by Hsiao, M. Y., "A Class of Optimal Minimum Odd-Weight-Column SEC-DED Codes," IBM Jour. Res. Develop., July 1970, pp. 395-401. To those skilled in the coding arts, the first number 22 in the parenthetical expression (22,16,4) represents the total number of bits in a codeword, the second number 16 represents the number of information bits in a codeword, and the third number 4 represents the minimum distance of the code. By "minimum distance" is meant the smallest number of positions in which any two distinct codewords differ.
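The (n, k, d) notation and the definition of minimum distance given above can be checked by brute force on a small code. The sketch below is illustrative only: it does not use the (22,16,4) code cited above but a much shorter code of the same SEC-DED type, an (8,4,4) extended Hamming code, and the parity matrix `P` and all function names are assumptions made for this example.

```python
from itertools import combinations, product

# Parity part of a systematic (7,4) Hamming generator matrix (illustrative choice).
P = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]

def encode(data):
    """Encode 4 information bits into an 8-bit extended Hamming codeword."""
    checks = [sum(d * row[j] for d, row in zip(data, P)) % 2 for j in range(3)]
    word = list(data) + checks
    word.append(sum(word) % 2)  # overall parity bit raises the distance from 3 to 4
    return tuple(word)

def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(codewords):
    """Smallest Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))

codewords = [encode(bits) for bits in product((0, 1), repeat=4)]
n = len(codewords[0])            # total bits per codeword
k = 4                            # information bits per codeword
d = minimum_distance(codewords)  # minimum distance of the code
print((n, k, d))
```

A minimum distance of 4 is what permits single-error correction together with double-error detection: a single error leaves the received word closer to one codeword than to any other, while a double error leaves it at distance 2 or more from every codeword, so it cannot be mistaken for a correctable single error.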
Supporting modified Hamming code chips are commercially available (e.g., Texas Instruments SN54/74S630). Elkind and Siewiorek suggest the use of a block code in a similar way in "Reliability and Performance of Error-Correcting Memory and Register Arrays," IEEE Trans. Comp., Vol. C-29, No. 10, October 1980, pp. 920-927. Bossen and Hsiao discuss the use of a longer code of the same type in conjunction with a more powerful microcode and hardware correction algorithm. See Bossen, D. C. and Hsiao, M. Y., "A System Solution to the Memory Soft Error Problem," IBM Jour. Res. Develop., Vol. 24, No. 3, May 1980, pp. 390-397.
U.S. Pat. No. 4,277,844 issued July 7, 1981 to Robert J. Hancock, et al. discloses an error detection and correcting method employing a combination of a Hamming code and a cyclic redundancy code.
Barton, in "A Fault Tolerant Integrated Circuit Memory," Ph.D. Thesis, Computer Science, Calif. Inst. Tech., April 1980, proposes the construction of a hierarchical memory based on multi-layer error-correcting coding that can tolerate numerous faults. His design constitutes an architectural realization of the product codes of Elias described in "Error-free Coding," IRE Trans. Inform. Theory, Vol. PGIT-4, pp. 29-37, September 1954.
A product code is a rectangular array in which each row is a codeword of one error-correcting code and each column is a codeword of a second error-correcting code, which may be the same as the first. Both codes are usually linear.
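The row-and-column structure just described can be sketched with the simplest possible component code. The example below assumes a single even-parity bit as both the row code and the column code; this choice, and all function names, are assumptions made for illustration and are not taken from Barton or Elias. A single parity bit on its own only detects errors, but the product of two such codes can locate and correct one flipped bit at the intersection of the failing row and the failing column.

```python
def encode_product(data):
    """Append an even-parity bit to each row, then append a parity row."""
    rows = [row + [sum(row) % 2] for row in data]
    parity_row = [sum(col) % 2 for col in zip(*rows)]
    return rows + [parity_row]

def failing_checks(array):
    """Return the indices of rows and columns whose parity is odd."""
    bad_rows = [i for i, row in enumerate(array) if sum(row) % 2]
    bad_cols = [j for j, col in enumerate(zip(*array)) if sum(col) % 2]
    return bad_rows, bad_cols

def correct_single_error(array):
    """Flip the bit at the unique failing row/column intersection, if any.

    Returns True when a correction was made, False otherwise.
    """
    bad_rows, bad_cols = failing_checks(array)
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        array[bad_rows[0]][bad_cols[0]] ^= 1
        return True
    return False

data = [[1, 0, 1],
        [0, 1, 1]]
stored = encode_product(data)          # 3 x 4 array: every row and column has even parity
received = [row[:] for row in stored]
received[1][2] ^= 1                    # inject a single bit error
located = failing_checks(received)     # row 1 and column 2 fail their checks
corrected = correct_single_error(received)
```

In this sketch 6 information bits occupy a 12-bit array, which illustrates how short component codes lead to a large fraction of stored bits being parity checks.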
All of these approaches have a relatively high coding overhead inasmuch as a substantial fraction of the physical memory is devoted to the storage of redundant parity checks. This arises from the use of relatively short coding blocks chosen for the sake of keeping the encoding and decoding times short, since these times are a substantial part of the resulting memory access times.
The use of an error-correcting code for protecting a memory inevitably involves a trade-off between coding efficiency on the one hand, and computation and communication costs on the other. As is well known in coding theory, to correct a fixed number of errors with very little coding overhead the block length of the code must be large, enabling dependencies to be distributed over many bits. Correspondingly, however, the computation of the correct value of any bit of the codeword must involve a large number of the bits of the codeword, implying in turn the need for communication lines to access the bits and delay in the computations that interrelate them.
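The dependence of coding overhead on block length can be made concrete for SEC-DED codes of the modified Hamming type. The sketch below assumes the usual counting argument: r check bits suffice for single-error correction of k information bits when 2^r >= k + r + 1, and one further check bit is added for double-error detection. The function name and the chosen word sizes are illustrative assumptions.

```python
def secded_check_bits(k):
    """Check bits needed by a SEC-DED modified Hamming code for k information bits."""
    r = 1
    # r check bits must distinguish "no error" from an error in any of the k + r positions.
    while 2 ** r < k + r + 1:
        r += 1
    return r + 1  # one additional check bit provides double-error detection

table = []
for k in (8, 16, 32, 64, 128):
    r = secded_check_bits(k)
    n = k + r
    table.append((n, k, r / n))
    print(f"({n},{k}) code: {r} check bits, {100 * r / n:.1f}% of stored bits are redundant")
```

For k = 16 this reproduces the 22-bit codeword of the (22,16,4) code discussed above. The redundant fraction falls as k grows, but each check bit then depends on more information bits, which is the communication and delay cost described in the preceding paragraph.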