The invention relates generally to memory systems and more specifically to correction of errors within memory systems.
Memory systems made using large scale integrated circuit techniques have proven to be cost effective for certain applications of storing digital data. Most memory systems comprise a plurality of similar storage devices or bit planes, each of which is organized to contain as many storage cells or bits as feasible in order to reduce per bit costs, and also to contain addressing, read, and write circuits in order to minimize the number of connections to each storage device. Because of the one bit organization of the storage device, single bit error correction as described by Hamming in the publication "Error Detecting and Correcting Codes", R. W. Hamming, The Bell System Technical Journal, Volume XXIX, April, 1950, No. 2, pp. 147-160, has proven quite effective in correcting the error of a single cell or bit in a given word, i.e., a single bit error, the word being of a size equal to the word size of the memory system. This increases the effective mean time between failures (MTBF) of the memory system.
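The single bit error correction referred to above can be illustrated with a minimal Python sketch of a Hamming code. This is a software illustration only, not the circuitry of any memory system discussed here; the function names (`syndrome`, `hamming_encode`, `correct_single`) and the list-of-bits representation are assumptions made for the example. Check bits sit at the power-of-two positions, and the syndrome of a word with one flipped bit equals the position of that bit.

```python
def syndrome(code):
    """XOR of the positions (1..n) of all set bits; 0 means no detected error."""
    s = 0
    for pos in range(1, len(code)):
        if code[pos]:
            s ^= pos
    return s


def hamming_encode(data_bits):
    """Build a Hamming codeword as a list of 0/1 values.

    Index 0 is unused so that list indices match the usual 1-based
    bit positions. Check bits occupy positions 1, 2, 4, 8, ...
    """
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:  # smallest r with 2**r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)
    it = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):  # not a power of two: a data position
            code[pos] = next(it)
    # Set the check bits so that the syndrome of the finished word is zero.
    s = syndrome(code)
    for k in range(r):
        if s & (1 << k):
            code[1 << k] = 1
    return code


def correct_single(code):
    """Return (corrected copy, syndrome). Assumes at most one bit is in error."""
    s = syndrome(code)
    fixed = list(code)
    if s != 0 and s < len(fixed):
        fixed[s] ^= 1  # the syndrome names the failing position
    return fixed, s
```

For example, encoding four data bits yields a seven bit codeword; complementing any one of the seven positions and calling `correct_single` restores the original codeword. With two bits in error this plain code miscorrects, which is the limitation the remainder of this background addresses.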
Because the storage devices are quite complex, and because many are used in a memory system, they usually represent the predominant source of failure in a memory system. Consequently, it is common practice to employ some form of single bit error correction along the lines described in Hamming. While single bit error correction allows for tolerance of single bit failures, as more storage cells fail and the errors are left to accumulate in the system, the statistical chance of finding two failures in the same word, i.e., a double bit error, increases. While the method to accomplish double bit error correction as suggested by Hamming has been known in the art for some time, the cost of the additional circuitry required has made the technique economically unfeasible for most commercial applications.
Considerable research has been directed toward solving the problem of multiple bit error correction, as the economics of semiconductor technology tend to force the utilization of larger and larger storage devices that contain malfunctioning individual bits.
Many techniques are known which accomplish multiple bit error correction in memory systems utilizing single bit error correction as taught by Hamming. One such technique is taught by J. H. Scheuneman, et al, in pending United States Patent Application Serial No. 871,048, now U.S. Pat. No. 4,163,147, assigned to the assignee of the present invention. J. H. Scheuneman, et al, teach the complementing of all combinations of two bit positions of the erroneous data word until the single bit error correction circuitry of the memory system indicates that no error (or a correctable single bit error) is present. For a memory system having a word size of N bits, however, as many as N(N-1)/2 iterations of the complementing process may be required.
This may require a long time to correct errors in memory systems having a large word size (i.e., N is large).
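The N(N-1)/2 figure above is simply the number of distinct pairs of bit positions in an N bit word. The short Python sketch below enumerates those pairs and evaluates the worst case count; it does not model the complementing circuitry itself, and the function names and the sample word sizes are assumptions chosen for illustration.

```python
from itertools import combinations


def candidate_pairs(n_bits):
    """Yield every unordered pair of distinct bit positions in an n_bits word."""
    return combinations(range(n_bits), 2)


def worst_case_trials(n_bits):
    """Upper bound on pair-complementing iterations: N(N-1)/2."""
    return n_bits * (n_bits - 1) // 2


if __name__ == "__main__":
    # Sample word sizes are illustrative assumptions, not taken from the text.
    for n in (8, 16, 32, 64, 72):
        print(n, worst_case_trials(n))
```

A 72 bit word, for instance, has 2556 candidate pairs, so the count grows roughly as the square of the word size.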
A second technique is taught by J. H. Scheuneman, et al, in pending U.S. Application Ser. No. 827,540, now U.S. Pat. No. 4,139,148, also assigned to the assignee of the present invention. J. H. Scheuneman, et al, therein teach the storing or logging of the syndrome bits of a single bit (correctable) error in a location corresponding to the addressable location wherein that single bit error was observed. A subsequent second failing bit position at the same addressable location may then be corrected by the single bit error correction circuitry after first complementing the initial failing bit position as identified by the stored syndrome bits. This technique, though faster, requires a substantial amount of additional storage capacity to store the syndrome bits for each addressable location. The technique furthermore assumes that the multiple bit errors are not first observed on a single read cycle.
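The syndrome logging idea described above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the implementation of the cited patent: it assumes a Hamming code extended with an overall parity bit (so that a double bit error can be told apart from a single bit error), a dictionary standing in for the per-address syndrome store, and hypothetical names (`encode`, `syndrome_and_parity`, `LoggingCorrector`).

```python
def syndrome_and_parity(word):
    """Return (Hamming syndrome, overall parity) of a word.

    word[0] is an overall parity bit (an assumption of this sketch);
    word[1:] hold Hamming positions 1..n.
    """
    s = 0
    for pos in range(1, len(word)):
        if word[pos]:
            s ^= pos
    return s, sum(word) % 2


def encode(data_bits):
    """Encode data bits into an extended Hamming word with even overall parity."""
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:
        r += 1
    word = [0] * (m + r + 1)
    it = iter(data_bits)
    for pos in range(1, m + r + 1):
        if pos & (pos - 1):  # data positions are the non-powers of two
            word[pos] = next(it)
    s, _ = syndrome_and_parity(word)
    for k in range(r):
        if s & (1 << k):
            word[1 << k] = 1
    word[0] = sum(word) % 2  # make the total number of set bits even
    return word


class LoggingCorrector:
    """Logs the syndrome of the first single bit error seen at each address."""

    def __init__(self):
        self.log = {}  # address -> logged syndrome (failing bit position)

    def read(self, address, word):
        word = list(word)
        s, p = syndrome_and_parity(word)
        if s == 0 and p == 0:
            return word  # no error detected
        if p == 1:
            # Odd parity: treated as a single bit error at position s.
            if s >= len(word):
                raise ValueError("uncorrectable error")
            self.log[address] = s
            word[s] ^= 1
            return word
        # Even parity with nonzero syndrome: treated as a double bit error.
        if address not in self.log:
            raise ValueError("double bit error with no logged syndrome")
        word[self.log[address]] ^= 1  # complement the earlier failing bit
        s, p = syndrome_and_parity(word)
        if p != 1 or s >= len(word):
            raise ValueError("uncorrectable error")
        word[s] ^= 1  # ordinary single bit correction of the second bit
        return word
```

In this sketch a double bit error at an address with no logged syndrome raises an exception, which mirrors the limitation noted above: the approach relies on the first failing bit having been observed alone on an earlier read. The dictionary also makes the storage cost visible, since one entry is needed per addressable location that has shown an error.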
A similar technique is taught in the publication "Development of a Space Borne Memory with a Single Error and Erasure Correction Scheme," C. J. Black, et al, published at the 7th Annual International Conference on Fault Tolerant Computing by the IEEE Computer Society, Los Angeles, California, June 28-30, 1977, Pages 50 through 55. Whereas C. J. Black, et al, teach storage of the syndrome bits in nondedicated memory, reducing the total additional memory requirement, this saving probably reduces the reliability of the technique substantially because it restricts the number of addressable locations for which multiple bit errors may be corrected.
A number of techniques are also employed which are intended to prevent the occurrence of, or to forecast, multiple bit errors. R. D. Rothenberger, in pending U.S. Patent Application Ser. No. 886,362, now abandoned, also assigned to the assignee of the present invention, teaches the relocation of data from those addressable locations identified to contain an error, thereby attempting to prevent multiple bit errors. This technique does require additional memory capacity for the relocated data, however, and assumes that single bit errors will be observed at an addressable location before a multiple bit error occurs.
Error logging is the technique used in attempting to forecast multiple bit errors. Petschauer, in U.S. Pat. No. 3,999,051, describes such an error logging scheme. Error logging does not, however, correct multiple bit errors.
The present invention provides actual multiple bit error correction utilizing a minimum of additional hardware within a minimum amount of time.