The present invention relates generally to error checking and correction (ECC) circuits for computer memory systems, and more particularly to circuits for detecting failures in ECC circuitry.
Each new generation of computer systems substantially increases the number of high-bit-density chips utilized in the memory. This chip increase provides a corresponding increase in memory capacity. However, such large-capacity memory systems utilizing high-density memory chips are much more susceptible to memory chip failure. The most common types of chip failures include single-cell, word-line, bit-line, and chip-fail faults. In addition to these hard faults, computer chip memories are susceptible to soft errors caused by alpha-particle radiation.
It has long been recognized that the integrity of the data bits stored in and retrieved from such memories is critical to the accuracy of calculations performed in the data processing system. In this regard, the alteration of one bit in a data word could dramatically affect arithmetic calculations or could change the meaning of recorded data.
Accordingly, in order to minimize the consequences of hard and soft memory errors, error checking and correction (ECC) circuits are routinely included in computer systems. These ECC circuits typically utilize an error correcting code in the class of codes known as single-error-correcting, double-error-detecting (SEC-DED) codes. Such SEC-DED codes are capable of correcting one error per data word and detecting two errors therein. Of particular advantage are the odd-weight-column error codes because of the speed, cost, and reliability of the attendant decoding logic. Examples of such codes are disclosed in FIG. 3 of Chen and Hsiao, IBM J. Res. Develop., Vol. 28, No. 2, March 1984.
The above-described ECC circuits with their error correcting codes require the storage of a predetermined number of check bits, C.sub.j, along with the data bits, D.sub.i, in the ECC word. For example, for 64 data bits, D.sub.i, typically 8 check bits, C.sub.j, are generated by means of an error-correcting-code algorithm circuit which implements an algorithm of the type disclosed in the above article. These check bits are then stored along with the word data bits. Upon readout, the data bits, D.sub.i, read from an addressable memory location, are again run through an error correcting code algorithm circuit to generate a second set of check bits, G.sub.j. This newly generated set of check bits is compared to the memory stored check bits, C.sub.j, to obtain syndrome bits, S.sub.j. If any of these syndrome bits is a one, indicating a difference in the compared check bits G.sub.j and C.sub.j, then it is known that the stored data word contains an error. If it is a single error, then the syndrome bits, S.sub.j, may be decoded to determine the error location in the word, and the error corrected.
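The check-bit and syndrome flow described above can be illustrated with a small software sketch. This is not the 64-data-bit, 8-check-bit code of the cited article: it assumes a hypothetical toy odd-weight-column code with 8 data bits and 5 check bits, in which each data bit is assigned a distinct weight-3 column. The names `DATA_COLUMNS`, `generate_check_bits`, `syndrome`, and `decode` are illustrative only.

```python
from itertools import combinations

R = 5  # number of check bits in this toy code (assumption)
K = 8  # number of data bits in this toy code (assumption)

# Hypothetical parity-check columns: one distinct weight-3, 5-bit vector per
# data bit. Check-bit columns are implicitly the weight-1 vectors.
DATA_COLUMNS = [sum(1 << b for b in combo)
                for combo in combinations(range(R), 3)][:K]


def generate_check_bits(data_bits):
    """Return the check bits as an integer: XOR of the columns of all 1 data bits."""
    check = 0
    for i, bit in enumerate(data_bits):
        if bit:
            check ^= DATA_COLUMNS[i]
    return check


def syndrome(data_bits, stored_check):
    """Regenerate check bits G from the read data and compare with stored C."""
    return generate_check_bits(data_bits) ^ stored_check


def decode(data_bits, stored_check):
    """Classify the read word and correct a single data-bit error if possible."""
    s = syndrome(data_bits, stored_check)
    weight = bin(s).count("1")
    if s == 0:
        return "no_error", list(data_bits)
    if weight % 2 == 0:
        # Two odd-weight columns XOR to a nonzero even-weight syndrome.
        return "double_error", list(data_bits)
    if s in DATA_COLUMNS:
        corrected = list(data_bits)
        corrected[DATA_COLUMNS.index(s)] ^= 1  # flip the located data bit
        return "corrected_data_bit", corrected
    if weight == 1:
        # Syndrome matches a check-bit column; the data bits are intact.
        return "check_bit_error", list(data_bits)
    return "uncorrectable", list(data_bits)


data = [1, 0, 1, 1, 0, 0, 1, 0]
stored = generate_check_bits(data)   # check bits C written alongside the data

single = list(data)
single[3] ^= 1                       # one data bit flipped in memory
double = list(data)
double[0] ^= 1
double[5] ^= 1                       # two data bits flipped in memory
```

In this sketch, `decode(single, stored)` locates and restores the flipped bit, while `decode(double, stored)` reports a double error without attempting correction, because the XOR of two odd-weight columns always has even weight.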
However, the above-described ECC circuits for generating the syndrome bits, and the additional memory necessary to store the check bits, C.sub.j, are both subject to failures. In this regard, errors can occur in the generation of the error correction code signals through circuit faults, through the erroneous recording or readback of the error correction code signals, or through read/write circuit failures. Such failures would lead to the indication of erroneous data, with the possibility of correct data being altered in the ECC circuit, when, in fact, the error occurred in the error checking and correction circuit.
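As a rough illustration of how a fault in the ECC circuit can cause correct data to be altered, the following sketch models one possible failure: an open input in the read-side check-bit generator. It assumes a hypothetical toy odd-weight-column code with 8 data bits and 5 check bits, not the code of the cited article, and the names `COLUMNS`, `check_bits`, `read_and_correct`, and `broken_input` are illustrative only.

```python
from itertools import combinations

# Hypothetical toy code: 8 data bits, 5 check bits, one distinct weight-3
# column per data bit.
COLUMNS = [sum(1 << b for b in c) for c in combinations(range(5), 3)][:8]


def check_bits(data, broken_input=None):
    """XOR the columns of all 1 data bits.

    If broken_input is given, that data input is ignored, modeling an
    open-circuit fault in the generator's XOR tree.
    """
    g = 0
    for i, bit in enumerate(data):
        if bit and i != broken_input:
            g ^= COLUMNS[i]
    return g


def read_and_correct(data, stored_check, broken_input=None):
    """Form the syndrome and flip whichever data bit it points to, if any."""
    s = check_bits(data, broken_input) ^ stored_check
    out = list(data)
    if s in COLUMNS:
        out[COLUMNS.index(s)] ^= 1
    return s, out


data = [1, 0, 1, 1, 0, 0, 1, 0]
stored = check_bits(data)            # written by a fault-free generator

# The memory contents are intact, but the read-side generator ignores input 2.
s_faulty, out_faulty = read_and_correct(data, stored, broken_input=2)
s_good, out_good = read_and_correct(data, stored)
```

Here the stored word is error-free, yet the faulty generator yields a syndrome equal to the column of data bit 2, so the correction logic flips a bit that was correct. With a fault-free generator the syndrome is zero and the data passes through unchanged.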
The invention as claimed is intended to provide a fault detection capability for the ECC circuit itself.
The advantage offered by the present invention is that it provides the foregoing fault detection of the ECC circuit, but without replicating, and independent of, the ECC circuit. An additional advantage is that the present invention provides this fault detection capability without the need for inserting an additional bit in the ECC word. Finally, the present circuit may be used to quickly determine if all of the data bits in an ECC word are correct a number of cycles in advance of the completion of normal ECC operations.