1. FIELD OF THE INVENTION
The invention relates to error detection and correction systems, and more particularly, to an error detection and correction circuit for storage and retrieval of data in computer memories.
2. DESCRIPTION OF THE RELATED ART
In a computer system, data transmitted to and from the memory can be erroneous due to a variety of factors, such as faulty components, inadequate design tolerances and noise in the communications channel or bus. As the size of the memory increases, more components are present and subject to failure, and the mean time between failures usually diminishes. Thus, in a large memory system, the potential frequency of errors becomes a significant hazard and the errors are almost impossible to prevent.
To prevent corrupted data from being used, computer manufacturers incorporate error detection and correction circuitry into computer memory systems. Numerous methods have been developed and implemented, but the simplest and best-known error detection code is a single-bit parity code. To implement a parity code, a single bit is appended to the end of the data word stored in the memory. For even parity systems, the value of the parity bit is assigned so that the total number of "1"s in the stored word, including the parity bit, is even. For odd parity, the parity bit is assigned so that the total number of "1"s is odd. When the stored word is read, if one of the bits is erroneous, the desired parity will not be achieved. The parity error is detected by comparing the stored parity bit in the memory to a regenerated check bit calculated for the data word as it is retrieved from the memory.
One of the limitations of using the single bit parity code is that only single bit read errors can be detected. For example, if a two-bit error occurs, the parity value for the data remains the same as the stored parity bit because the total number of "1"s in the word stays odd or even. In addition, even though an error may be detected, the single bit parity code cannot determine which bit is erroneous and therefore cannot correct the error.
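The parity scheme and its limitation can be illustrated with a minimal sketch. The function names `parity_bit` and `parity_error` and the 8-bit sample word are assumptions chosen for illustration only; they do not come from any particular memory system.

```python
def parity_bit(data: int, even: bool = True) -> int:
    """Return the parity bit for a data word held in a non-negative int."""
    ones = bin(data).count("1")
    bit = ones & 1                 # 1 if the data word holds an odd number of 1s
    return bit if even else bit ^ 1


def parity_error(data: int, stored_parity: int, even: bool = True) -> bool:
    """Compare the stored parity bit with one regenerated from the retrieved data."""
    return parity_bit(data, even) != stored_parity


word = 0b10110010                  # hypothetical data word containing four 1s
p = parity_bit(word)               # even parity bit stored alongside the word

single = word ^ 0b00000100         # one bit flipped on read
double = word ^ 0b00010100         # two bits flipped on read

print(parity_error(word, p))       # False: no error
print(parity_error(single, p))     # True: single-bit error detected
print(parity_error(double, p))     # False: two-bit error escapes detection
```

In the two-bit case one "1" becomes a "0" and one "0" becomes a "1", so the count of "1"s is unchanged and the regenerated parity bit matches the stored one.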
To provide error correction and more effective error detection, various error correction codes were developed which not only determine that an error has occurred, but also indicate which bit is erroneous. The best-known error correction method is the Hamming code, of which many variations have been developed. Basically, the Hamming code involves appending a series of check bits to the data word as it is being stored into the memory. When a data word is read, the retrieved check bits are compared to regenerated check bits calculated for the retrieved data word. The result of the comparison indicates whether an error has occurred and, if so, which bit is erroneous. By inverting the erroneous bit, the error is corrected. In addition, a Hamming code can detect two-bit errors, which would escape detection under a single-bit parity system.
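The store, compare and invert steps described above can be sketched as follows. This is one textbook arrangement, assumed here for illustration: check bits occupy the power-of-two positions of the stored word and the comparison result (the syndrome) is read as the position of the erroneous bit. The function names and the 8-bit sample word are hypothetical, and the sketch covers single error correction only.

```python
def hamming_encode(data_bits):
    """Interleave data bits with check bits placed at positions 1, 2, 4, 8, ..."""
    code = [0]                               # index 0 unused; positions are 1-based
    remaining = list(data_bits)
    while remaining:
        pos = len(code)
        if (pos & (pos - 1)) == 0:           # power of two: reserve for a check bit
            code.append(0)
        else:
            code.append(remaining.pop(0))
    p = 1
    while p < len(code):
        parity = 0
        for pos in range(1, len(code)):
            if pos & p:                      # check bit p covers positions with bit p set
                parity ^= code[pos]
        code[p] = parity                     # slot was 0, so this is the XOR of covered data
        p <<= 1
    return code[1:]


def hamming_syndrome(code):
    """XOR of the positions holding a 1.

    Bit i of the result equals stored check bit 2**i XOR its regenerated value,
    so this is the comparison of retrieved and regenerated check bits.
    """
    syndrome = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            syndrome ^= pos
    return syndrome


def hamming_correct(code):
    """Invert the bit named by a nonzero syndrome; return (word, syndrome)."""
    code = list(code)
    s = hamming_syndrome(code)
    if 0 < s <= len(code):
        code[s - 1] ^= 1
    return code, s


data = [1, 0, 1, 1, 0, 0, 1, 0]              # hypothetical 8-bit data word
stored = hamming_encode(data)                # 12-bit word: 8 data bits + 4 check bits
corrupted = list(stored)
corrupted[6] ^= 1                            # flip the bit at position 7
fixed, s = hamming_correct(corrupted)
print(s, fixed == stored)                    # 7 True
```

A syndrome of zero means the retrieved and regenerated check bits agree; any other value in range names the single position to invert.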
In the Hamming code, the number of check bits K required to achieve single error correction for an N-bit data word is calculated according to Equation 1:

$$K \geq \log_2 (N + K) \qquad (1)$$

Thus, for a 16-bit data word, i.e., N=16, the number of check bits required must be greater than or equal to 5.
For a 32-bit data word, the number of check bits required must be greater than or equal to 6. For a 64-bit data word, the number of check bits required must be greater than or equal to 7.
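The check bit counts quoted above can be reproduced with a short calculation. The sketch below applies Equation 1 exactly as stated, rewritten in the equivalent integer form 2^K >= N + K to avoid floating-point logarithms; the function name `min_check_bits` is an assumption for illustration.

```python
def min_check_bits(n: int) -> int:
    """Smallest K satisfying Equation 1, K >= log2(N + K), for an N-bit data word."""
    k = 1
    while (1 << k) < n + k:        # 2**k < N + k means K is still too small
        k += 1
    return k


for width in (16, 32, 64):
    print(width, min_check_bits(width))
# 16 5
# 32 6
# 64 7
```

Each doubling of the data width adds only one check bit, which is why the relative overhead of the code falls as the memory data width grows.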
In high performance computer systems, the data width of the memory array is typically 32, 64, 128 or more bits. With the advent of such processors as the Pentium microprocessor from Intel Corporation, which is a 64-bit microprocessor, the desired minimum memory data width is 64 bits. Generally, in high performance computer systems, data is transferred between the memory array and the microprocessor through data buffers, which are typically implemented with ASICs. The use of ASICs allows the data registers to be located on the same chip with associated circuitry which performs multiplexing, parity checking and error correction functions. However, due to the large data width of the memory array, multiple ASICs are utilized to reduce the number of pins required on each ASIC. This is done to reduce cost, as ASICs with larger pin counts are generally much more expensive than ASICs with lower pin counts.
Thus, if the data buffer is implemented with two or more chips, each chip must include its own error correction circuitry. Because the error correction code algorithm is developed for a predefined memory data width, the straightforward solution would be to have each data buffer chip provide and receive all of the check bits to perform the proper error correction and detection function. However, having to provide all of the check bits to each individual data buffer chip increases the number of I/O pins required on the chips. In addition, extra loading is placed on the check bits. Therefore, an alternative method of implementing the error correction circuitry is desired that would reduce the number of pins required on the data buffer chips and that would not place extra loading on the check bit signals.