1. Field of the Invention
The present invention generally relates to error checking and correction (ECC) in general purpose digital computers and, more particularly, to an implementation of pipelined ECC in cache memories.
2. Description of the Prior Art
Cache memories are high-speed buffer storage containing frequently accessed instructions and data. Such memories are typically used between the central processing unit (CPU) of a computer system and bulk or main storage and/or direct access storage devices (DASDs) which have a much longer access time than the cache memories. Thus, the purpose of cache memories is to reduce access time and thereby maximize the processing efficiency of the CPU.
As the cache size increases and memory cell size decreases, the soft error rate of the cache can increase significantly. This poses an important reliability problem for many applications.
ECC schemes for single error correction and double error detection that guard against soft errors for most cache applications are known in the art. To write, the data goes to an ECC-bit generation circuit (e.g., Hamming circuitry if a Hamming code is used) to generate ECC bits which are then written into the cache array with the data. To read, the data and the ECC bits go to the Hamming circuitry to generate the syndrome bits which are used as the input to a decoder. The output of the decoder is then exclusive-ORed with the data to correct any single incorrect bit. Error flags for single error and double error are also generated by the decoder. When a single error flag is detected, the corrected data and address are latched. The single error in the cache array is then corrected by writing the corrected data back into the cache array. When a double error flag is detected, the system may not be recoverable; however, the probability of that happening is so low that it is not a practical concern.
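By way of illustration only, the write and read paths just described can be sketched in software. The sketch below assumes an extended Hamming code over 8 data bits with even parity. The function names, the bit layout (check bits at power-of-two positions, overall parity at position 0) and the status strings are assumptions made for this example and are not taken from any particular prior art circuit.

```python
def hamming_encode(data, k=8):
    """Return a SEC-DED codeword as a list of bits (assumed layout).

    code[0] is the overall parity bit; code[1..n] use the classic Hamming
    layout with check bits at the power-of-two positions.
    """
    r = 0
    while (1 << r) < k + r + 1:      # smallest r with 2**r >= k + r + 1
        r += 1
    n = k + r
    code = [0] * (n + 1)
    d = 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            code[pos] = (data >> d) & 1
            d += 1
    for i in range(r):               # each check bit covers positions with bit i set
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p and pos != p:
                parity ^= code[pos]
        code[p] = parity
    code[0] = sum(code[1:]) % 2      # make the parity of the whole codeword even
    return code


def hamming_decode(code):
    """Return (data, status); status is 'ok', 'single' or 'double'."""
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):      # syndrome = XOR of positions holding a 1
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2          # 1 means an odd number of bits flipped
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        fixed[syndrome] ^= 1         # exclusive-OR corrects the single bad bit
        status = "single"
    else:
        status = "double"            # detected but not correctable
    data, d = 0, 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):
            data |= fixed[pos] << d
            d += 1
    return data, status
```

In this sketch a "single" status corresponds to the single error flag, after which the corrected data would be written back to the array, and a "double" status corresponds to the double error flag.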
A further complication in implementing ECC is that the minimum unit of a STORE operation is a byte in most general purpose digital computer architectures (e.g., IBM S/370, DEC VAX, Intel 80386, etc.). A straightforward implementation of an ECC cache therefore uses the 13/8 Hamming code, with separate check bits (ECC bits) for error correction on each byte. However, the overhead of this approach is very large: five check bits for every eight data bits. Error correction on wider data is more practical; e.g., the 72/64 or 137/128 Hamming code implementation.
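The overhead comparison can be made concrete with a short calculation. The sketch below assumes the usual single error correction, double error detection construction, in which r Hamming check bits satisfy 2^r >= k + r + 1 for k data bits and one further overall parity bit is added. The function name is hypothetical.

```python
def secded_check_bits(k):
    """Number of check bits for SEC-DED over k data bits (assumed construction)."""
    r = 0
    while (1 << r) < k + r + 1:   # Hamming bound for single error correction
        r += 1
    return r + 1                  # one extra overall parity bit for double detection


for k in (8, 64, 128):
    c = secded_check_bits(k)
    print(f"{k + c}/{k}: {c} check bits, overhead {c / k:.1%}")
```

Under these assumptions the three codes named above need 5, 8 and 9 check bits, i.e., an overhead of 62.5%, 12.5% and about 7.0% of the data bits, respectively.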
FIG. 1 illustrates the data flow for a straightforward implementation of an ECC cache with the 72/64 Hamming code. The operation of this cache is best understood by following the data flow for a STORE operation. Assume that a byte is to be stored as the second byte of the double word at address A3. The operations needed to accomplish this are as follows:
1) Read out the double word and the associated check bits from the cache array 12 at address A3 (current content of the memory address register (MAR) 11);
2) Run the double word and check bits through the Hamming circuitry ECC2 13;
3) Merge at 14 with the data from the CPU; i.e., replace the second byte of the data with the new data from the CPU;
4) Generate the check bits by Hamming circuitry ECC1 15; and
5) Write the result (the new double word and associated check bits) back to the cache array 12.
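The five steps above can be sketched in software as a read-modify-write sequence. This is an illustration under stated assumptions, not a description of the circuitry of FIG. 1: the cache array is modelled as a dictionary, the 72/64 code is an assumed Hamming layout with an overall parity bit, byte index 0 is taken to be the least significant byte, and all function names are hypothetical.

```python
# Assumed codeword positions of the 64 data bits: the non-power-of-two
# positions 3, 5, 6, 7, 9, ... up to 71.
DATA_POS = [p for p in range(1, 72) if p & (p - 1)]


def ecc1_generate(word):
    """ECC1 (sketch): 7 Hamming check bits plus overall parity in bit 7."""
    h = 0
    for i, pos in enumerate(DATA_POS):
        if (word >> i) & 1:
            h ^= pos
    parity = (bin(word).count("1") + bin(h).count("1")) & 1
    return h | (parity << 7)


def ecc2_correct(word, check):
    """ECC2 (sketch): return (possibly corrected word, status flag)."""
    syndrome = (ecc1_generate(word) ^ check) & 0x7F
    overall = (bin(word).count("1") + bin(check).count("1")) & 1
    if syndrome == 0 and overall == 0:
        return word, "ok"
    if overall == 1:
        if syndrome in DATA_POS:                  # error is in a data bit
            word ^= 1 << DATA_POS.index(syndrome)
        return word, "single"                     # else a check bit was hit
    return word, "double"                         # detected, not correctable


def store_byte(cache, addr, byte_index, new_byte):
    word, check = cache[addr]                     # 1) read double word and check bits
    word, status = ecc2_correct(word, check)      # 2) run through ECC2
    shift = 8 * byte_index
    word = (word & ~(0xFF << shift)) | (new_byte << shift)  # 3) merge CPU byte
    check = ecc1_generate(word)                   # 4) generate new check bits (ECC1)
    cache[addr] = (word, check)                   # 5) write back to the array
    return status
```

Every step in `store_byte` depends on the result of the step before it, which mirrors the serial dependence of the array read, ECC2, merge, ECC1 and array write in the data flow described above.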
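The problem with this implementation is that the critical path for the STORE operation becomes very long, since the array read, ECC2, merge, ECC1 and array write must all be performed in sequence, and the resulting increase in cycle time is not acceptable.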
An example of an error correction technique known in the prior art is disclosed by H. T. Olnowich et al. in "Error Correction Technique Which Increases Memory Bandwidth and Reduces Access Penalties", IBM Technical Disclosure Bulletin, vol. 31, no. 3, August 1988, pp. 146 to 149. This technique, however, requires a dual redundant memory implementation and is intended to replace more costly, high-speed memory (static random access memory chips or SRAMs) with two banks of lower performance memory (dynamic random access memory chips or DRAMs). It is therefore not suitable for high performance cache applications.
F. Tsui in "Memory Arrangement and Operation for Error Correction Without Cycle-time Prolongation", IBM Technical Disclosure Bulletin, vol. 16, no. 10, March 1974, pp. 3280, 3281, discloses a scheme in which a memory array for data bits is separated from a memory array for check bits, the two arrays being driven separately with the latter array lagging the former by a time delay. On a read operation, the data is assumed to be correct, and if an error is detected, the operation initiated with the uncorrected data is recalled and a new operation is started using the corrected data. This scheme requires separate check bits for error correction on the minimum unit of the STORE operation. Since most state-of-the-art architectures for digital systems (e.g., IBM S/370, DEC VAX, Intel 80386, etc.) have a byte as the minimum unit of the STORE operation, the overhead of this approach is very large. It is therefore not practical.
U.S. Pat. No. 4,748,627 to Ohsawa describes a memory system with an error correction function. The objective of this system is to avoid the accumulation of errors in a dynamic random access memory (DRAM), and while this is important, it addresses a completely different problem from that solved by the subject invention. U.S. Pat. No. 4,672,614 to Yoshida proposes to use two sets of address buffers. More specifically, the memory is provided with a pair of row address buffers which can operate independently. When an error correcting operation is performed for the data related to the address contents of one of the buffers, the access operation of the data cell array is conducted by the other of the buffers, thereby enabling the memory to carry out parts of successive read-out operations simultaneously.