Computer memories typically utilize dynamic random access memories (DRAMs) for the storage of data that is used by the CPU of the computer. This data is subject to corruption from many possible causes. Memory systems have accordingly been refined and developed so that the occurrence of random errors is relatively infrequent. However, in many applications the integrity of the data is so critical that even relatively infrequent errors cannot be tolerated. Various error detection and correction codes have been developed to detect, and in certain situations correct, such errors. The best known is the Hamming code, which typically provides detection of one-bit and two-bit errors and correction of one-bit errors. Commercial circuits are available which can implement such a Hamming code in computer memory systems. The technology of the Hamming code is well known and is widely described in the literature in this field of technology.
The Hamming code is quite effective in conventional memory systems in which each memory device supplies one bit of a given data word. Large memory systems typically include hundreds, or even thousands, of individual memory devices, each of which may contain, for example, 1 megabit, 4 megabits or 16 megabits of data. DRAM devices have conventionally had a one-bit output per device, such that a data word is distributed across the same number of memory devices as there are bits in the word. Randomly occurring errors, generally referred to as soft errors, are most frequently one-bit errors. Further, if a particular memory device were to fail, either permanently or transiently, the result would still be only a one-bit error in the data word. Therefore, by use of a Hamming code, such a one-bit error can be detected and corrected. A Hamming code can typically detect, but not correct, a two-bit error. It is, however, very difficult to economically detect and correct a greater number of bit errors. It has heretofore been the accepted position that the probability of multi-bit errors under these circumstances is so remote that it could be safely ignored.
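By way of illustration only, the single-error-correcting, double-error-detecting behavior described above can be sketched with a small extended Hamming (8,4) code. This is a minimal sketch and not the circuitry of any commercial device: the 4-bit data width, the bit layout, and the names `encode`, `decode` and `DATA_POSITIONS` are assumptions chosen for brevity.

```python
# Illustrative sketch only: an extended Hamming (8,4) code.
# Bit layout and all names are hypothetical, chosen for this example.

DATA_POSITIONS = (3, 5, 6, 7)  # positions 1, 2, 4 hold check bits; 0 holds overall parity


def encode(nibble):
    """Encode a 4-bit value as a list of 8 bits (list index = bit position)."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    for p in (1, 2, 4):
        # Check bit p makes the parity of all positions whose index contains p even.
        code[p] = sum(code[pos] for pos in range(1, 8) if pos & p and pos != p) % 2
    code[0] = sum(code[1:]) % 2  # overall parity over positions 1..7
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error cannot be corrected."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    overall_odd = sum(code) % 2 == 1
    if overall_odd:
        # Odd overall parity: treated as a single-bit error at position `syndrome`
        # (a syndrome of 0 means the overall parity bit itself was flipped).
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even overall parity with a nonzero syndrome: two bits are in error.
        return "double_error", None
    else:
        status = "ok"
    nibble = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, nibble
```

In this sketch, flipping any one bit of `encode(n)` causes `decode` to return `("corrected", n)`, and flipping any two bits causes it to return `("double_error", None)`. Errors of three or more bits fall outside what the code guarantees and may be miscorrected or missed.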
With the introduction of much larger main memory capacities, such as several hundred megabytes for a single computer main memory, it has become necessary to utilize memory circuits which output multiple bits from each device, rather than only a single bit. Typical DRAM devices of this type can provide a four-bit parallel output for device sizes such as one and four megabits. It is anticipated that this practice will continue, and that devices may have even wider outputs as memory circuits increase in capacity to 16 and 64 megabits. However, when a single memory device supplies four bits of a single data word, a failure of that particular memory device is much more likely to produce a multi-bit error.
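The effect of output width on the number of corrupted bits can be illustrated with a short sketch. It assumes, purely for illustration, that device k supplies the contiguous bits k*w through k*w + w - 1 of the data word, where w is the output width, and that a failed device reads back as all zeros; the function names are hypothetical.

```python
# Illustrative sketch only: how many bits of a data word are corrupted when
# one memory device fails. The contiguous bit-to-device mapping and the
# stuck-at-zero failure model are assumptions made for this example.


def read_with_failed_device(word, failed_device, bits_per_device):
    """Return the word as read when the given device outputs all zeros."""
    mask = ((1 << bits_per_device) - 1) << (failed_device * bits_per_device)
    return word & ~mask


def error_bit_count(word, failed_device, bits_per_device):
    """Count the bits that differ between the stored word and the word read."""
    corrupted = read_with_failed_device(word, failed_device, bits_per_device)
    return bin(word ^ corrupted).count("1")


stored = 0xFFFFFFFF
one_bit_device = error_bit_count(stored, failed_device=3, bits_per_device=1)   # 1
four_bit_device = error_bit_count(stored, failed_device=3, bits_per_device=4)  # 4
```

Under these assumptions a failed one-bit device corrupts at most one bit of the word, which a Hamming code can correct, whereas a failed four-bit device can corrupt up to four bits of the same word.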
Certain applications of computer systems have been developed which require an extremely high level of data integrity. In such a computer system, undetected errors, even at a rate of only one or two per year, could have serious consequences. In view of the greater likelihood of encountering multi-bit errors due to the use of multi-bit output memory devices, and the increased criticality of data integrity, there exists a need for a method and apparatus to detect the occurrence of multi-bit errors in data which is stored in such a computer memory.