1. Field of the Invention
This invention relates to error detection methods for memory arrays and, more specifically, to a system for parity-based detection of even numbers and odd numbers of address bit errors and memory data errors.
2. Discussion of the Related Art
As data is transferred through a data processing machine, hardware failures, including device failures and connection failures, can cause data errors. For instance, where a single wire connection fails in a 16-bit data bus, the data passed over the bus will have a single bit error from the failed connection about half of the time. Also, in a storage device, failure of one or more bits can introduce data errors when data is first written into the storage device and then read from the storage device. Again, a single bit locked at one value introduces a data error only about half of the time, so the fault is exposed to error detection only about half of the time.
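The "about half of the time" figure can be checked by enumeration. The following is a minimal sketch, assuming uniformly distributed data words and a bus line stuck at a fixed logic level; the function names and the choice of line 5 are hypothetical and serve only as illustration.

```python
def stuck_at(word: int, bit: int, level: int) -> int:
    """Return the word as seen through a bus whose line `bit` is stuck at `level`."""
    mask = 1 << bit
    return (word | mask) if level else (word & ~mask)


def corrupted_fraction(width: int, bit: int, level: int) -> float:
    """Fraction of all `width`-bit words altered by the stuck line."""
    total = 1 << width
    bad = sum(1 for w in range(total) if stuck_at(w, bit, level) != w)
    return bad / total


# Assumption: line 5 of a 16-bit bus is stuck at 0.
print(corrupted_fraction(16, 5, 0))  # 0.5
```

Only words that would have driven the failed line to the opposite level are corrupted, which is exactly half of all possible words under the uniform-data assumption.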
A common method for detecting data errors involves computation of the parity of a data word. This single parity bit is then transmitted and stored together with the associated word. When the word is received, the parity is recomputed and compared with the accompanying parity bit. Parity disagreement alerts the data processing system to the presence of an odd number of bit errors. Unfortunately, simple parity-check error detection cannot detect even numbers of bit errors.
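The limitation of simple parity checking can be illustrated as follows. This is a minimal sketch, assuming an even-parity convention and an arbitrary 16-bit example word; the function names are hypothetical.

```python
def parity(word: int) -> int:
    """Parity bit of a word: 1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") & 1


def error_detected(received_word: int, stored_parity: int) -> bool:
    """True when the recomputed parity disagrees with the accompanying parity bit."""
    return parity(received_word) != stored_parity


word = 0b1011_0010_1100_0101   # arbitrary example word
p = parity(word)               # parity bit stored or sent with the word

one_error = word ^ 0b01        # one bit flipped
two_errors = word ^ 0b11       # two bits flipped

print(error_detected(word, p))        # False
print(error_detected(one_error, p))   # True
print(error_detected(two_errors, p))  # False: the even error count goes unnoticed
```

Each flipped bit toggles the parity once, so any even number of flips restores the original parity and escapes detection.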
Practitioners in the art have proposed various means for improving error detection and error correction over the simple parity-check model. The effective methods among these tend to be expensive and complex.
In U.S. Pat. No. 4,799,222, George J. Barlow et al disclose a method for transforming address words and transferring addresses containing "integrity bits". Barlow et al teach a method and apparatus for verifying that address incrementing and transfer are performed without error by logically combining the incremented address with the transformed bits and integrity bits of the unincremented address. Their technique cannot be implemented using a single parity bit extension to the address word.
In U.S. Pat. No. 4,785,452, Bruce R. Petz et al disclose an error detection scheme using variable field parity checking that relies on availability of reserved bits within control words. These variable numbers of extra error correction code bits are used to increase error detection capability in a control store for words having such extra bits available. The technique relies on availability of extra unused bits appended to the data word.
Other practitioners combine error detection with error correction and have long sought simple, inexpensive means for performing both. For instance, U.S. Pat. No. 4,417,339 issued to Robert G. Cantarella discloses a fault-tolerant correction circuit that relies on a modified Hamming code. Cantarella's method corrects single bit errors and detects double bit errors and tolerates single parity-check subcircuit failures. Cantarella requires a number of "syndrome" bits to ensure fault-free parity subcircuit function.
U.S. Pat. No. 4,651,321 issued to Gary A. Woffinden et al discloses a method for reducing storage space requirements for error detection and correction in memory. Woffinden et al create an error checking and correcting code that incorporates the parity bits normally stored with the data word together with additional ECC bits. Their method requires more ECC bits than are normally expected in the art.
U.S. Pat. No. 4,345,328 issued to Gary D. White discloses a method for single-bit error correction and double-bit error detection using through-checking parity bits. The parity bits are appended to each byte as check bits. Additional check bits are required to perform single-bit error correction. Thus, White's method requires a significant plurality of extra bits for each data word.
T. L. Mann ("Error Correction Codes with Address Checking", IBM Technical Disclosure Bulletin, Vol. 32, No. 1, pp. 375-377, June 1989) suggests using residual ECC bits to protect address words from undetected errors, thus providing some assurance that the retrieved data word comes from the desired storage device location. Mann teaches the incorporation of ECC in both the data word and the address word, but his Hamming code error correction technique requires a substantial number of additional bits.
G. N. Martin et al ("Preventing Address Misrecognition", IBM Technical Disclosure Bulletin, Vol. 27, No. 8, p. 4965, January 1985) suggest a simple technique for adding address redundancy to protect against address errors in storage devices using removable storage media. Martin et al suggest verifying an address identifier before either a read or write operation. They do not provide implementation details for their suggestion nor do they suggest how to overcome the many associated problems.
Thus, despite many efforts by practitioners in the art, until now there has been no simple method known for error detection in both address and data words in a storage device memory controller. A suitable method must consider the connections between a memory controller module and memory modules. These include the data bus, the address bus, the memory parity line, the write enable line and the output enable line. A memory parity bit is normally generated during the write operation and checked following the read operation within the memory controller. Detection of address bus errors and data bus errors requires a memory parity bit to be generated by a device having inputs that are identical to the address bus and data bus off-chip drivers in the memory controller module. Such a detection scheme works well only for odd numbers of address line errors. Even numbers of address bus bit errors are not detected in the present art.
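The behavior of a single memory parity bit generated over both the address and the data can be sketched as follows. This is a simplified model under stated assumptions, not the controller of any particular system: the `Memory` class, the `bus_error_mask` parameter, and the example addresses and data values are all hypothetical.

```python
def parity(word: int) -> int:
    """1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") & 1


class Memory:
    """Toy memory whose parity bit covers the address and the data together."""

    def __init__(self):
        self.cells = {}

    def write(self, addr: int, data: int) -> None:
        # Parity bit generated during the write operation.
        self.cells[addr] = (data, parity(addr) ^ parity(data))

    def read(self, intended_addr: int, bus_error_mask: int = 0):
        # Address bus faults redirect the access to a different location.
        actual_addr = intended_addr ^ bus_error_mask
        data, stored_parity = self.cells[actual_addr]
        # Parity is checked against the address the controller intended.
        error = (parity(intended_addr) ^ parity(data)) != stored_parity
        return data, error


mem = Memory()
mem.write(0x0010, 0xAAAA)
mem.write(0x0011, 0x5555)  # one address bit away from 0x0010
mem.write(0x0013, 0x5555)  # two address bits away from 0x0010

print(mem.read(0x0010))                        # (43690, False): correct data
print(mem.read(0x0010, bus_error_mask=0b01))   # (21845, True): odd error detected
print(mem.read(0x0010, bus_error_mask=0b11))   # (21845, False): even error missed
```

In the last read the wrong word is returned with a consistent parity bit, which is the undetected case the paragraph above describes.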
The problem involving failure to detect address bus bit errors is exacerbated for the newer Dynamic Random Access Memory (DRAM) technology because the DRAM address bus carries the higher order address bits (row address) and the lower order address bits (column address) at two separate times. Because the same faulty line carries one bit of the row address and one bit of the column address, a single address bus fault may now appear as a double-bit error in the complete address.
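The effect of address multiplexing can be sketched as follows. The 8-bit row and column widths, the stuck-at-0 fault on line 2, and the example address are assumptions chosen for illustration; the function names are hypothetical.

```python
ROW_COL_BITS = 8  # assumed width of the multiplexed address bus


def parity(word: int) -> int:
    """1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") & 1


def drive_multiplexed(address: int, stuck_line: int, level: int) -> int:
    """Return the address as seen by the DRAM when one bus line is stuck."""
    mask = (1 << ROW_COL_BITS) - 1
    bit = 1 << stuck_line

    def through_bus(value: int) -> int:
        return (value | bit) if level else (value & ~bit)

    row = through_bus((address >> ROW_COL_BITS) & mask)  # first transfer
    col = through_bus(address & mask)                    # second transfer
    return (row << ROW_COL_BITS) | col


intended = 0x2424                        # bit 2 is set in both row and column
seen = drive_multiplexed(intended, stuck_line=2, level=0)
flipped = bin(intended ^ seen).count("1")

print(hex(seen))                         # 0x2020
print(flipped)                           # 2
print(parity(intended) == parity(seen))  # True: address parity is unchanged
```

The double-bit error arises only when the stuck line disagrees with the intended level in both transfers; when it disagrees in just one of them, the result is a single-bit error that parity does detect.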
Another potential problem is related to the data allocation map used to track data storage locations within a memory system. A mismatch between a record identification number and the memory address stored in the allocation map can retrieve the wrong data word without notice. For example, suppose that record identification number A is assigned to memory address B in the data allocation map when the write (store) operation occurs. Subsequent misassignment of record identification number A to memory address X in the data allocation map retrieves the wrong data in a read operation. Even a memory system in which the data is protected by some ECC/CRC technique cannot detect such data allocation map errors, because the word stored at address X is itself valid. Such an error is denominated herein as a global data error.
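The global data error can be sketched with a toy model. A single parity bit stands in here for the ECC/CRC protection, and the record identifier, addresses, and data values are hypothetical.

```python
def parity(word: int) -> int:
    """1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") & 1


memory = {}          # address -> (data, check bit)
allocation_map = {}  # record identification number -> address


def write(addr: int, data: int) -> None:
    memory[addr] = (data, parity(data))


def read(addr: int):
    """Return the stored data and whether its check bit is consistent."""
    data, check = memory[addr]
    return data, parity(data) == check


# Record A is written at address B.
ADDR_B, ADDR_X = 0x0B00, 0x0C00
write(ADDR_B, 0xCAFE)
allocation_map["A"] = ADDR_B

# An unrelated, valid word lives at address X.
write(ADDR_X, 0x1234)

# The allocation map entry for record A is later corrupted to point at X.
allocation_map["A"] = ADDR_X

data, check_ok = read(allocation_map["A"])
print(hex(data), check_ok)  # 0x1234 True: wrong record, no error reported
```

The check passes because it covers only the word actually fetched and carries no information about which record that word belongs to.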
These unresolved problems and deficiencies are clearly felt in the art and are solved by this invention in the manner described below.