Several commercially available error detection and correction devices have been developed to protect 32-bit data paths. Examples are a device manufactured by Texas Instruments Inc., designated the TI 74AS634, and a device manufactured by Advanced Micro Devices, Inc., designated the AMD 2960. These prior art devices append seven parity bits to each 32-bit datum to provide single-bit error correction and double-bit error detection (SEC/DED). The combination of data and parity forms a (39,32) systematic code, where 39 bits is the total width required for the expanded data path. This type of protection provides satisfactory coverage for independently occurring random bit errors. However, faults in digital systems can overwhelm this level of protection. For example, when an integrated circuit chip fails, sometimes referred to as a "chip kill", the errors may involve all of the data lines passing through the failed chip.
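The seven-parity-bit figure for a 32-bit datum can be checked with a short calculation. The following is a minimal sketch, assuming the usual extended Hamming construction (a single-error-correcting Hamming code plus one overall parity bit); the function name `secded_check_bits` is hypothetical and is not taken from either device's documentation.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed for SEC/DED over `data_bits` of data.

    Assumes an extended Hamming code: find the smallest r with
    2**r >= data_bits + r + 1 (single-error correction), then add
    one overall parity bit for double-error detection.
    """
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1


if __name__ == "__main__":
    k = 32
    p = secded_check_bits(k)
    # For a 32-bit datum this gives the (39,32) code described above.
    print(f"({k + p},{k}) code uses {p} check bits")
```

Under the same assumption, a 64-bit datum would need eight check bits, giving a (72,64) code.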
Known in the art of error detection and correction are Reed-Solomon codes. These codes are an efficient class of linear codes using multi-bit symbols that are maximum distance separable. Binary-based Reed-Solomon codes use symbols from a finite field of 2^m elements, generally denoted GF(2^m), where m bits represent a field element. Two values of the parameter m are of practical interest for present technology: m = 4, representing nibble symbols, and m = 8, representing byte symbols.
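To make the notion of a field element concrete, the sketch below shows arithmetic on nibble symbols in GF(2^4). It assumes the primitive polynomial x^4 + x + 1 as the field modulus; the text above does not name a polynomial, so this choice, together with the names `gf_add` and `gf_mul`, is illustrative only.

```python
def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^m) is bitwise XOR of the m-bit symbols."""
    return a ^ b


def gf_mul(a: int, b: int, m: int = 4, poly: int = 0b10011) -> int:
    """Multiply two GF(2^m) elements by shift-and-add.

    `poly` is the field modulus; 0b10011 (x^4 + x + 1) is an assumed
    choice for m = 4, not one specified by the source text.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & (1 << m):
            a ^= poly            # reduce modulo the field polynomial
    return result


if __name__ == "__main__":
    # Successive powers of the element x (binary 0010) visit all
    # 15 nonzero nibbles when the modulus is primitive.
    element, seen = 1, []
    for _ in range(15):
        seen.append(element)
        element = gf_mul(element, 2)
    print(sorted(seen) == list(range(1, 16)))
```

For byte symbols the same routine would be called with m = 8 and a degree-8 modulus.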
In U.S. Pat. No. 4,476,562, Oct. 9, 1984, Sako et al. disclose correcting symbols in an audio system where serial data symbols are being processed. A Reed-Solomon code of generic multi-bit symbols (m>2) is employed having a minimum distance of five symbols, permitting double error correction. Interleaving of bits is employed in one embodiment. The system of Sako et al. processes a string or series of data values and not a single 32-bit data value.
In U.S. Pat. No. 4,637,021, Jan. 13, 1987, Shenton discloses a system that uses two levels of Reed-Solomon codes, interleaving their bit positions to obtain a single byte error-correcting code. The technique requires two decoding operations and employs four check bytes. This system is intended for serial data and has a substantial delay due to de-interleaving and two level decoding.
In U.S. Pat. No. 4,683,572, Jul. 28, 1987, Baggen et al. disclose two Reed-Solomon codes, each having a minimum symbol distance of five and capable of correcting two symbol errors. The codes are interleaved to protect optical disk data having a serial format. This coding system uses soft decision flag information in the decoding process as opposed to hard decision information. Two bits of additional information are attached to each eight bit symbol and indicate to the decoder a relative confidence in the symbol being correct. Each basic code is a shortened byte correcting Reed-Solomon code using four check symbols. The equivalent binary view of each code is 128 bits with 96 information bits.
In U.S. Pat. No. 4,730,321, Mar. 8, 1988, Machado discloses a decoder implemented in a dedicated microprocessor. An associated algorithm employs three shortened Reed-Solomon codes with byte-wide symbols. Two different shortened versions of a byte-protecting Reed-Solomon code are involved. Each code has four parity symbols and can correct two symbol errors. The three codes are interleaved to protect data originating from a rotating disk storage system. The implementation emphasizes simple syndrome calculation circuitry which is coupled to the microprocessor. However, because the decoder is microprocessor-based, it operates sequentially. The overall code length after interleaving is 524 symbols.
In U.S. Pat. No. 4,782,490, Nov. 1, 1988, Tenengolts discloses two Reed-Solomon codes employed with interleaving where each code is byte-symbol based. The role of one of the codes is verification of correction performed by the other. The other code is double-byte-correcting and operates on incoming serial data. The basic block size of the data can be changed. Even though the one code is capable of double symbol correction, only single byte error correction is used.
In U.S. Pat. No. 4,868,827, Sep. 19, 1989, Yamada et al. describe a general byte correcting Reed-Solomon code for PCM communication data. The code employed is shortened to 61 symbols which, when viewed over the binary field, is quite long (488 bits). The system handles data serially and employs a standard decoding method. This system is capable of correcting two byte errors in 61 symbols.
In U.S. Pat. No. 4,633,470, Dec. 30, 1986, Welch et al. present an advanced theory of decoding serial data protected by a Reed-Solomon code and describe a method for decoding general Reed-Solomon codes over any field without explicitly calculating the syndromes. Serial data is processed by an iterative algorithm having a variable delay. A significant portion of the decoding algorithm implements the Lagrange interpolation formula from classic mathematical theory. Furthermore, this system employs an iterative process and not a direct calculation of values.
In U.S. Pat. No. 4,371,390, Feb. 1, 1983, D. R. Kim discloses the logging of permanent errors in a memory system for correcting single bit errors. This system operates on parallel bits, not on symbols in parallel.
In a conference paper entitled "A 10 MHz (255,223) Reed-Solomon Decoder", Proc. IEEE 1988 Custom Integrated Circuits Conference, paper 17.6, May 16-19, 1988, Demassieux et al. describe an implementation of a 16 symbol error-correcting decoder for byte-width symbols. The system treats the data serially. The decoder uses standard sequential Euclidean algorithm techniques to find error locations.
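Several of the figures quoted for the references above are tied together by the maximum distance separable property of Reed-Solomon codes, under which the minimum symbol distance is one more than the number of check symbols. The sketch below is illustrative only: the function name `rs_parameters` is hypothetical, and the (16,12) symbol dimensions used for Baggen et al. are inferred from the 128-bit and 96-bit figures quoted above rather than taken from that patent.

```python
def rs_parameters(n_symbols: int, k_symbols: int, m_bits: int) -> dict:
    """Derive basic properties of an (n, k) Reed-Solomon code.

    Relies on the MDS property: d_min = n - k + 1, so the code
    corrects t = (d_min - 1) // 2 symbol errors.
    """
    check_symbols = n_symbols - k_symbols
    d_min = check_symbols + 1
    return {
        "check_symbols": check_symbols,
        "d_min": d_min,
        "correctable_symbols": (d_min - 1) // 2,
        "total_bits": n_symbols * m_bits,
        "info_bits": k_symbols * m_bits,
    }


if __name__ == "__main__":
    # Four check bytes: distance five, two correctable symbols,
    # 128 bits in total with 96 information bits.
    print(rs_parameters(16, 12, 8))
    # The (255,223) code: 32 check bytes, 16 correctable symbols.
    print(rs_parameters(255, 223, 8))
```

The same relation accounts for the distance-five, double-error-correcting codes of Sako et al. and the 488-bit binary length (61 symbols of eight bits) of the code of Yamada et al.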
All of the above-described prior art processes a string or series of data values rather than a single 32-bit data value. These prior art systems also do not encode parity, are unidirectional, and are not internally fault tolerant.
Many of these references deal with the serial use of Reed-Solomon codes and do not have a parallel error-correcting feature wherein all data and parity lines are sensed simultaneously and wherein any error is corrected immediately. As such, these references experience a variable decoding delay due to the sequential nature of their underlying algorithms.
Furthermore, the number of parity positions required by these references is substantial. While additional parity positions may increase the error-correcting or error-detecting capabilities for longer serial data strings, for a 32-bit datum a shortened code that meets the maximum error performance bound for linear codes is preferable.
It is therefore one object of the invention to provide an error detection and correction integrated circuit device that employs a relatively short parity code, as compared to the prior art, while meeting a maximum error performance bound for linear codes for a 32-bit datum.
Also, none of these references teach systems that are internally fault-tolerant. This important attribute ensures that any single subsystem failure in the encoding/decoding device is signalled externally so as to prevent the processing of erroneous data.
It is therefore another object of the invention to provide an error detection and correction integrated circuit device that is internally fault tolerant.