1. Field of the Invention
The present invention relates to a block code for efficiently correcting adjacent data and/or check bit errors. In particular, the present invention relates to a block code for avionics systems, where the block code corrects for adjacent or almost-adjacent errors that may be caused, for example, by the neutron single event upset (NSEU) problem that may occur at high altitudes.
2. Description of the Related Art
Error detection and correction techniques can provide efficient recovery of data that may have been corrupted for various reasons. For example, data can be corrupted by perturbations in the communications channel through which the data was sent from a source to a destination, and by errors related to the storing of data in memories.
Error detection or identification can be performed in many ways, all of them involving some form of coding. See, for example, S. Lin and D. Costello, Jr., “Error Control Coding: Fundamentals and Applications”, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1983 (Pages 1-5; 85-116). The simplest form of error detection is an added parity bit. Multiple parity bits can not only detect that an error has occurred, but also identify which bits have been inverted. These bits can be re-inverted to restore the original data. Additional parity bits increase the chance that multiple errors will be detected and corrected.
Engineers and designers use error-detection and error-correction to provide reliable data transmission. Typical examples of such techniques are: parity checking, cyclic redundancy check (CRC), and Hamming codes.
Parity checking is a conventional error-detection technique. Each unit of data is written with an extra single “parity” bit. An even number of 1's (or 0's) are written in the data and parity bits, to achieve “even” parity. Alternatively, an odd number of 1's (or 0's) are written in the data and parity bits, to achieve “odd” parity. Parity checking provides a minimal overhead solution to provide error detection for semiconductor memory. There are many variations of this technique, e.g., even-parity, odd-parity, and horizontal-and-vertical parity checking. Parity checking does not provide error correction, however.
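By way of illustration only, single-bit parity checking may be sketched as follows. The function names parity_bit and check_parity and the four-bit data word are hypothetical and are chosen only for this example.

```python
def parity_bit(data_bits, even=True):
    """Return the bit that gives data + parity an even (or odd) count of 1's."""
    ones = sum(data_bits) % 2
    return ones if even else ones ^ 1


def check_parity(word_bits, even=True):
    """Return True when the stored word still has the expected parity."""
    return sum(word_bits) % 2 == (0 if even else 1)


data = [1, 0, 1, 1]
word = data + [parity_bit(data)]   # even parity: the word now holds four 1's

single = list(word)
single[2] ^= 1                     # one inverted bit changes the parity

double = list(word)
double[0] ^= 1
double[1] ^= 1                     # two inverted bits restore the parity

print(check_parity(word), check_parity(single), check_parity(double))
```

In this sketch the single inverted bit is detected, while the two inverted bits pass the check unnoticed, and in neither case does the parity bit indicate which bit is in error.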
Check summing provides a value computed from a group of data. After transmission of the data, the check sum can be re-computed and compared to the transmitted, original check sum value. See, for example, D. P. Siewiorek and R. S. Swarz, “The Theory and Practice of Reliable System Design”, Digital Press, 1982 (Pages 94-98). A mismatch indicates that an error has occurred in the transmitted data. Parity checking provides error detection of each data word. Instead of validating each individual data word, check sums only indicate an error in the entire transmitted or accessed data.
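A minimal sketch of check summing is given below, assuming a simple additive check sum taken modulo 256. The modulus and the function name checksum are assumptions made for this example; the cited reference describes several check sum variants.

```python
def checksum(words, modulus=256):
    """Sum all data words and keep only the low-order part as the check value."""
    return sum(words) % modulus


block = [0x12, 0x34, 0x56]
sent_sum = checksum(block)                 # transmitted along with the block

received = [0x12, 0x35, 0x56]              # one word corrupted in transit
error_detected = checksum(received) != sent_sum

compensating = [0x13, 0x33, 0x56]          # two errors that cancel in the sum
error_missed = checksum(compensating) == sent_sum
```

The mismatch shows only that the block as a whole is in error, not which word is affected, and errors that cancel in the sum are not detected at all.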
Another way to check for bit errors is CRC, which is more appropriate for burst errors. In a cyclic code, cyclically shifting a code word produces another code word. These codes are implemented, in circuitry or in software, using linear feedback shift registers (exclusive-or gates and memory elements). Cyclic codes detect all single errors in a code word, burst errors (multiple adjacent faults), and other error patterns. These codes are typically used in sequential-access devices and data transmission links. Details on cyclic codes may be found in the S. Lin et al. reference, discussed above.
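For illustration, a software model of a linear feedback shift register computing an 8-bit CRC is sketched below. The generator polynomial x^8 + x^2 + x + 1 (0x07), the zero initial register value, and the function name crc8 are assumptions chosen for this example and are not taken from the cited reference.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bit-serial CRC: shift the register left and XOR in the polynomial on feedback."""
    reg = 0
    for byte in data:
        reg ^= byte
        for _ in range(8):
            feedback = reg & 0x80          # bit about to be shifted out
            reg = (reg << 1) & 0xFF
            if feedback:
                reg ^= poly
    return reg


message = b"123456789"
frame = message + bytes([crc8(message)])   # append the check byte

corrupted = bytearray(frame)
corrupted[3] ^= 0b00011100                 # burst of three adjacent inverted bits

print(crc8(frame) == 0, crc8(corrupted) == 0)
```

Under these assumptions, an uncorrupted frame leaves a zero remainder in the register, while the three-bit burst leaves a non-zero remainder and is therefore detected.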
Hamming is a well-known name in the field of data encoding and decoding. See, for example, the D. P. Siewiorek and R. S. Swarz reference, pages 122-133. The Hamming code system provides an easy way to identify and correct errors in data bits. To encode binary data, redundant bits are added to produce a longer word. These check bits are combined with the data bits and are considered an integral part of the word. As such, either data bits or parity check bits, or both, can be in error in the transmission process.
The term “Hamming distance” refers to the number of bit positions in which two codewords differ, that is, the number of single-bit errors needed to turn one codeword into the other. To detect d bit errors, a code needs a minimum Hamming distance of d+1. To correct d bit errors, a code needs a minimum Hamming distance of 2d+1. A code with a single parity bit has a minimum Hamming distance of 2, and can therefore detect 1-bit errors.
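The distance relationships above may be checked numerically with a short sketch such as the following. The function names and the two small example codes (a two-data-bit even-parity code and a three-bit repetition code) are hypothetical and serve only to illustrate the d+1 and 2d+1 rules.

```python
from itertools import combinations


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


def minimum_distance(codewords):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


def detectable_errors(d_min: int) -> int:
    return d_min - 1            # detecting d errors needs distance d + 1


def correctable_errors(d_min: int) -> int:
    return (d_min - 1) // 2     # correcting d errors needs distance 2d + 1


parity_code = [0b000, 0b011, 0b101, 0b110]   # two data bits plus even parity
repetition_code = [0b000, 0b111]             # one data bit sent three times

for code in (parity_code, repetition_code):
    d = minimum_distance(code)
    print(d, detectable_errors(d), correctable_errors(d))
```

The parity code has a minimum distance of 2 and so detects one error but corrects none, while the repetition code has a minimum distance of 3 and so can correct one error.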
FIG. 1a shows the parity-check matrix for a (7,3) Hamming Single Error Correcting (SEC) code, where the value 7 signifies the total code word size, and the value 3 signifies the number of check bits in the code word. This parity check matrix was originally proposed by Hamming in the 1950's, as given in the above-mentioned article. Circuitry decodes a seven-bit word by forming the dot product with the matrix. FIG. 1b shows the resulting exclusive-or (XOR) parity-tree equations for computing the syndrome bits S0, S1, S2, and FIG. 1c shows the relationship of the check bits c0 through c2 and the data bits d0 through d3. With the seven-bit word, the matrix product of the (3×7) block code matrix and the (7×1) codeword produces a three-bit vector called the syndrome. If the syndrome contains all 0s, no detectable error is present. If a single bit error exists in the word of data, the syndrome value will match the check-matrix column corresponding to the bit in error.
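The syndrome computation may be sketched as follows. Because FIG. 1a is not reproduced here, the matrix below is an assumption: it uses the classic Hamming arrangement in which column i holds the binary value of i, with the least significant bit in row S0, so that the word is ordered c0, c1, d0, c2, d1, d2, d3. All function names are hypothetical.

```python
# Assumed parity-check matrix (not copied from FIG. 1a): column i is binary i.
H = [
    [1, 0, 1, 0, 1, 0, 1],  # S0
    [0, 1, 1, 0, 0, 1, 1],  # S1
    [0, 0, 0, 1, 1, 1, 1],  # S2
]
NAMES = ["c0", "c1", "d0", "c2", "d1", "d2", "d3"]
COLUMNS = list(zip(*H))     # one (S0, S1, S2) tuple per bit position


def encode(d0, d1, d2, d3):
    """Compute check bits so that every row of H has even parity over the word."""
    c0 = d0 ^ d1 ^ d3
    c1 = d0 ^ d2 ^ d3
    c2 = d1 ^ d2 ^ d3
    return [c0, c1, d0, c2, d1, d2, d3]


def syndrome(word):
    """Matrix product of H and the word, taken modulo 2."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)


def correct_single(word):
    """Flip the bit whose column matches the syndrome; return (word, bit name)."""
    s = syndrome(word)
    if s == (0, 0, 0):
        return list(word), None
    position = COLUMNS.index(s)
    fixed = list(word)
    fixed[position] ^= 1
    return fixed, NAMES[position]


word = encode(1, 0, 1, 1)
received = list(word)
received[5] ^= 1                      # single-bit error in d2
fixed, flipped = correct_single(received)
print(syndrome(received), flipped, fixed == word)
```

With this assumed matrix, the error in d2 yields a syndrome equal to the d2 column, and re-inverting that bit restores the original word.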
The SEC Hamming matrix does not correct or detect multiple bit errors. As an example of a two-bit error, consider d1 and d2 being invalid. Using the equations of FIG. 1b, the syndrome is calculated to be equal to (1,1,0). This matches the third column of the matrix, and incorrectly indicates that d0 is invalid.
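The miscorrection described above can be reproduced with the following sketch. The matrix is an assumed classic Hamming arrangement (column i holds the binary value of i, word ordered c0, c1, d0, c2, d1, d2, d3), which is consistent with the syndrome and column cited in the text but is not copied from FIG. 1a. The function name sec_decode is hypothetical.

```python
# Assumed parity-check matrix, consistent with the example in the text.
H = [
    [1, 0, 1, 0, 1, 0, 1],  # S0
    [0, 1, 1, 0, 0, 1, 1],  # S1
    [0, 0, 0, 1, 1, 1, 1],  # S2
]
NAMES = ["c0", "c1", "d0", "c2", "d1", "d2", "d3"]
COLUMNS = list(zip(*H))


def syndrome(word):
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)


def sec_decode(word):
    """Decode as a single-error corrector would: flip the bit the syndrome names."""
    s = syndrome(word)
    fixed = list(word)
    blamed = None
    if s != (0, 0, 0):
        position = COLUMNS.index(s)
        fixed[position] ^= 1
        blamed = NAMES[position]
    return s, blamed, fixed


codeword = [0, 1, 1, 0, 0, 1, 1]      # a valid codeword (syndrome is all zeros)
received = list(codeword)
received[4] ^= 1                      # d1 in error
received[5] ^= 1                      # d2 in error

s, blamed, decoded = sec_decode(received)
wrong_bits = sum(a != b for a, b in zip(codeword, decoded))
print(s, blamed, wrong_bits)
```

Under these assumptions the syndrome is (1, 1, 0), the decoder inverts d0, and the decoded word ends up with three wrong bits instead of two.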
As is well known to those skilled in the art, semiconductor memories are being made with increasing memory densities, in order to obtain larger memory capacities using the same semiconductor substrate area. Innovations in semiconductor memory fabrication have allowed for such increased memory densities. While this is of course a good feature in general, those in the avionics and space industry have discovered a detrimental side effect to increasing memory densities. At high altitudes without a protective atmosphere layer, energetic particles cause discharges in semiconductor memories. See, for example, S. Buol, “Neutron-Induced Single Event Upset or NSEU”, Rockwell Collins Presentation, GenAv FCC Hardware Engineering, April, 1998; J. Olsen et al., “Neutron-Induced Single Event Upsets in Static RAMS Observed in 10 KM Flight Altitude”, IEEE Transactions on Nuclear Science, Vol. 40, no. 2, April, 1993. With decreasing lithography sizes, these discharges are causing multiple bit failures. These failures will increase with the introduction of denser commodity memory parts in avionics products.
Thus, there is a need for an efficient error detection and correction scheme. Further, there is a need for an error correction scheme that is not too complex and that can handle errors that occur due to NSEU events.
It is an object of the present invention to provide an error correction scheme that corrects physically localized errors like those that occur due to NSEU events, where those errors occur in adjacent or almost-adjacent locations of a code word.
The above-mentioned object and other advantages may be obtained by a first method of error correction. That method includes a step of providing a block code matrix such that an exclusive-or value of any two adjacent columns of the block code matrix and corresponding values representative of data disposed in each column are unique with respect to each other. That method also includes a step of computing a syndrome of a codeword using the block code matrix, and correcting for any single errors and any double-adjacent errors based on the computed syndrome.
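One possible reading of the uniqueness condition of the first method is sketched below: every column value and every exclusive-or of two adjacent columns must be non-zero and distinct from all the others. Each column is written as an integer whose binary digits are the bits of that column. The function names and the six-column, four-row candidate matrix are hypothetical examples and are not taken from the figures.

```python
def adjacent_pair_syndromes(columns):
    """XOR of each pair of neighbouring columns (syndrome of a double-adjacent error)."""
    return [a ^ b for a, b in zip(columns, columns[1:])]


def corrects_single_and_double_adjacent(columns):
    """True when all single and double-adjacent error syndromes are non-zero and unique."""
    values = list(columns) + adjacent_pair_syndromes(columns)
    return 0 not in values and len(set(values)) == len(values)


classic_hamming = [1, 2, 3, 4, 5, 6, 7]   # columns 1 and 2 XOR to 3, another column
candidate = [1, 2, 4, 8, 7, 14]           # hypothetical 4-row, 6-column matrix

print(corrects_single_and_double_adjacent(classic_hamming))
print(corrects_single_and_double_adjacent(candidate))
```

In this sketch the classic single error correcting arrangement fails the test, because a double-adjacent error produces the same syndrome as a single error elsewhere, whereas the candidate matrix gives each of its eleven error patterns a distinct syndrome.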
The above-mentioned objects and other advantages may also be obtained by a second method of error correction. That method includes a step of providing a block code matrix wherein an exclusive-or value of any two adjacent columns of the block code matrix, an exclusive-or value of any three adjacent columns, and a value corresponding to a numeric representation of each column are unique with respect to each other. That method also includes a step of computing a syndrome of a codeword using the block code matrix, and correcting for any single errors, any double-adjacent errors, and any triple-adjacent errors based on the computed syndrome.
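A minimal sketch of syndrome-based correction of single, double-adjacent, and triple-adjacent errors follows. It assumes that each column is stored as an integer and that a lookup table maps each syndrome to the run of bit positions to invert. The seven-column, five-row matrix and all function names are hypothetical and are not taken from the figures.

```python
from functools import reduce
from operator import xor


def burst_table(columns, max_burst):
    """Map each syndrome to the adjacent bit positions that produce it."""
    table = {}
    for length in range(1, max_burst + 1):
        for start in range(len(columns) - length + 1):
            s = reduce(xor, columns[start:start + length])
            if s == 0 or s in table:
                raise ValueError("syndromes are not unique for this matrix")
            table[s] = tuple(range(start, start + length))
    return table


def syndrome(word, columns):
    """XOR together the columns selected by the 1 bits of the word."""
    return reduce(xor, (c for bit, c in zip(word, columns) if bit), 0)


def correct(word, columns, table):
    """Invert the bits named by the syndrome; an all-zero syndrome means no change."""
    s = syndrome(word, columns)
    if s == 0:
        return list(word)
    if s not in table:
        raise ValueError("uncorrectable error pattern")
    fixed = list(word)
    for position in table[s]:
        fixed[position] ^= 1
    return fixed


columns = [1, 2, 4, 8, 16, 5, 10]     # hypothetical 5-row, 7-column matrix
table = burst_table(columns, max_burst=3)

codeword = [1, 0, 1, 0, 0, 1, 0]      # columns 1, 4 and 5 XOR to zero
received = list(codeword)
for position in (3, 4, 5):            # triple-adjacent upset
    received[position] ^= 1

print(correct(received, columns, table) == codeword)
```

Building the table doubles as the uniqueness test: construction fails with an error whenever two correctable patterns would share a syndrome.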
The above-mentioned objects and other advantages may also be achieved by a program configured to be run on a processor and configured to provide error correction. The program includes a program code unit for computing a block code matrix of a predetermined number of columns and a predetermined number of rows, such that an array of bits in each of the columns is unique with respect to other columns, and such that an exclusive-or value of corresponding bits in each adjacent column is unique with respect to each other and with respect to each of the columns.
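One way such a program code unit might compute a matrix is sketched below as a simple backtracking search. This is an illustrative assumption only and is not asserted to be the search used by the invention. Each column is represented as an integer whose binary digits are the bits of that column, and the function name find_matrix is hypothetical.

```python
def find_matrix(n_cols, n_rows):
    """Search for columns whose values and adjacent-pair XORs are all unique.

    Returns a list of n_cols integers, or None when no such matrix exists.
    """
    limit = 1 << n_rows

    def extend(cols, used):
        if len(cols) == n_cols:
            return cols
        for candidate in range(1, limit):      # zero columns are never allowed
            if candidate in used:
                continue
            new = {candidate}
            if cols:
                pair = cols[-1] ^ candidate    # syndrome of a double-adjacent error
                if pair in used:
                    continue
                new.add(pair)
            result = extend(cols + [candidate], used | new)
            if result is not None:
                return result
        return None

    return extend([], set())


matrix = find_matrix(6, 4)
print(matrix)
print(find_matrix(5, 3))   # nine syndromes needed, only seven non-zero values exist
```

A search of this kind only returns a matrix when the number of rows is large enough to provide a distinct non-zero syndrome for every single and double-adjacent error pattern; otherwise it reports that no matrix exists.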