1. Field of the Invention
The invention relates generally to forward error correction of data in digital communications using algebraic coding over finite fields, and particularly to an apparatus and method to reliably extend error detection and correction coding for communicated data using large finite fields.
2. Related Prior Art
A systematic (n, k) error correction code appends r redundant symbols to k data symbols to provide a sequence of (n=k+r) symbols, the r redundant symbols determined by an encoding operation. The ratio k/n is known as the code rate, and the entire sequence of n symbols is known as a codeword. A common type of error correction code is known as a Reed Solomon code. See Polynomial Codes over Certain Finite Fields, Irving S. Reed and Gustave Solomon, Journal of the Society for Industrial and Applied Mathematics (SIAM) 8 (2): pp. 300-304 (1960). A Reed Solomon code may associate a sequence of n codeword symbols with coefficients of a codeword polynomial, and an encoder provides a codeword polynomial that is a multiple of a code generating polynomial. The number of bits per symbol, m, is typically in the range from 8 to 16. Reed Solomon coding operations with m-bit symbols use the addition, multiplication, and division of a finite field with 2^m elements, a finite field (or Galois field) known as GF(2^m).
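As an illustration of the systematic encoding described above, the following Python sketch builds GF(2^8) arithmetic from log and antilog tables and appends r redundant symbols so that the resulting codeword polynomial is a multiple of a generator polynomial. The primitive polynomial 0x11D, the choice of generator roots alpha^0 through alpha^(r-1), the highest-degree-first coefficient order, and all function names are assumptions made for this example; they are not taken from the cited references.

```python
# Minimal sketch of systematic Reed Solomon encoding over GF(2^8).
# Assumption: field built from the primitive polynomial x^8+x^4+x^3+x^2+1 (0x11D).
PRIM = 0x11D
EXP = [0] * 510   # antilog table, doubled so index sums need no modulo
LOG = [0] * 256   # log table (LOG[0] is unused)
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 510):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    """Multiply two field elements; addition in GF(2^m) is XOR."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out

def poly_eval(p, point):
    """Evaluate a polynomial (highest degree first) by Horner's rule."""
    y = 0
    for c in p:
        y = gf_mul(y, point) ^ c
    return y

def generator_poly(r):
    """Assumed generator: g(x) = (x - alpha^0)(x - alpha^1)...(x - alpha^(r-1)), r >= 1."""
    g = [1]
    for i in range(r):
        g = poly_mul(g, [1, EXP[i]])
    return g

def poly_mod(dividend, divisor):
    """Remainder of dividend / divisor; the divisor must be monic."""
    rem = list(dividend)
    for i in range(len(dividend) - len(divisor) + 1):
        coef = rem[i]
        if coef:
            for j in range(1, len(divisor)):
                rem[i + j] ^= gf_mul(divisor[j], coef)
    return rem[-(len(divisor) - 1):]

def encode(data, r):
    """Append r redundant symbols so the codeword polynomial is a multiple of g(x)."""
    parity = poly_mod(list(data) + [0] * r, generator_poly(r))
    return list(data) + parity

codeword = encode([0x12, 0x34, 0x56, 0x78, 0x9A], 4)   # a (9, 5) codeword
```

Because the code is systematic, the first k symbols of `codeword` are the data symbols unchanged, and evaluating the codeword polynomial at each assumed generator root gives zero.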
When the codeword sequence of symbols is transmitted (or stored), one or more symbols received (or recovered from storage) may be corrupted. A first form of corruption known as a symbol erasure occurs when a receiver indicates that a particular symbol in the sequence of received symbols is likely to have an erroneous value. To correct a symbol erasure, a decoder must determine the correct symbol value at the indicated symbol location. A second form of corruption known as a symbol error occurs when a receiver estimates an erroneous value for a particular symbol without indicating a likelihood of error. To correct the symbol error, a decoder must find the location of the erroneous symbol in the sequence of received symbols and determine the correct value of the erroneous symbol.
Error correction codes may also be cross-interleaved to improve correction capability, particularly with bursty errors. See, e.g., U.S. Pat. No. 4,413,340, Error correctable data transmission method, K. Odaka, Y. Sako, I. Iwamoto, T. Doi, and L. B. Vries (1980). In a Compact Disc data storage standard, each symbol is contained in two cross-interleaved Reed Solomon codewords. See Reed-Solomon Code and the Compact Disc, K.A.S. Immink, in Reed-Solomon Codes and Their Applications, S. B. Wicker and V. K. Bhargava, Editors, IEEE (1994). In the CD standard, the data can be thought of as a two-dimensional array, with rows of the array encoded to be row codewords, and columns of the array encoded to be column codewords. A column codeword is a (32, 28) code, which may be partially decoded in a prior art decoder in a first step of a prior art two-step decoding algorithm. In the first step, a typical CD decoder processes the (32, 28) column codewords, decoding to correct any single error or indicate error detection to a second step decoder. The second step uses the column error detection indications and four redundant symbols in row codewords to reliably correct s symbol erasures and t symbol errors provided that (s+2t≦4) in all rows.
A known property of Reed Solomon codes is that two different codewords with r symbols of redundancy differ in at least (r+1) symbols. If a limited number of erasures and errors occur in a received sequence of n symbols corresponding to a transmitted codeword, a decoder is able to resolve the corruptions and correctly estimate the transmitted codeword. In general, if there are s symbol erasures and t symbol errors in a Reed Solomon codeword, a prior art decoder can determine the correct codeword whenever (s+2t≦r).
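The correction bound stated above can be made concrete with a short Python sketch that enumerates the combinations of s erasures and t errors satisfying (s+2t≦r). The function names are hypothetical and chosen only for this example.

```python
def is_correctable(s, t, r):
    """True when s erasures and t errors satisfy the bound s + 2t <= r."""
    return s + 2 * t <= r

def correctable_pairs(r):
    """All (s, t) combinations within the bound for r redundant symbols."""
    return [(s, t)
            for t in range(r // 2 + 1)
            for s in range(r - 2 * t + 1)]

# With r = 4 redundant symbols, list every correctable (erasures, errors) pair.
for s, t in correctable_pairs(4):
    print(f"s={s} erasures, t={t} errors")
```

Each error costs twice as much redundancy as an erasure because the decoder must find both the location and the value of an error, while the location of an erasure is already supplied by the receiver.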
When (s+2t>r), a prior art decoder typically fails to decode the codeword properly. Two kinds of decoder failure are of primary concern. An uncorrectable error is a first kind of decoder failure where a decoder signals that something occurred in decoding indicating that the correct codeword cannot be determined with certainty. In this case, the decoder typically provides an uncorrectable indicator to accompany the estimated codeword in further processing. In cross-interleaved Reed-Solomon codes, for example, a first step column decoder operating on a column codeword may provide an uncorrectable indicator which becomes a useful erasure locator for a second step row decoder operating on a row codeword.
Misdecoding is a second kind of decoder failure where one or more decoded symbols in the codeword are incorrect but the decoder does not detect and/or indicate correction uncertainty. Typically, decoder systems are required to be highly reliable, in that the probability of misdecoding is required to be very small. A higher probability of uncorrectable error is typically allowed, in part because the codeword data may be recoverable through a retransmission or storage recovery routine, if only it is known to be in error.
For example, if the number of erasures, s, is equal to the number of redundant symbols, r, a prior art decoder may always determine an error value at each erased location. However, if any other received symbols in the codeword are incorrect, the decoder misdecodes. This high misdecoding rate is typically unacceptable, and error correction may therefore be limited to the case when s<r.
An example prior art error correction code introduced in the IBM 3370 magnetic disk drive used a Reed Solomon code with three redundant eight-bit symbols per codeword, with each codeword containing approximately 170 symbols of data. Three interleaved codewords provided error correction coding for each 512-byte sector of stored data. The prior art decoder could correct any codeword with a single symbol error, and detect any codeword with two symbol errors as uncorrectable. See Practical Error Correction Design for Engineers, Revised Second Edition, Neal Glover and Trent Dudley, Cirrus Logic, Bloomfield, Colo. (1991), ISBN 0-927239-00-0, pp. 274-275. The prior art decoder operating on a corrupted (173, 170) codeword with three or more symbol errors either indicated an uncorrectable codeword or misdecoded, depending on the error pattern.
An efficient method of decoding Reed-Solomon codewords with errors is known as the Berlekamp-Massey Algorithm (BMA). See Berlekamp, E. R., Algebraic Coding Theory, Revised 1984 Ed., Aegean Park Press, Laguna Hills, Calif. (1984) ISBN 0-89412-063-8, pp. 176-189, and Massey, J. L., Shift Register Synthesis and BCH Decoding, in IEEE Trans. Info. Theory, IT-15 (1969), pp. 122-127. In a typical structure for a BMA decoder, a first step (or unit) determines a plurality of weighted sums from an estimated codeword, the weighted sums known as syndromes. A nonzero syndrome indicates a codeword with errors. A second step (or unit) determines a polynomial known as an error locator polynomial. A third step searches to find one or more roots of the locator polynomial. At each root of the locator polynomial found in the search, the decoder determines a correct value for an erroneous symbol at a corresponding location in the codeword. When the locator polynomial is of degree two, the search to find the roots of the locator polynomial may be replaced by a direct solution using an algebraic transformation and log and antilog tables for a finite field. See Practical Error Correction Design for Engineers, Revised Second Edition, Neal Glover and Trent Dudley, Cirrus Logic, Bloomfield, Colo. (1991), ISBN 0-927239-00-0, pp. 152-156.
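The three decoder steps described above may be sketched in Python as follows: syndrome computation, a textbook form of the Berlekamp-Massey iteration for the error locator polynomial, and an exhaustive search for its roots. This is a simplified illustration under stated assumptions, not the decoder of any cited reference: the field is GF(2^8) built from the assumed primitive polynomial 0x11D, the generator roots are assumed to be alpha^0 through alpha^(r-1), received symbols are stored highest degree first, all function names are hypothetical, and the final computation of error values is omitted.

```python
# Assumption: GF(2^8) from primitive polynomial 0x11D, via log/antilog tables.
EXP, LOG = [0] * 510, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 510):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_div(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]

def poly_eval(p, point):
    """Horner evaluation; coefficients are highest degree first."""
    y = 0
    for c in p:
        y = gf_mul(y, point) ^ c
    return y

def syndromes(received, r):
    """Step 1: weighted sums S_j = received(alpha^j), j = 0..r-1."""
    return [poly_eval(received, EXP[j]) for j in range(r)]

def berlekamp_massey(S):
    """Step 2: shortest locator polynomial, coefficients lowest degree first."""
    C = [1] + [0] * len(S)   # current locator polynomial
    B = [1] + [0] * len(S)   # copy saved at the last length change
    L, m, b = 0, 1, 1
    for n in range(len(S)):
        d = S[n]             # discrepancy between S[n] and its prediction
        for i in range(1, L + 1):
            d ^= gf_mul(C[i], S[n - i])
        if d == 0:
            m += 1
            continue
        T = list(C)
        coef = gf_div(d, b)
        for i in range(len(S) + 1 - m):      # C(x) += coef * x^m * B(x)
            C[i + m] ^= gf_mul(coef, B[i])
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C[:L + 1]

def error_locations(locator, n):
    """Step 3: test alpha^(-p) for every symbol position; return list indices."""
    found = []
    for p in range(n):       # p is the power of x carried by the symbol
        if poly_eval(locator[::-1], EXP[(255 - p) % 255]) == 0:
            found.append(n - 1 - p)
    return sorted(found)

# Usage: the all-zero word is a codeword of any linear code, so corrupt two symbols.
received = [0] * 15
received[3], received[10] = 0x57, 0x0E
S = syndromes(received, 4)
locator = berlekamp_massey(S)
print(error_locations(locator, len(received)))
```

With four syndromes the sketch can locate at most two errors. When more errors are present, the locator polynomial may have fewer roots in the codeword than its degree, which a practical decoder would treat as an uncorrectable indication.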
In an extension of the Berlekamp-Massey Algorithm, the second step of the BMA decoding method described above is modified to provide an error-and-erasure locator polynomial. The modified Berlekamp-Massey Algorithm (MBMA) can be used to correct both errors and erasures in a corrupted codeword. See Blahut, R. E., Theory and Practice of Error Control Codes, Addison-Wesley ISBN 0-201-10102-5 (1983), pp. 256-260.
A known limitation of Reed Solomon coding over a finite field GF(2^m) is that the total number of bits per codeword, B, is approximately limited to (B < m·2^m). Because the number of bits per symbol is typically limited to the range from eight to sixteen, the growth of coding symbol sizes (and block transfer sizes) has not kept pace with the growth of typical bus sizes (and transfers) of modern computer systems, now typically at 32 or 64 bits per bus symbol with storage block and memory page sizes starting at 32K bits. Larger bus and block sizes are desired to support higher system throughput and larger data records. Although traditional Reed Solomon coding is possible in larger finite fields, the complexity of the required components tends to grow exponentially while throughput grows linearly.
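The codeword size limit can be checked numerically with a short Python sketch. It assumes the conventional (non-extended) Reed Solomon codeword length of at most 2^m - 1 symbols of m bits each; the function name is hypothetical.

```python
def max_codeword_bits(m):
    """Bits in the longest conventional RS codeword over GF(2^m).

    Assumes a non-extended code: at most 2^m - 1 symbols of m bits each.
    """
    return m * (2 ** m - 1)

for m in (8, 12, 16):
    print(f"m={m}: at most {max_codeword_bits(m)} bits per codeword")
```

Under this assumption, eight-bit symbols allow at most 2040 bits per codeword, well below a 32K-bit block, which is why such blocks are typically covered by several interleaved codewords or by a code over a larger field.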
Improved error correction codes and decoding methods are desired to overcome limitations of the prior art. In particular, improved coding methods are desired for codewords with larger code symbols from larger finite fields, preferably retaining the simplicity of coding operations from smaller subfields. In addition, improved coding methods are desired which provide higher throughput in coding operations, better protection against misdecoding, and more correction power for a given code rate.
U.S. application Ser. No. 13/541,739, Construction Methods for Finite Fields with Split-Optimal Multipliers (2012), teaches improved construction methods for finite fields and, in particular, for large finite fields with a large number of bits per symbol. Methods of coding for large finite fields are desired which achieve the benefits of higher system throughput using large symbols, but without the exponential growth in implementation complexity. U.S. application Ser. No. 13/932,524, Improved Error Correction using Large Fields (2013), specifies a general method for error correction using large finite fields and apparatus for correcting a small number of errors. Improvements are desired in an apparatus and a method of decoding for three or more errors in an error correction system using large symbols.