In the field of digital data storage, data reliability is critical. Specifically, it is important that the user data that are retrieved from a medium match the data that were written to and stored on the medium. For a variety of reasons, the retrieved data may differ from the data that were originally stored. Any differences between the stored data and the retrieved data are considered errors in the data. Traditional methods for ensuring data reliability have included error detection and error correction. Typical error detection and correction techniques involve appending parity bits to the user data during an encoding process to form a code word prior to storage. When the code word (user data with parity bits) is later retrieved from the medium, it is decoded, whereby the parity bits are used to detect and correct errors. Essentially, the parity symbols provide redundancy, which may be used to check that the data were read correctly from the medium.
Digital data is typically partitioned into a number of symbols, each consisting of a fixed number of bits. For example, in the field of data storage, 8-bit symbols or “bytes” are commonly used. An h-bit symbol may be viewed as an element of the Galois Field GF(2h), which is a finite field having unique mathematical properties. By treating the data as Galois field elements, mathematical operations may be performed on the symbols in a data storage device to reach useful results, including checking for errors. Error detection and correction algorithms, such as those used with the well-known Reed-Solomon (RS) codes, take advantage of the mathematical properties of Galois Fields. An error correction algorithm is able to correct up to a maximum number of symbol errors. The maximum number of symbol errors that the algorithm can correct is referred to as the “correction power” of the code. Error correction algorithms are able to correct errors primarily because a limited number of data blocks constitute the valid code words that may be stored on the medium.
Typically, before user data is stored, it is first encoded with parity symbols for the sole purpose of error detection. These parity symbols are computed from the user data and the block of data consisting of the user data and the parity symbols forms a code word in an error detection code (EDC). The parity symbols will be referred to as EDC parity and the block of data together with its EDC parity will be referred to as an EDC codeword. (For many classes of codes, such as the RS codes, the code symbols are viewed as elements of a Galois field and the code word is viewed as a polynomial whose coefficients are those Galois field elements. The defining property of the code is that certain values of these polynomials are equal to zero. These codes are called “polynomial codes”.)
In addition, the user data and EDC parity (EDC codeword) may be encoded with additional parity symbols for the purpose of error correction. These parity symbols are computed from the user data and EDC parity and the block of data consisting of the user data, the EDC parity, and the additional parity symbols form a code word in an error correction code (ECC). The additional parity symbols will be referred to as ECC parity. The entire block of data together with its EDC parity and ECC parity will be referred to as an ECC codeword. During decoding, while the erroneous data is retrieved, two sets of “syndromes”, one associated with the EDC and the other associated with the ECC, are computed. The two sets of syndromes are referred to as the EDC syndromes and the ECC syndromes. In the case of polynomial codes, these syndromes are the polynomial values used to define the codes, which are equal to zero when the data block constitutes a valid codeword. Thus, if the EDC syndromes are non-zero, i.e. if any bit in any syndrome is 1, an error has occurred in the EDC codeword. Similarly if the ECC syndromes are non-zero, an error has occurred in the ECC codeword. Furthermore, if an error is identified by the ECC syndromes, an error correction algorithm may then use the ECC syndromes to attempt to correct the error.
A typical error correction algorithm applies a minimum distance rule, in which a block of data containing errors (an “invalid” or “corrupted” codeword) is changed to the “closest” valid codeword. The “distance” between two blocks of data is the number of symbols in which the blocks differ, so that the closest codeword to a block of data is the codeword which differs from that block in as few symbols as possible. This is referred to as “maximum likelihood decoding” because an error event in which a small number of symbols are corrupted is generally more likely to occur than an event in which a large number of symbols are corrupted. However, in some cases, for example when massive data corruption occurs, the closest codeword to a corrupted codeword may not be the codeword originally written to the storage medium. In this instance, the algorithm will still “correct” the codeword to the closest valid codeword. Such a “miscorrection” results in an undetected corruption of user data. Clearly, a robust error control system must include mechanisms that guard against miscorrections. One such mechanism is contained in the error correction algorithm itself: generally when an error event beyond the correction power of a code occurs, then with high probability the algorithm will detect that the errors are uncorrectable. A second mechanism against miscorrection is the EDC. If the error correction algorithm has restored a corrupted codeword to the codeword originally written to the medium, then in particular the user data and EDC parity symbols have been restored. Thus, if the EDC syndromes are recomputed from the corrected data, they will all be zero. If a (possibly erroneous) correction has been performed and the recomputed EDC syndromes are not all zero, then the EDC has detected a miscorrection. In this way, the EDC reduces the likelihood of undetected data corruption even further.
Current approaches to decoding data involve implementations of Horner's algorithm, a key equation solver, a Chien search, and Forney's algorithm. Horner's algorithm is a method for evalutating polynomials and is used to compute the polynomial values that comprise the EDC and ECC syndromes. An error locator polynomial and an error evaluator polynomial are computed from the ECC syndromes by a key equation solver, such the Berlekamp-Massey algorithm. The roots of the error locator polynomial are Galois field elements, which correspond to locations of errors in the ECC codeword. The roots of the polynomial, and hence the locations of the errors, can be computed by a systematic search called a Chien search. Once an error location has been identified, the “error value” needed to correct the error can be computed by Forney's algorithm. It is desirable to use Horner's algorithm to recompute EDC syndromes after error correction has been performed, but current approaches have a number of drawbacks.
One approach is to store the corrupted user data and EDC parity symbols in a buffer, perform corrections on the data in the buffer, and then recompute the EDC syndromes as the corrected data are read from the buffer. In this case all the EDC syndromes should be zero. The drawback to this approach is the latency it adds to the system. Syndrome computation will not be complete until the last code symbol has been read from the buffer and at that point user data may already have been transferred to the host computer system. The host system must then be informed to disregard the data it has received. This is undesirable even though the storage device will most likely recover the data through a retry methodology such as rereading the data from the storage medium. To avoid additional system latency, it is desirable to recompute the EDC syndromes in parallel with the Chien search, so that the EDC will detect an ECC miscorrection before any user data have been transferred to the host. This approach is facilitated by the fact that the EDC syndromes can be computed from the error locations and values alone, instead of from the entire block of corrupted data. In this methodology, the EDC syndromes computed from the correction pattern are compared with the syndromes that were computed when the corrupted data were originally read from the medium. If the two sets of syndromes do not match, then the correction pattern does not match the actual error pattern and thus a mis-correction has been detected.
While it would be desirable to use Horner's algorithm to recompute the EDC syndromes from the error pattern, a difficulty arises from the order in which Horner's algorithm and the Chien search process codeword symbols. When a codeword is treated as a polynomial with Galois field coefficients, the coefficients of highest order are the first to be written to or read from the medium. This enables the use of Horner's algorithm, which processes the polynomial coefficients in descending order. However, the simplest implementation of a Chien search processes the locations corresponding to polynomial terms in ascending order, which is the “wrong direction” for Horner evaluation. The EDC syndromes can be computed from the error pattern by other methods, which require hardware more expensive than that for Horner evaluation.
It is with respect to these and other considerations that the present invention has been developed.