In digital data systems, data commonly becomes corrupted by errors. Data stored on a storage device, for example, is subject to errors caused by surface defects or by imperfect tracking between the recording head and the "track" containing the data. Similarly, data transmitted from a sender to a receiver over a network may be corrupted by noise. Although digital data systems are designed to eliminate sources of errors and to reduce their effects, data errors still occur. It is therefore desirable to be able to recover data despite the presence of such errors.
To enhance data integrity, digital data systems commonly employ error correction coding techniques, which enable them to recover data correctly despite the presence of errors. Using error correction coding, a data system encodes a piece of data into a codeword, which typically consists of the original piece of data and some check data. The check data is generated from the original data according to an error-correcting code (ECC). The decoder for the ECC is capable of decoding the codeword to obtain the original data even if some of the data or check symbols are in error. The decoder can distinguish codewords despite errors because the codewords generated by the ECC are sufficiently different from each other. Such a decoding process is similar to a human reader's ability to correct spelling errors while reading: words are sufficiently different from each other that the reader "knows" which word is meant.
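The encode/decode cycle described above can be illustrated with a minimal sketch. It assumes a Hamming(7,4) code, which is chosen here only because it is small; the text does not name a particular ECC, and the function names `encode` and `decode` are hypothetical. Four data bits are extended with three check bits, and the decoder recovers the data even when any single bit of the codeword has been flipped.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword.

    Illustrative assumption: bit positions 1, 2 and 4 hold check bits,
    positions 3, 5, 6 and 7 hold the original data bits.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(received):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(received)
    # Recompute each parity check over the received bits.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if none.
    position = s1 + 2 * s2 + 4 * s4
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    codeword = encode(data)
    corrupted = list(codeword)
    corrupted[4] ^= 1  # simulate a single-bit error
    print("sent:     ", codeword)
    print("received: ", corrupted)
    print("recovered:", decode(corrupted))
```

Any two Hamming(7,4) codewords differ in at least three bit positions, so a received word with one flipped bit is still closer to the codeword that was sent than to any other. This is the sense in which codewords are "sufficiently different from each other". A second error in the same codeword would exceed the correction power of this particular code.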
Several considerations determine the error-control strategy to be used in a system. The ECC should be as simple as possible, so that the cost and complexity of the decoder are minimized. The ECC must also have enough correction power to maintain data integrity in the system. For a code of given complexity, the number of errors the code can correct is approximately proportional to the number of check symbols in its codewords. High data integrity therefore implies large codewords. However, the ECC should not require an excessive amount of scarce storage space or network bandwidth for the necessary check symbols.
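The trade-off between correction power and overhead can be made concrete with a short sketch. It assumes a Reed-Solomon code, used here purely as an example since the text does not name a code family. A Reed-Solomon code with n symbols per codeword, k of them data, can correct up to (n - k) / 2 symbol errors, rounded down. The helper names and the chosen codeword length of 255 are illustrative assumptions.

```python
def correctable_errors(n, k):
    """Symbol errors a Reed-Solomon code with n total and k data symbols can correct."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return (n - k) // 2


def overhead(n, k):
    """Fraction of each codeword occupied by check symbols."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return (n - k) / n


if __name__ == "__main__":
    n = 255  # assumed codeword length in symbols
    for check_symbols in (8, 16, 32, 64):
        k = n - check_symbols
        print(
            f"check symbols: {check_symbols:3d}  "
            f"correctable errors: {correctable_errors(n, k):3d}  "
            f"overhead: {overhead(n, k):.1%}"
        )
```

Under these assumptions, doubling the number of check symbols doubles the number of correctable errors, and it also doubles the share of each codeword that carries no user data.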
To maintain high data integrity, a system would ordinarily employ an ECC that can correct the maximum number of errors that may be encountered in a piece of data. However, this maximum is typically far greater than the average number of such errors, so the average codeword in such a system carries substantially more check symbols than it needs. It is highly desirable to employ an error correction technique that achieves high data integrity while minimizing the size of the codewords employed or, alternatively, that maximizes data integrity for a codeword of a given size.