This invention relates to error detection and correction of stored data in general, and more particularly to the correction of long burst errors or missing data, also known as erasure correction.
The use of increasingly higher density storage media in digital computer systems has caused an increase in the potential for defect-related data errors. To reduce data loss as a result of such data corruption, error correction codes are employed to correct the erroneous data.
Before a string of data symbols is recorded on magnetic tape, it is mathematically encoded to form redundancy symbols. The redundancy symbols are then appended to the data string to form code words, that is, data symbols plus redundancy symbols. The code words are stored on magnetic tape. When the stored data is to be accessed from the magnetic tape, the code words containing the data symbols are retrieved from the tape and mathematically decoded. During decoding, any errors in the data are detected and, if possible, corrected through manipulation of the redundancy symbols.
Stored digital data can contain multiple independent errors. A type of error correction code used for the correction of multiple errors is a Reed-Solomon (RS) code. To correct multiple errors in strings of data symbols, RS codes utilize various mathematical properties of sets of symbols known as Galois Fields, represented by GF(P**Q), where "P" is a prime number and "Q" represents the number of digits, base P, in each element or symbol in the field. "P" usually has the value of 2 in digital computer applications, and "Q" is the number of bits in each symbol.
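The field arithmetic described above can be illustrated with a minimal sketch for P = 2 and Q = 8, that is, GF(2**8) with one byte per symbol. The field polynomial 0x11D and the function names gf_add and gf_mul are assumptions chosen for illustration; the text does not specify a field polynomial.

```python
# Assumption: GF(2**8) built on the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
# The text does not name a field polynomial; this is a commonly used choice.
FIELD_POLY = 0x11D


def gf_add(a, b):
    """Addition (and subtraction) in GF(2**Q) is a bitwise XOR."""
    return a ^ b


def gf_mul(a, b):
    """Shift-and-add multiplication, reducing by FIELD_POLY on overflow."""
    result = 0
    while b:
        if b & 1:            # current bit of b is set: add the shifted a
            result ^= a
        a <<= 1              # multiply a by x
        if a & 0x100:        # degree reached 8: reduce modulo FIELD_POLY
            a ^= FIELD_POLY
        b >>= 1
    return result
```

Every symbol stays within one byte after each operation, which is why Q = 8 fits byte-oriented storage.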
Data is typically stored on a magnetic tape in a long sequence of symbols. Errors in data stored on magnetic tape often occur in long bursts, that is, many erroneous symbols in sequence. Special error detection and/or correction techniques are typically employed to handle these long burst errors.
Once a long burst error is detected, the erroneous symbols involved are corrected, if possible. The faster the errors can be corrected, the faster the data can be made available to a user. Thus, the effective data transfer rate increases as the speed of error correction increases.
In one aspect, the invention features a method of erasure correction for an error correction code (ECC) entity, including receiving an erasure location, generating a syndrome polynomial, computing an erasure constant array, and correcting a first row ECC data value from the syndrome polynomial and the erasure constant array.
Embodiments of the invention may have one or more of the following advantages.
The invention relates to a class of ECC codes in which the data is encoded twice, once in the vertical dimension (column) and once in the horizontal dimension (row).
Data is generally written to a medium, such as magnetic tape, in blocks. A block is a linear address space containing typically four thousand (4 k) bytes. Thus, each block may contain up to four thousand alphanumeric characters, one alphanumeric character per byte. Each block terminates with redundancy bytes that are used for data checking and/or correction. These redundancy bytes are referred to as column redundancy bytes. For example, the contents of the last eight bytes of a block may be generated by a linear feedback shift mechanism operating on all previous data written within the block. This produces a checksum or cyclic redundancy check (CRC) code for all data written into the block.
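The generation of column redundancy bytes by a shift-and-feedback mechanism can be sketched as a bit-serial CRC. This is an illustration under stated assumptions, not the specific mechanism of the invention: the names lfsr_crc and append_column_redundancy are hypothetical, and the 64-bit generator polynomial (the ECMA-182 value) is an assumed choice, since the text only says that eight bytes are produced.

```python
# Assumption: 64-bit generator polynomial from ECMA-182 (x^64 term implicit).
# The text gives only the size of the check field (eight bytes).
POLY64 = 0x42F0E1EBA9EA3693


def lfsr_crc(data: bytes, width: int, poly: int) -> int:
    """Bit-serial CRC: shift register with feedback taps given by `poly`.

    The register starts at zero, bits are fed most significant first,
    and no final XOR is applied.
    """
    mask = (1 << width) - 1
    reg = 0
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            feedback = ((reg >> (width - 1)) & 1) ^ bit
            reg = (reg << 1) & mask
            if feedback:
                reg ^= poly
    return reg


def append_column_redundancy(block: bytes) -> bytes:
    """Return the block followed by eight redundancy bytes (big-endian)."""
    return block + lfsr_crc(block, 64, POLY64).to_bytes(8, "big")


def column_is_clean(stored: bytes) -> bool:
    """With a zero start value and no final XOR, a clean block leaves zero."""
    return lfsr_crc(stored, 64, POLY64) == 0
```

A block for which column_is_clean returns False is one the column code has flagged as containing errors; in the scheme described here, such a flag is what identifies the error location for the row code.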
For every sixteen blocks of data, a row error correction code (ECC) entity is generated and used to correct errors that are identified by the column redundancy code. An ECC entity typically includes four blocks, each block containing 4 k bytes. Each block in the ECC entity is referred to as a column, and each byte in a column represents a symbol in the row code. Each row in the ECC entity represents a code word (or ECC data symbol) that is a mathematical representation of the data contained in one of the sixteen previous blocks of data. Each row of the ECC entity terminates in an error correction code known as an ECC row redundancy code. The ECC row redundancy code provides a method of recovering data symbol errors contained within the rows of the ECC entity.
During a read operation, data is read from the sixteen blocks of data along with their associated row ECC data symbols, which are contained in the ECC entity. An entire column (block) of data symbols in an ECC entity will be erased if the system determines that there are uncorrectable errors in the column. The error has then occurred in a known location with respect to the row code. To correct the erasure, an error locator polynomial is generated, a syndrome polynomial is generated, and the polynomials are seeded to determine the error value (i.e., the correct data) for each of the four thousand rows of symbols in the ECC entity.
The fact that the erasure locations are identical for every row in an ECC entity is used to pre-compute a constant array. This constant array depends only on the erasure locations so that the constant array may be shared by every row in the ECC entity and calculated only once. For each row, a syndrome polynomial is generated. Using the syndrome polynomial in combination with the pre-computed constant array generates the row error value without the need of generating the same constant array from the error locator polynomial and syndrome polynomial for each row in the ECC entity.
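The sharing of one pre-computed array across all rows can be sketched as follows. This is one possible formulation under stated assumptions, not necessarily the constant array of the invention: the field GF(2**8) with polynomial 0x11D, code roots alpha**0 through alpha**(R-1), the locator alpha**i for symbol position i, and all function names are assumptions. Here the constant array is taken to be the inverse of the matrix of erasure-locator powers, so that each error value is a fixed linear combination of that row's syndromes.

```python
# Assumptions (not from the text): GF(2**8) with polynomial 0x11D, code roots
# alpha**0 .. alpha**(R-1), and symbol position i has locator alpha**i.
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = EXP[_i + 255] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[255 - LOG[a]]  # a must be nonzero


def syndromes(row, count):
    """S_j = sum over i of row[i] * alpha**(i*j), for j = 0 .. count-1."""
    out = []
    for j in range(count):
        s = 0
        for i, sym in enumerate(row):
            s ^= gf_mul(sym, EXP[(i * j) % 255])
        out.append(s)
    return out


def erasure_constants(locations):
    """Depends only on the erasure locations: inverse of V[j][k] = X_k**j."""
    e = len(locations)
    # Augmented matrix [V | I], reduced by Gauss-Jordan elimination.
    m = [[EXP[(loc * j) % 255] for loc in locations]
         + [int(j == c) for c in range(e)] for j in range(e)]
    for col in range(e):
        pivot = next(r for r in range(col, e) if m[r][col])
        m[col], m[pivot] = m[pivot], m[col]
        inv = gf_inv(m[col][col])
        m[col] = [gf_mul(v, inv) for v in m[col]]
        for r in range(e):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [a ^ gf_mul(f, b) for a, b in zip(m[r], m[col])]
    return [r[e:] for r in m]


def correct_row(row, locations, constants):
    """Per-row work: syndromes, then one multiply-accumulate per constant."""
    s = syndromes(row, len(locations))
    fixed = list(row)
    for k, loc in enumerate(locations):
        value = 0
        for j, c in enumerate(constants[k]):
            value ^= gf_mul(c, s[j])
        fixed[loc] ^= value  # error value restores the erased symbol
    return fixed


# Demo: rows of 16 data symbols plus 4 redundancy symbols (positions 16..19).
parity = [16, 17, 18, 19]
parity_consts = erasure_constants(parity)
# For the demo only, redundancy symbols are filled in by treating them as erasures.
codewords = [correct_row([(7 * r + 3 * i + 1) % 256 for i in range(16)] + [0] * 4,
                         parity, parity_consts) for r in range(5)]
erased = [3, 11]                     # two columns flagged by the column code
consts = erasure_constants(erased)   # computed once for the whole ECC entity
for cw in codewords:
    damaged = list(cw)
    damaged[3], damaged[11] = 0, 0xFF
    assert correct_row(damaged, erased, consts) == cw
```

In this sketch the matrix inversion runs once per set of erasure locations, while each row costs only a syndrome computation and a small number of multiplications, which is the saving the paragraph above describes.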