1. Field of the Invention
This invention relates in general to error correction codes, and more particularly to a method, apparatus and program storage device for correcting a burst of errors together with a random error using cyclic or shortened cyclic codes.
2. Description of Related Art
In current magnetic recording systems, read and write operations are made with respect to addressable ECC-coded sectors of tracks stored eventually on high-density direct access storage devices (DASD). DASDs or disk drives include at least one rotating disk covered with a magnetic coating that can store magnetic or electronic data and an apparatus for reading data from and writing data to that disk. This is implemented by a “spindle motor” to rotate the disk or disks, at least one “read/write head” to read and write data to and from the disk or disks, an “actuator” to position the read/write head or heads radially over the disk or disks either on a linear or rotary basis, and a “data channel” to transfer information between the read/write head or heads and an accessing source.
Information is recorded on the disk along concentric tracks divided into sectors. Years ago, all the tracks had the same number of bytes recorded thereon. This meant that the recording density per track varied inversely with radial distance. Today, the recording practice and the capacity have changed such that groups of concentric tracks form a zone and have data recorded at a density (bytes/track inch) that is constant. Thus, the tracks in the outer zones will have more information recorded thereon than those on the inner zones.
The sectors on each track are each operable as a unit of addressable storage. Generally in the industry, each addressable sector consists of 512 bytes. Each sector includes redundant bytes that aid in the detection and correction of errors up to some fixed limit. Whenever an error or erasure exceeds the capacity of the sector-level code, additional measures are needed to recover the data.
It is also desirable to superimpose additional associations among the units of storage, either to assist in rapid accessing or to enhance the active or passive protection of the data. These additional associations are termed collectively as “logical views” of storage. One construct can be formed from a logical association of an arbitrary set of n same-size storage units and a redundancy unit derived therefrom. In the case of error or erasure beyond the ECC capacity for that sector, data on that sector would be unavailable. However, it can be reconstructed by logically combining the remaining n-1 sectors with the redundant sector of that group.
Block or cyclic codes have long been used for detecting and correcting multiple bits or bytes in error in long bit or byte strings read back from a cyclic, concentric, tracked storage medium such as a magnetic disk storage subsystem or the like. Typically, each bit or byte string of predetermined length is treated as if it were an algebraic polynomial and subject to modulo division by an encoding polynomial. If the code is denominated as being “systematic”, then redundant bits or bytes derived from the data are appended to the data string which otherwise remains intact.
In the case of linear block cyclic or shortened cyclic codes, the remainder is generally appended to the end of the data bit or byte string, although in certain implementations it may be convenient to append it at the beginning instead. Each data bit or byte string plus the appended remainder is then recorded on a storage medium or transmitted. Subsequently, when the data is accessed and played back from the medium, a remainder is in principle recalculated from the datastream as it is extracted and compared with the recorded remainder. If the recalculated and recorded remainders match, their difference is zero. If they do not match (nonzero difference), this is indicative of error. Codes are now quite advanced such that the remainders are processed not only for identifying the presence of errors, but also for pinpointing their locations and determining the correction values to be applied to the datastream. This is referred to as syndrome processing. Codes used for error correction are called error-correcting codes (ECC).
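As an illustration only, the encode-then-recheck cycle described above can be sketched for a binary code as follows. The helper names (gf2_mod, encode, syndrome), the generator polynomial x^3 + x + 1, and the 4-bit data value are assumptions chosen for the example; they are not taken from any particular recording system.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of polynomial division over GF(2).

    Polynomials are packed into integers: bit i is the coefficient of x^i.
    """
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        # Cancel the leading term by XOR-ing in a shifted copy of the divisor.
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


def encode(data: int, gen: int) -> int:
    """Systematic encoding: the data stays intact and the remainder is appended."""
    r = gen.bit_length() - 1          # number of redundant bits
    shifted = data << r               # make room for the remainder
    return shifted | gf2_mod(shifted, gen)


def syndrome(received: int, gen: int) -> int:
    """Zero when the recalculated and recorded remainders agree."""
    return gf2_mod(received, gen)


GEN = 0b1011                          # x^3 + x + 1 (illustrative choice)
codeword = encode(0b1101, GEN)        # 4 data bits plus 3 redundant bits
corrupted = codeword ^ (1 << 4)       # flip one bit to simulate a read error

print(syndrome(codeword, GEN))        # zero: no error detected
print(syndrome(corrupted, GEN))       # nonzero: error detected
```

Because the code is linear, the syndrome of the corrupted word depends only on the error pattern and not on the stored data, which is what allows syndrome processing to locate errors.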
A linear code, C, is said to be cyclic if the cyclic shift of each codeword is also a codeword. If each codeword u in C is of length n, then the cyclic shift π(u) of u is the word of length n obtained from u by removing the last digit of u and placing it at the beginning, all other digits moving one position to the right. Reed-Solomon (RS) codes are the main example of linear cyclic ECC codes based on bytes. They are used extensively in magnetic recording and communications. One advantage of RS codes is that they maintain maximum distance among codewords for any given length of data. This “spacing” between permissible codewords renders them useful for detecting and correcting randomly occurring byte errors as well as burst errors over a run of contiguous bytes.
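The cyclic-shift property can be checked directly on a small code. The sketch below is illustrative only: it assumes a binary length-7 code generated by x^3 + x + 1 (a divisor of x^7 + 1), and the names cyclic_shift, gf2_mul and to_word are hypothetical helpers introduced for the example.

```python
def cyclic_shift(word: tuple) -> tuple:
    """Move the last digit to the beginning; all other digits move one place right."""
    return word[-1:] + word[:-1]


def gf2_mul(a: int, b: int) -> int:
    """Carry-less (GF(2)) polynomial multiplication; bit i is the coefficient of x^i."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


N = 7                 # codeword length (assumed for the example)
GEN = 0b1011          # generator polynomial x^3 + x + 1 (assumed)


def to_word(poly: int) -> tuple:
    """Unpack a polynomial into a tuple of N binary digits, lowest degree first."""
    return tuple((poly >> i) & 1 for i in range(N))


# Every codeword is a multiple of the generator polynomial by a 4-bit message.
code = {to_word(gf2_mul(m, GEN)) for m in range(16)}

# The code is cyclic if shifting any codeword yields another codeword.
closed = all(cyclic_shift(u) in code for u in code)
print(len(code), closed)
```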
When data is read from any storage system, the data bytes are subject to error and erasure from random, intermittent, and recurrent sources. These may be due to media defects, signal coupling between tracks, extraneous signals induced in the readback path, etc. It is generally desired to correct the errors in place. This means that an array is read from the medium and written into a sufficiently sized buffer or RAM local to the storage subsystem.
Error correcting codes have thus been used to correct such errors. However, correcting bursts of errors is a difficult problem. Even for correction of one burst, the best codes are found by extensive computer search, although there are analytical constructions like Fire codes. A Fire code is a conventional linear binary block code, i.e., it consists of transmitting, in addition to k information bits, a number of redundant bits computed by exclusive-or manipulations on the information bits. For example, the original k information bits can be used to build a polynomial whose coefficients are the bits of the sequence. The redundancy can be expressed as the coefficients of another polynomial obtained as the remainder of the division of the polynomial representing the sequence by a pre-defined polynomial, characteristic of the code and called the generator polynomial. Thus, a Fire code has a generator polynomial designed to allow good detecting and/or correcting performance when errors happen in bursts.
Nevertheless, correcting just one burst may be a problem when an extra random error occurs in addition to a burst. Data error in the storage context means any change in the stored value as a consequence of either random noise or a burst. In systems storing binary values such as 1 1 1 0 0 1 0 0, remanent magnetization states change such that some 1's become 0's and some 0's become 1's. This might appear as 1 1 0 0 0 1 0 0. Here, the value in the 3rd position from the left is a random error. A run of errors due to a burst source may cause the string to appear as 1 1 1 1 1 1 1 0. Here, positions 4, 5 and 7 are actually in error. Although position 6 is not in error, this is considered to be a burst of length four. That is, the first and last bits in error determine the length of the burst; the bits between them may or may not be in error.
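The burst-length convention in the example above can be reproduced with a short sketch. The function names error_positions and burst_length are hypothetical and are introduced only for illustration; positions are counted from 1, starting at the left, as in the text.

```python
def error_positions(sent: list, received: list) -> list:
    """1-based positions (from the left) at which the two bit strings differ."""
    return [i + 1 for i, (s, r) in enumerate(zip(sent, received)) if s != r]


def burst_length(sent: list, received: list) -> int:
    """Span from the first to the last bit in error, inclusive (0 if no errors)."""
    pos = error_positions(sent, received)
    return pos[-1] - pos[0] + 1 if pos else 0


sent         = [1, 1, 1, 0, 0, 1, 0, 0]
random_error = [1, 1, 0, 0, 0, 1, 0, 0]
burst_error  = [1, 1, 1, 1, 1, 1, 1, 0]

print(error_positions(sent, random_error))   # [3]
print(error_positions(sent, burst_error))    # [4, 5, 7]
print(burst_length(sent, burst_error))       # 4
```

Position 6 is counted in the burst even though its value is unchanged, because it lies between the first and last bits in error.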
The presence of a single random error along with a burst of errors may result in incorrect decoding when using conventional burst-correcting codes, like Fire codes. The traditional solution of interleaving Reed-Solomon (RS) codes may be too expensive for channels requiring correction of one burst together with a random error.
It can be seen then that there is a need for a method, apparatus and program storage device for correcting a burst of errors together with a random error using cyclic or shortened cyclic codes.