The importance of error correction coding of data in digital computer systems has increased greatly as the density of the data recorded on mass storage media, more particularly magnetic disks, has increased. With higher recording densities, a tiny imperfection in the recording surface of a disk can corrupt a large amount of data. In order to avoid losing that data, error correction codes ("ECC's") are employed to, as the name implies, correct the erroneous data.
Before a string of data symbols is recorded on a disk, it is mathematically encoded to form ECC, or redundancy, symbols. The redundancy symbols are then appended to the data string to form code words--data symbols plus redundancy symbols--and the code words are then stored on the disk. When the stored data is to be accessed, the code words containing the data symbols are retrieved from the disk and mathematically decoded. During decoding any errors in the data are detected and, if possible, corrected through manipulation of the redundancy symbols [For a detailed description of decoding see Peterson and Weldon, Error Correcting Codes, 2d Edition, MIT Press, 1972].
Stored digital code words can contain multiple errors. One of the most effective types of ECC used for the correction of multiple errors is a Reed-Solomon code [see Peterson and Weldon, Error Correcting Codes]. Error detection and correction techniques for Reed-Solomon ECC's are well known. Id.
An (n,k) Reed-Solomon ECC encodes k data symbols to form n-k, or r, redundancy symbols. The (n,k) code has a minimum distance of D=r+1, which means that each code word differs from every other code word by at least r+1 symbols. The number of errors which an ECC can correct is directly related to the number of redundancy symbols it produces. Each redundancy symbol can be used to determine, that is, solve for, either an error location or an error value. Thus, two redundancy symbols are required to correct each error. The (n,k) code can correct up to (D-1)/2 errors in an n-symbol code word.
An erasure is an error with a known location. Such errors may, for example, be detected during detection and demodulation of signals retrieved from a magnetic disk drive. The system retains a pointer to the location of the erasure. Accordingly, only one redundancy symbol is required to correct an erasure since only the data value of the erasure is unknown. The (n,k) code can thus correct e erasures, 0 ≤ e ≤ D-1, and [(D-1)-e]/2 errors.
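The relationship between redundancy symbols, erasures and correctable errors can be sketched as a small calculation. The function name rs_capability and the (255,239) example are illustrative assumptions, not taken from any particular system, and rounding down when (D-1)-e is odd is also an assumption, since a leftover single redundancy symbol cannot by itself correct an error of unknown location.

```python
def rs_capability(n: int, k: int, erasures: int = 0) -> int:
    """Return how many unknown-location errors an (n, k) Reed-Solomon
    code can still correct once `erasures` erasures are accounted for.

    Hypothetical helper for illustration only.
    """
    r = n - k          # number of redundancy symbols
    d = r + 1          # minimum distance D = r + 1
    if not 0 <= erasures <= d - 1:
        raise ValueError("erasures must satisfy 0 <= e <= D-1")
    # Each erasure consumes one redundancy symbol, each error two.
    # Floor division is an assumption for the odd-remainder case.
    return (d - 1 - erasures) // 2


if __name__ == "__main__":
    # A (255, 239) code has r = 16 redundancy symbols.
    print(rs_capability(255, 239))       # errors only
    print(rs_capability(255, 239, 4))    # 4 erasures plus errors
    print(rs_capability(255, 239, 16))   # erasures only
```

With 16 redundancy symbols, for example, the code corrects 8 errors with no erasures, or 6 errors together with 4 erasures, or 16 erasures and no errors.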
A system designer may not be able to find a Reed-Solomon ECC which can correct a desired number of errors in the k data symbols, that is, a code which generates R redundancy symbols by encoding k data symbols. The designer may, for example, find a code which generates R+t redundancy symbols by encoding the k data symbols and formulates code words with k+R+t symbols. Such a code is not acceptable, however, for a system which has allocated to a code word a (k+R)-symbol storage space. Accordingly, the designer may select an ECC with a longer length, for example, an (N,K) code, where N>n and K>k, which produces N-K, or R, redundancy symbols and then truncate the code word by eliminating some of the data symbols. This process is often termed "shortening" of the code; however, we shall use the term "truncate" to avoid confusion with the system described herein.
Truncating an (N,K) code consists of selecting as valid code words only those code words of the original (N,K) ECC which include "i" leading symbols which correspond with a predetermined pattern, for example, i leading all-zero symbols, where K-i=k. In a binary code, for example, there are 2^(K-i) of these code words. The selected code words form an (N-i,K-i) sub-code of the (N,K) code. If the i leading symbols are deleted, code words with k data symbols and R redundancy symbols are produced. Since these code words include the same number of redundancy symbols as code words of the original (N,K) code, the same number of errors can be corrected. Truncated codes are discussed in detail in Peterson and Weldon, Error Correcting Codes.
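The truncation just described can be illustrated with a small binary code. The sketch below uses a systematic (7,4) Hamming code as a stand-in, which is an assumption made only to keep the enumeration tiny (it is not a Reed-Solomon code); the parity matrix P and the function names are likewise illustrative. It selects the code words with i leading zero symbols, deletes those symbols, and checks that the count is 2^(K-i) and that the minimum distance is not reduced.

```python
from itertools import product

# Assumed example code: systematic binary (7,4) Hamming code.
# Each code word is 4 data bits followed by 3 redundancy bits.
K, R = 4, 3
P = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]  # parity part of G = [I | P]


def encode(data):
    """Append R redundancy bits to K data bits (arithmetic mod 2)."""
    parity = tuple(
        sum(d * row[j] for d, row in zip(data, P)) % 2 for j in range(R)
    )
    return tuple(data) + parity


def truncate(i):
    """Code words of the (N-i, K-i) sub-code with the i leading zeros deleted."""
    words = []
    for data in product((0, 1), repeat=K):
        if all(symbol == 0 for symbol in data[:i]):
            words.append(encode(data)[i:])
    return words


def min_distance(words):
    """For a linear code the minimum distance is the least nonzero weight."""
    return min(sum(word) for word in words if any(word))


if __name__ == "__main__":
    full = truncate(0)
    short = truncate(1)
    print(len(full), len(short))                      # 2^K and 2^(K-i)
    print(min_distance(full), min_distance(short))    # distance is preserved
```

Each truncated code word here has 3 data bits and the same 3 redundancy bits as before, mirroring the (N-i,K-i) sub-code described in the text.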
To correct errors in the truncated code words, the system could replace the truncated symbols and employ conventional error correction techniques designed for the (N,K) code. The system thus manipulates the R redundancy symbols to generate error syndromes and determines from these the error locations and error values. Alternatively, since the deleted leading-zero symbols do not affect the manipulations of the redundancy symbols which produce the error correction syndromes, the system can generate the error syndrome without replacing the truncated symbols. However, when the system determines error location using these syndromes, the system must take into account the truncated maximum code word length, which is N-i symbols. Using either approach, the system manipulates the R redundancy symbols to correct up to R/2 code word errors in the N-i code word symbols.
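The observation that deleted leading-zero symbols do not affect the syndromes can be checked numerically. The following is a minimal sketch under simplifying assumptions: it works over the prime field GF(7) with primitive element 3 rather than the GF(2^m) fields used in practice, takes R = 2, and defines syndrome j as the received polynomial evaluated at the j-th power of that element. All names and constants are illustrative.

```python
# Assumed toy field: integers mod 7, primitive element ALPHA = 3.
P_MOD = 7
ALPHA = 3
R = 2  # number of redundancy symbols, so two syndromes


def syndromes(word):
    """Evaluate the word, as a polynomial with the leading symbol first,
    at ALPHA**1 .. ALPHA**R using Horner's rule."""
    result = []
    for j in range(1, R + 1):
        x = pow(ALPHA, j, P_MOD)
        acc = 0
        for symbol in word:
            acc = (acc * x + symbol) % P_MOD
        result.append(acc)
    return result


# Truncated code word: (4x + 1)(x - 3)(x - 2) mod 7 = 4x^3 + 2x^2 + 5x + 6.
codeword = [4, 2, 5, 6]
# Same word with one corrupted symbol.
received = [4, 2, 0, 6]
# The same received word with i = 2 deleted leading zeros restored.
padded = [0, 0] + received

if __name__ == "__main__":
    print(syndromes(codeword))   # all zero for a valid code word
    print(syndromes(received))   # nonzero, signalling an error
    print(syndromes(padded))     # identical to the line above
```

Because the leading zeros are the highest-order coefficients, they contribute nothing to the evaluation, so both forms of the received word yield the same syndromes. Any error location derived from them still has to fall within the N-i symbols actually stored.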
If a known data processing system requires, for certain applications, the capability to correct a large number of errors in the data, the system must use a powerful ECC which produces a large number of redundancy symbols for a given number of data symbols. If the system can sometimes, or for certain applications, operate with a reduced error correction capability it can use a less powerful ECC which produces fewer redundancy symbols for the same amount of data. The system can thus allocate a larger or smaller amount of storage space to a given code word, depending on the error correction requirements.
When a system is storing data on a relatively new disk, for example, the system may use fewer redundancy symbols, that is, a less powerful ECC, to protect the data than it uses to protect data stored on an older, potentially deteriorating disk. Similarly, the system may, in essence, rate the disks according to their overall integrity, or associated error rates, and use more or less powerful ECCs to protect the data stored on the disks, depending on their rankings.
The system may also use a more or less powerful code depending on the type of storage medium used. A system which has a capability of storing data on both optical and magnetic disks, for example, may use a more powerful code for data which it is storing on the optical disks. The quality of the optical disks is generally not as good as the quality of magnetic disks. Thus, both the probability of having a defect in the optical disk and the probability of that defect affecting a relatively large number of data symbols are higher than the corresponding probabilities for magnetic disks. Accordingly, the system protects the data with an ECC which can correct a greater number of errors.
In networked data processing systems, data which is to be sent a long distance may be protected with more redundancy symbols than data which is to be sent a shorter distance. The system thus protects the data from transmission errors which are more likely to occur when data is transmitted over the longer communications paths.
To produce both code words with large numbers of redundancy symbols and code words with fewer redundancy symbols, known data processing systems encode the data symbols using different, i.e., more and less powerful, ECCs. Such systems must include different encoders and decoders, for use with the respective ECCs. What is desired is a mechanism by which a system can use a single encoder and decoder to encode and decode these code words.
This is a different problem from the one solved by truncating an ECC, since it is the number of code word redundancy symbols which changes, not the number of code word data symbols. Accordingly, while the techniques used to generate error syndromes for an (N,K) code can be used with a truncated (N-i,K-i) code, that is, with a code which produces the same number of redundancy symbols, these techniques are not adequate for codes which produce different numbers of redundancy symbols.