More and more digital data are being stored on a given size of magnetic disk. The data, in the form of bits, are stored as a sequence of flux reversals--with, for example, a "1" being recorded as a flux reversal and a "0" being recorded as the absence of flux reversal. As the density of the data stored on the disk increases, so does the likelihood that adjacent or nearby flux reversals will adversely interfere with one another. Such interference, which is referred to as "inter-symbol interference," may cause the bits to be misinterpreted, and may thus result in loss of the underlying data.
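The recording convention described above can be sketched in a few lines of Python. This is an illustration only: the function names and the use of +1/-1 for the two magnetization directions are assumptions made for the example, not part of the system described.

```python
def bits_to_flux(bits, start_polarity=1):
    """Return the magnetization polarity of each bit cell.

    A 1 is written as a flux reversal (polarity flips);
    a 0 is written as the absence of a reversal (polarity is kept).
    """
    polarity = start_polarity
    polarities = []
    for bit in bits:
        if bit == 1:
            polarity = -polarity
        polarities.append(polarity)
    return polarities


def flux_to_bits(polarities, start_polarity=1):
    """Recover the bits by detecting where the polarity changes."""
    previous = start_polarity
    bits = []
    for polarity in polarities:
        bits.append(1 if polarity != previous else 0)
        previous = polarity
    return bits
```

A long run of 1's places many reversals close together, which is where interference between neighboring reversals becomes a concern.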
To protect against misinterpretation, the data are typically stored on the disk in encoded form. Prior to recording, multiple-bit data symbols are encoded using an error correction code (ECC). When the data symbols are retrieved from the disk and demodulated, the ECC is used to detect and correct errors in the data.
Specifically, before a string of k data symbols is written to a disk, it is mathematically encoded using an (n,k) ECC to form n-k ECC symbols. The ECC symbols are then appended to the data string to form n-symbol error correction code words--data symbols plus ECC symbols--and the code words are written to, or stored on, the disk. When data are read from the disk, the code words containing the data symbols are retrieved and mathematically decoded. During decoding, errors in the data are detected and, if possible, corrected through manipulation of the ECC symbols. [For a detailed description of decoding, see Peterson and Weldon, Error-Correcting Codes, 2d Edition, MIT Press, 1972.]
To correct multiple errors in strings of data symbols, ECCs that efficiently and effectively utilize the various mathematical properties of sets of symbols known as Galois Fields are typically used. Galois Fields are represented as "GF(P^q)", where "P" is a prime number and "q" can be thought of as the number of digits, base P, in each element or symbol in the field. "P" usually has the value 2 in digital computer applications and, therefore, "q" is the number of bits in each symbol.
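As a small illustration of a field with P = 2, the sketch below multiplies elements of GF(2^3), each held as a 3-bit integer. The field size, the primitive polynomial x^3 + x + 1 (0b1011), and the function names are assumptions chosen to keep the example short; they are not taken from the system described here.

```python
def gf_mul(a, b, q=3, prim_poly=0b1011):
    """Multiply two elements of GF(2^q), each represented as a q-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a            # addition in GF(2^q) is bitwise XOR
        b >>= 1
        a <<= 1
        if a & (1 << q):           # degree reached q: reduce modulo the polynomial
            a ^= prim_poly
    return result


def powers_of_alpha(q=3, prim_poly=0b1011, alpha=0b010):
    """List alpha^1 .. alpha^(2^q - 1) for the element alpha (here, x)."""
    powers = []
    x = 1
    for _ in range((1 << q) - 1):
        x = gf_mul(x, alpha, q, prim_poly)
        powers.append(x)
    return powers
```

With these assumed parameters, the successive powers of alpha visit each of the 2^3 - 1 = 7 nonzero 3-bit symbols exactly once before returning to 1.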
The number of symbols that the ECC can effectively encode and correct, or "protect", is limited by the size of the Galois Field selected, i.e., 2^q symbols, and by the maximum number of errors that the code is capable of correcting. The maximum length of, for example, a code word of a Reed-Solomon ECC over GF(2^q) is 2^q - 1 symbols. Thus the maximum number of data symbols that can be protected by the ECC, i.e., included in the code word, is 2^q - 1 symbols minus the number of ECC symbols required to correct the maximum number of errors. The larger the Galois Field, the longer the code word, and the more data the ECC can protect for a given maximum number of errors to be corrected. Therefore, larger Galois Fields can be used to protect longer strings of data symbols.
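The length limit can be worked through numerically. The sketch below is illustrative only; the function name and the figure of 16 ECC symbols per code word are assumptions made for the example.

```python
def max_data_symbols(q, ecc_symbols):
    """Largest number of data symbols in one code word over GF(2^q).

    The code word holds at most 2^q - 1 symbols, and ecc_symbols of
    those positions are taken up by the ECC symbols.
    """
    max_codeword_length = (1 << q) - 1
    if not 0 <= ecc_symbols < max_codeword_length:
        raise ValueError("ECC symbols must leave room for at least one data symbol")
    return max_codeword_length - ecc_symbols


# Assumed example: 16 ECC symbols per code word.
data_symbols_q8 = max_data_symbols(8, 16)   # code word of at most 255 symbols
data_symbols_q9 = max_data_symbols(9, 16)   # code word of at most 511 symbols
```

Under that assumption, moving from 8-bit to 9-bit symbols roughly doubles the amount of data one code word can protect.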
Computers typically are designed to manipulate 8-bit data symbols, or bytes. The ECCs over GF(2^8) are, however, too short to protect all of the 8-bit symbols that are currently recorded in a sector of a disk. Specifically, a total of 512 symbols are recorded in a sector, and an ECC over GF(2^8) produces code words that are 2^8 - 1, or 255, symbols long. Accordingly, to protect a sector, the ECC over GF(2^8) must be manipulated by, for example, interleaving. As the data density further increases, the number of times the code must be interleaved increases, which adds to the complexity of the error correction encoder. Such manipulation of the code also increases the complexity of the decoder, and the time it takes the decoder to decode the code words and correct any errors.
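A rough sketch of the interleaving arithmetic follows. It assumes, for illustration, that the 512 symbols of a sector are all data symbols, that each code word carries 16 ECC symbols, and that symbols are dealt to the interleaved code words in round-robin order; none of these specifics is stated above, and the function names are hypothetical.

```python
def interleave_depth(total_data_symbols, q, ecc_symbols_per_codeword):
    """Smallest number of code words over GF(2^q) needed to hold the data."""
    max_data_per_codeword = (1 << q) - 1 - ecc_symbols_per_codeword
    # Ceiling division using integer arithmetic.
    return -(-total_data_symbols // max_data_per_codeword)


def split_into_interleaves(symbols, depth):
    """Deal the symbols round-robin into `depth` separate strings."""
    return [symbols[i::depth] for i in range(depth)]


sector = list(range(512))                      # stand-in for 512 data symbols
depth = interleave_depth(len(sector), 8, 16)   # code over GF(2^8)
interleaves = split_into_interleaves(sector, depth)
```

Each interleave is then encoded and decoded as its own code word, which is the source of the added encoder and decoder complexity.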
To avoid multiple interleaving, a longer code may be used. Thus the system may use a code over GF(2^9), which produces code words of up to 2^9 - 1, or 511, symbols and, with slight manipulation, code words of 512 symbols. One problem with using a longer code is that the code produces longer ECC symbols, and requires longer data symbols for encoding.
System components other than the error correction encoder and decoder are typically set up to handle bytes or multiples of bytes, not the longer 9-bit symbols. Thus, there is a system constraint on the size of the data symbols. These system components, however, do not generally handle the ECC symbols, and there is no such constraint on the length of these symbols.
A prior known system using the longer code encodes 8-bit data symbols as 9-bit symbols by assigning each 8-bit data symbol a first bit that has a predetermined value of, for example, 1. This first, known bit need not be recorded as part of the data symbol, since the bit can be supplied when it is needed by the ECC encoder or decoder. The encoder and the decoder thus annex the predetermined bit to the 8-bit data symbols before these symbols are manipulated to produce ECC symbols. The other system components require only the 8 data bits.
The ECC encoder encodes the 9-bit data symbols and produces 9-bit ECC symbols with first bits that may be either 0's or 1's. Accordingly, these first bits must be retained, and the ECC symbols are recorded as 9-bit symbols.
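The annexing of the known bit can be sketched as follows. Placing the predetermined bit in the most significant position of the 9-bit symbol is an assumption made for the example, as are the function names.

```python
PREDETERMINED_BIT = 1   # the known first bit; the value 1 follows the example above


def annex_bit(data_byte):
    """Form the 9-bit symbol handed to the ECC encoder or decoder."""
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("data symbol must fit in 8 bits")
    return (PREDETERMINED_BIT << 8) | data_byte


def strip_bit(symbol9):
    """Return the 8 data bits that the other system components use."""
    if (symbol9 >> 8) != PREDETERMINED_BIT:
        raise ValueError("first bit of a data symbol must be the predetermined bit")
    return symbol9 & 0xFF
```

Only the 8 data bits of each data symbol need to be recorded, whereas an ECC symbol keeps all 9 bits because its first bit is not known in advance.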
After the data have been encoded into an error-correction code word and before the code word is recorded, the code word symbols are again encoded using a rate b/(b+i) run length limited modulation code. This code manipulates groups of "b" bits and forms for each group a (b+i)-bit "cell," which meets the run length limitations of the code for numbers of consecutive 1's and 0's. The number of consecutive 0's is limited to ensure that the decoder can recover a clock from the stored information, and the number of consecutive 1's is limited essentially to minimize inter-symbol interference. The system concatenates the cells to form a modulation code word that meets the code's run length limitations. This code word is then recorded on the disk, as a series of flux reversals.
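The run length limitations can be checked mechanically. The sketch below tests a bit sequence against a pair of limits; the limits are parameters of the example, since no specific values are given above, and the function names are hypothetical.

```python
def max_run(bits, value):
    """Length of the longest run of consecutive `value` bits."""
    longest = 0
    current = 0
    for bit in bits:
        current = current + 1 if bit == value else 0
        longest = max(longest, current)
    return longest


def meets_run_length_limits(bits, max_zeros, max_ones):
    """True if no run of 0's or 1's exceeds its limit."""
    return max_run(bits, 0) <= max_zeros and max_run(bits, 1) <= max_ones
```

Because the cells are concatenated before recording, the check has to hold across cell boundaries as well as within each cell, so it would be applied to the whole modulation code word.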
A prior known system utilizes an 8/9 rate modulation code to encode 8 bits of the error correction code word into a 9-bit cell for recording. This code encodes an 8-bit data symbol to a 9-bit cell, and a 9-bit ECC symbol to two cells. The first ECC symbol, for example, has 8 of its bits encoded to produce one cell and 1 of its bits encoded along with 7 of the bits of the second ECC symbol to produce a second cell, and so forth.
If the system incorrectly demodulates one of the cells that is associated with a data symbol, the result is a one symbol error in the error-correction code word. If, however, the system incorrectly demodulates a cell that is associated with two ECC symbols, the result is two erroneous error-correction code word symbols. This is what is known as "demodulation error propagation." If too many of the code word symbols are erroneous, the error correction code word becomes uncorrectable.
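The effect can be seen by tracking which code word symbols contribute bits to each cell when 8-bit data symbols and 9-bit ECC symbols are packed into 8-bit groups for an 8/9 rate code. This is a sketch under assumed names and a deliberately tiny code word (2 data symbols, 3 ECC symbols); how a short final group would be padded is left out.

```python
def symbols_per_cell(num_data, num_ecc, group_bits=8, data_bits=8, ecc_bits=9):
    """For each modulation cell, list the code word symbols it draws bits from."""
    owners = []                                   # owners[i] = symbol owning bit i
    for symbol in range(num_data):
        owners += [symbol] * data_bits
    for symbol in range(num_data, num_data + num_ecc):
        owners += [symbol] * ecc_bits
    cells = []
    for start in range(0, len(owners), group_bits):
        cells.append(sorted(set(owners[start:start + group_bits])))
    return cells


# Symbols 0-1 are data symbols; symbols 2-4 are ECC symbols.
cells = symbols_per_cell(num_data=2, num_ecc=3)
straddling = [cell for cell in cells if len(cell) > 1]
```

In this layout every cell built from data bits maps to a single symbol, while some cells built from ECC bits map to two symbols, so a single demodulation error in such a cell corrupts both.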
To solve the demodulation error propagation problem, the system could use a longer modulation code. For example, the system could use a 9/10 rate modulation code, which encodes 9-bit symbols to 10-bit cells. However, this longer code splits all but the first data symbol between two cells and thus has an error propagation problem that is worse than that of the previous system, since the code word contains a greater number of data symbols than ECC symbols. Alternatively, the 9/10 rate modulation code could be used to encode each of the code word symbols separately, producing for each symbol an associated 10-bit cell. It would thus encode an 8-bit data symbol to an associated 10-bit cell, and not split the symbol between two cells. This avoids demodulation error propagation; however, it is an inefficient use of disk space.
Alternatively, two modulation codes could be used--an 8/9 rate code for the data symbols and a 9/10 rate code for the ECC symbols. This solves both the inefficiency and the error propagation problems; however, it requires that the modulation and demodulation circuitry perform two different encoding and decoding operations. The time and complexity involved in performing these two operations make this an unworkable solution.
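The disk space trade-off among these alternatives can be compared with simple arithmetic. The sketch below counts recorded bits for one code word under each scheme; the function names and the example sizes (496 data symbols, 16 ECC symbols) are assumptions chosen for illustration.

```python
def bits_8_9_packed(num_data, num_ecc):
    """8/9 rate code over the packed stream of 8-bit data and 9-bit ECC symbols."""
    payload_bits = 8 * num_data + 9 * num_ecc
    groups = -(-payload_bits // 8)      # ceiling division into 8-bit groups
    return groups * 9


def bits_9_10_per_symbol(num_data, num_ecc):
    """9/10 rate code with every code word symbol given its own 10-bit cell."""
    return 10 * (num_data + num_ecc)


def bits_two_codes(num_data, num_ecc):
    """8/9 rate code for data symbols and 9/10 rate code for ECC symbols."""
    return 9 * num_data + 10 * num_ecc


packed = bits_8_9_packed(496, 16)
per_symbol = bits_9_10_per_symbol(496, 16)
two_codes = bits_two_codes(496, 16)
```

For these assumed sizes, giving every symbol its own 10-bit cell costs roughly a tenth more disk space than the packed 8/9 scheme, while the two-code scheme is about as compact as the packed scheme but needs two modulation codes.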