Information, such as signals representing voice data, video or text, must typically be processed before it can be transmitted over a communications channel or recorded on a medium. First, the information, if not already in digital form, is digitized, for example by an analog-to-digital converter. Next, the digital information may be "compressed" so that it is represented by fewer bits. Any savings due to compression are, however, partially offset by processing the compressed information using error correcting codes. Error correcting codes introduce additional bits to a signal to form an encoded signal. The additional bits improve the ability of a system to recover the signal when the encoded signal has been corrupted by noise introduced by a communications channel or by a recording medium.
A further type of coding used in transmission and recording systems is modulation coding. As with error correcting codes, modulation coding can improve a system's immunity to noise. Modulation codes also can advantageously be used to regulate timing and gain parameters in recording and communications systems.
For example, consider a system which reads information stored on a magnetic medium. In non-return-to-zero-inverse (NRZI) recording, a binary "1" is recorded on a portion of the magnetic medium by causing a change in the magnetization or magnetic flux of that portion of the medium. A binary "0" is recorded by causing no change in magnetization. The bits are read by detecting a sequence of changes in a voltage signal caused by changes in the magnetization of portions of the medium. The voltage signal, however, may be corrupted by noise in the recording system. The voltage signal typically exhibits a pulse each time a "1" is detected and only noise each time a "0" is detected. The position of the pulses is used to set timing parameters in the system, and the height of the pulses is used to set gain parameters in the system. If, however, a long string of zeros is read, there is no voltage output other than noise, and hence no timing or gain information, thereby leading to a loss of, or drift in, timing and gain parameters in the system.
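The NRZI rule described above can be sketched in a few lines of Python. This is only an illustration: the function names `nrzi_encode` and `longest_zero_run`, the choice of +1 as the initial magnetization level, and the sample bit pattern are assumptions made for the example and do not come from the text.

```python
def nrzi_encode(bits, initial_level=1):
    """Return the magnetization level (+1 or -1) written for each bit.

    A 1 flips the level (a flux change); a 0 leaves it unchanged.
    The initial level is an assumption made for this sketch.
    """
    level = initial_level
    levels = []
    for bit in bits:
        if bit == 1:
            level = -level
        levels.append(level)
    return levels


def longest_zero_run(bits):
    """Length of the longest run of consecutive 0 bits."""
    longest = current = 0
    for bit in bits:
        current = current + 1 if bit == 0 else 0
        longest = max(longest, current)
    return longest


bits = [1, 0, 0, 0, 0, 0, 1, 1]
print(nrzi_encode(bits))       # [-1, -1, -1, -1, -1, -1, 1, -1]
print(longest_zero_run(bits))  # 5
```

During the run of five zeros the written level never changes, so a read head would see no pulses from which to recover timing or gain over that stretch.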
Modulation coding thus may be used to ensure that the recording or transmission of a long string of binary zeros is avoided. Modulation coding may be implemented, for example, by dividing digital information that is to be recorded into sets of bits, called information words. Each information word is then used to select a codeword in a codebook. The codewords in the codebook are of length N bits where the codeword bits define a channel sequence, in other words, a sequence of symbols to be sent over a channel. For example, a binary "1" in a codeword may represent the symbol "-1" or negative magnetic flux, and a binary "0" in a codeword may represent the symbol "+1" or positive magnetic flux. If the codewords in the codebook do not contain a long string of zeros, then the selected codewords recorded on the medium will likewise not contain a long string of zeros, thereby obviating the timing and gain control problem.
Additionally, it is often desirable to use channel sequences that have a spectral null at zero (dc) frequency, by which it is meant that the power spectral density function of the channel sequence at dc equals zero. Such sequences are said to be dc-free. One way to assure a dc-free sequence is to design a system in which the block digital sum, that is, the arithmetic sum of the symbols in a codeword transmitted over a channel, is zero. However, efficient (high-rate) modulation codes that are dc-free and that prevent long strings of zeros from occurring without adding an excessive number of redundant bits to the information to be recorded typically require both longer codewords and larger codebooks, as discussed further below.
It is known that the power spectral density function of a channel sequence $x = \ldots, x_{-1}, x_0, x_1, \ldots$ vanishes at zero frequency if and only if its running-digital-sum (RDS), defined as $\mathrm{RDS}_i = \sum_{j=-\infty}^{i} x_j$, is bounded. It is also known, for example, how to translate sequences of symbols from the symbol alphabet of the error-correcting code into channel sequences with bounded RDS by means of dc-free modulation codes, which may be finite-state codes or block codes. Block codes, for example, take blocks of M symbols, called information words, and map them into blocks of N channel symbols, called codewords. Several factors favor the use of block codes. One such factor is limited error propagation: since the symbols used to encode one block are not used in encoding any other block, errors in encoding are typically confined to a particular block. Another factor is ease of implementation. One way to organize the mapping of information words to codewords is to form a codebook or look-up table of $2^M$ codewords and use an M-bit input word to specify or address an N-bit codeword in the codebook. The ratio M/N defines the rate R of the modulation code.
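As a rough illustration of the bounded-RDS condition, the following Python sketch computes the running digital sum of a finite bipolar sequence. The text defines the RDS over an unbounded sequence; starting the sum at the first symbol of a finite list is an assumption of this sketch, as are the names `running_digital_sum` and `rds_span` and the two sample sequences.

```python
from itertools import accumulate


def running_digital_sum(symbols):
    """Partial sums of a finite +1/-1 sequence, starting from its first symbol."""
    return list(accumulate(symbols))


def rds_span(symbols):
    """Difference between the largest and smallest RDS value, counting the
    initial value 0 before any symbol is sent."""
    rds = [0] + running_digital_sum(symbols)
    return max(rds) - min(rds)


# Each 4-symbol block sums to zero, so the RDS keeps returning to 0.
balanced = [+1, -1, -1, +1] * 4
# Each 4-symbol block sums to +2, so the RDS drifts upward.
biased = [+1, +1, -1, +1] * 4

print(running_digital_sum(balanced)[:8])  # [1, 0, -1, 0, 1, 0, -1, 0]
print(rds_span(balanced))                 # 2
print(rds_span(biased))                   # 8
```

Repeating the balanced block any number of times leaves the span at 2, whereas the span of the biased sequence grows with every additional block.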
To ensure that an arbitrary sequence of codewords has a bounded RDS, each codeword $w = w_0, w_1, \ldots, w_{N-1}$ is required to have a block digital sum (BDS), defined as $\mathrm{BDS}(w) = \sum_{i=0}^{N-1} w_i$, equal to zero. Codewords of bipolar symbols, for example, +1 and -1, and having a BDS equal to zero, are possible only if the codeword length N is even and if half the symbols are -1 and half the symbols are +1. The number of such codewords is then equal to $\binom{N}{N/2}$, where $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$. However, at most $2^M$ codewords having a BDS equal to zero can be used to form a codebook for an M/N rate code, where $M = \lfloor \log_2 \binom{N}{N/2} \rfloor$ and where the floor function $\lfloor x \rfloor$ returns the largest integer less than or equal to $x$. The code rate R=M/N indicates that for every M information bits, N channel bits are generated, with $N \geq M$.
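The count of zero-BDS codewords and the resulting value of M can be checked numerically. The sketch below is one way to do so in Python; the function names `dc_free_count` and `max_info_bits` are hypothetical, and the use of `int.bit_length` to take the floor of the base-2 logarithm is simply a convenient exact-integer choice.

```python
from math import comb


def dc_free_count(n):
    """Number of length-n bipolar codewords whose block digital sum is zero."""
    if n % 2:
        return 0  # an odd number of +1/-1 symbols cannot sum to zero
    return comb(n, n // 2)


def max_info_bits(n):
    """Largest M such that 2**M zero-BDS codewords of length n exist."""
    count = dc_free_count(n)
    if count == 0:
        raise ValueError("no zero-BDS codewords exist for odd block lengths")
    # For a positive integer c, c.bit_length() - 1 equals floor(log2(c)).
    return count.bit_length() - 1


for n in (4, 8, 10):
    m = max_info_bits(n)
    print(n, dc_free_count(n), m, 2 ** m)
# 4 6 2 4
# 8 70 6 64
# 10 252 7 128
```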
The above explanation is clarified by a specific example. Consider a sequence having a block length N equal to 4. There are 16 possible sequences, 6 of which are dc-free. With a block length of four bits, however, the value of M equals 2, so the codebook holds only 4 codewords and two of the dc-free codewords will not be used. In some cases, the requirement that $M = \lfloor \log_2 \binom{N}{N/2} \rfloor$ causes a substantial number of dc-free sequences not to be used. For example, if N=8, the number of dc-free sequences is 70, but the codebook is of size 64 and thus 6 dc-free sequences are not used. Similarly, if N=10, there are 252 possible dc-free sequences. The codebook, however, is of size 128. Thus, 124 sequences are discarded, thereby lowering the code rate from approximately 0.8 to 0.7.
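The N=4 case is small enough to enumerate directly. The following Python sketch lists the dc-free words by brute force and uses the first four as a look-up table addressed by 2-bit information words. Which four of the six words are kept, the order in which they are stored, and the names `dc_free_words`, `build_codebook` and `encode` are all arbitrary assumptions made for illustration; the text does not specify them.

```python
from itertools import product


def dc_free_words(n):
    """All length-n words over {+1, -1} whose symbols sum to zero."""
    return [w for w in product((+1, -1), repeat=n) if sum(w) == 0]


def build_codebook(n, m):
    """Keep the first 2**m dc-free words (an arbitrary selection)."""
    words = dc_free_words(n)
    if len(words) < 2 ** m:
        raise ValueError("not enough dc-free words for this rate")
    return words[: 2 ** m]


def encode(bits, codebook, m):
    """Map each m-bit information word to the codeword it addresses."""
    if len(bits) % m:
        raise ValueError("bit count must be a multiple of m")
    symbols = []
    for start in range(0, len(bits), m):
        word = bits[start:start + m]
        index = int("".join(str(b) for b in word), 2)
        symbols.extend(codebook[index])
    return symbols


print(len(dc_free_words(4)))  # 6
codebook = build_codebook(4, 2)
channel = encode([0, 1, 1, 1], codebook, 2)
print(channel)       # [1, -1, 1, -1, -1, 1, 1, -1]
print(sum(channel))  # 0
```

Because every codeword sums to zero, any concatenation of codewords also sums to zero at each block boundary, whichever four words are selected.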
Furthermore, in magnetic recording applications, it is desirable that modulation codes have rates higher than 3/4 so that more information can be written on the recording medium. Codes having a relatively long block length are required for rates above 3/4. Also, large codebooks are required where the codewords in the codebooks are dc-free. For example, a code of rate 11/14 requires a block length of 14 and a codebook size of 2048, and a code of rate 13/16 requires a block length of 16 and a codebook of size 8192. Such large codebooks, however, typically require more complex circuitry and often consume more power and occupy more area on integrated circuits than other elements in the transmission or recording system. Also, the larger the codebook, the more time it takes to access codewords in the codebook. Although some techniques have been proposed to reduce the size of the codebooks, these techniques add complexity and do not substantially reduce the size of the codebooks. Thus, there is a need for a method and apparatus for generating high rate codes that are dc-free and suitable for recording information on a magnetic medium.
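The relationship between rate, block length and codebook size stated above can be tabulated with a short Python sketch. It assumes a codebook built only from zero-BDS codewords of a single even block length, as in the block codes discussed here; the names `block_code_rate` and `shortest_block_length` and the search limit `max_n` are hypothetical.

```python
from fractions import Fraction
from math import comb


def block_code_rate(n):
    """Return (M, M/N) for a zero-BDS block code of even length n."""
    m = comb(n, n // 2).bit_length() - 1  # floor(log2) of the codeword count
    return m, Fraction(m, n)


def shortest_block_length(target_rate, max_n=64):
    """Smallest even N whose rate strictly exceeds target_rate.

    Returns (N, M, codebook size), or None if no N up to max_n qualifies.
    """
    for n in range(2, max_n + 1, 2):
        m, rate = block_code_rate(n)
        if rate > target_rate:
            return n, m, 2 ** m
    return None


print(shortest_block_length(Fraction(3, 4)))  # (14, 11, 2048)
print(shortest_block_length(Fraction(4, 5)))  # (16, 13, 8192)
```

Under these assumptions, block lengths 8 and 12 reach a rate of exactly 3/4 but do not exceed it, so the first rate above 3/4 appears at N=14 with a 2048-entry codebook.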