1. Field of the Invention
This invention relates to a system for encoding binary data words into codewords that satisfy prescribed constraints for transmission or storage and thereafter for decoding the codewords into the original binary data words. In particular, this invention relates to a system of encoding and decoding data which permits cascaded decoding for multiple codes.
2. Description of Related Art
In digital transmission systems and in magnetic and optical recording/playback systems, the information to be transmitted or recorded is presented as a bit stream of ones and zeros. In optical and magnetic recording systems, the bit stream written into the device must satisfy certain constraints. A common family of constraints is the (d,k) runlength-limited (RLL) family, which specifies that each run of zeros between consecutive ones in the bit stream must have a length of at least d and no more than k for prescribed parameters d and k. Currently, it is common for a compact disk or DVD (digital versatile disc) to use a code with the constraint (d,k)=(2, 10). An example of a sequence satisfying the (2,10) constraint is . . . 00010000000000100100000100 . . . in which the first four runlengths are 3, 10, 2 and 5. Magnetic recording standards include the (1,7)-RLL constraint, the (1,3)-RLL constraint and the (2,7)-RLL constraint.
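The runlength definition can be illustrated with a minimal Python sketch. The function names `runlengths` and `satisfies_rll` are illustrative and do not come from any standard. The example string is rebuilt from the runlengths quoted in the text, and the sketch assumes that the first and last runs of a finite excerpt are cut off by the excerpt boundaries, so only the upper bound k is checked for them.

```python
def runlengths(bits):
    """Lengths of the zero runs delimited by ones (including both edge runs)."""
    return [len(run) for run in bits.split("1")]


def satisfies_rll(bits, d, k):
    """Check a finite excerpt of a longer stream against the (d,k) constraint.

    Interior runs must lie in [d, k]. The first and last runs may be
    truncated by the excerpt, so only the upper bound k applies to them.
    """
    runs = runlengths(bits)
    interior = runs[1:-1]
    edges = (runs[0], runs[-1])
    return all(d <= r <= k for r in interior) and all(r <= k for r in edges)


# Excerpt rebuilt from the runlengths 3, 10, 2, 5 named in the text,
# followed by the final run of 2 zeros.
example = "1".join("0" * r for r in (3, 10, 2, 5, 2))

print(runlengths(example)[:4])
print(satisfies_rll(example, 2, 10))
print(satisfies_rll(example, 2, 7))
```

The same excerpt fails a (2,7) check because of its run of ten zeros, which shows how the parameter k alone can rule a sequence out.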
The set of all sequences satisfying a given (d,k)-RLL constraint can be described by reading off the labels of paths in the labeled directed graph as shown in FIG. 1. The parameter k is imposed to guarantee sufficient sign changes in the recorded waveform which are required for clock synchronization during read-back. The parameter d is required to minimize inter-symbol interference.
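FIG. 1 is not reproduced in this text, so the following is only a sketch that assumes the usual presentation of the (d,k)-RLL graph: states 0 through k count the zeros emitted since the last one, a 0-labeled edge leads from state s to state s+1 whenever s is less than k, and a 1-labeled edge leads back to state 0 whenever s is at least d. The names `rll_graph` and `words_from` are illustrative.

```python
def rll_graph(d, k):
    """Return {state: [(label, next_state), ...]} for the assumed (d,k) graph."""
    graph = {}
    for s in range(k + 1):
        edges = []
        if s < k:
            edges.append(("0", s + 1))  # one more zero extends the current run
        if s >= d:
            edges.append(("1", 0))      # a one may be written after d or more zeros
        graph[s] = edges
    return graph


def words_from(graph, start, length):
    """All label sequences of the given length read along paths from `start`."""
    paths = [("", start)]
    for _ in range(length):
        paths = [(w + label, nxt) for w, s in paths for label, nxt in graph[s]]
    return sorted(w for w, _ in paths)


# Words of length 3 that may follow a one under the (1,3) constraint.
print(words_from(rll_graph(1, 3), 0, 3))
```

Starting from state 0 models the position immediately after a one, so every listed word begins with at least d zeros.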
Another type of constraint requires controlling the low frequency or DC component of the input data stream. The DC control is used in optical recording to avoid problems such as interference with the servo system and to allow filtering of noise resulting from fingerprints. Information channels are not normally responsive to direct current, and any DC component of the transmitted or recorded signal is likely to be lost. Thus, the DC component of the sequence of symbols should be kept as close to zero as possible, preferably at zero. This can be achieved by requiring the existence of a positive integer B such that any recorded sequence $w_1 w_2 \ldots w_l$, now regarded over the symbol alphabet {+1,-1}, will satisfy the inequality
$$\left| \sum_{h=i}^{j} w_h \right| \le B$$
for every $1 \le i \le j \le l$. Sequences that obey these conditions are said to satisfy the B-charge constraint. The larger the value of B, the less reduction there will be in the DC component.
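As a minimal sketch, the B-charge condition can be tested without enumerating every pair (i, j). The sketch assumes the constraint is the usual bound on the absolute value of every contiguous partial sum of the +1/-1 symbols. Each such sum is a difference of two prefix sums, so the tightest admissible B is the spread between the largest and smallest prefix sum, with the empty prefix (sum 0) included. The function names are illustrative.

```python
from itertools import accumulate


def min_charge_bound(w):
    """Smallest B with |w_i + ... + w_j| <= B for all 1 <= i <= j <= l."""
    prefix = [0, *accumulate(w)]  # prefix[t] = w_1 + ... + w_t, prefix[0] = 0
    return max(prefix) - min(prefix)


def satisfies_charge(w, B):
    """True if the +1/-1 sequence w obeys the B-charge constraint."""
    return min_charge_bound(w) <= B


print(min_charge_bound([1, -1, 1, -1]))
print(min_charge_bound([1, 1, -1, -1, -1]))
```

A strictly alternating sequence needs only B = 1, whereas the second sequence contains a window of three consecutive -1 symbols and therefore needs B = 3.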
However, in certain applications, the charge constraint can be relaxed, thus allowing higher coding rates. In such applications, DC control may be achieved by using a coding scheme that allows a certain percentage of symbols (on average) to reverse the polarity of subsequent symbols. Alternatively, DC control may be achieved by allowing a certain percentage of symbols (on average) to have alternate codewords with a DC component that is lower or of opposite polarity.
DC control and (d,k)-RLL constraints can be combined. In such schemes, the constraint consists of the binary sequences $z_1 z_2 z_3 \ldots z_l$ that satisfy the (d,k)-RLL constraint and whose respective NRZI sequences
$$(-1)^{z_1},\ (-1)^{z_1+z_2},\ (-1)^{z_1+z_2+z_3},\ \ldots$$
have a controlled DC component.
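The NRZI mapping can be sketched as follows: the t-th output symbol is (-1) raised to the sum of the first t input bits, so the output level flips at every one and holds at every zero. The function names and the use of a running sum as a rough indicator of the DC component are illustrative choices, not taken from the text.

```python
from itertools import accumulate


def nrzi(z):
    """Map bits z_1..z_l to symbols w_t = (-1) ** (z_1 + ... + z_t)."""
    out, level = [], 1
    for bit in z:
        if bit:
            level = -level  # a one toggles the recorded level
        out.append(level)
    return out


def running_sum(w):
    """Accumulated sum of the +1/-1 symbols after each position."""
    return list(accumulate(w))


z = [0, 0, 1, 0, 0, 1, 0]
w = nrzi(z)
print(w)
print(running_sum(w))
```

A running sum that stays close to zero indicates a small DC component in the recorded waveform.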
FIG. 2 shows a functional block diagram of a conventional encoding/decoding system 200. In a typical example of audio data recorded onto a CD, analog audio data from the left and right audio inputs 202a, 202b of a stereo system are converted into 8-bit data signals, which are input into a data scrambler and error correction code generator. The output 210 of the generator is transmitted to an encoder 212 comprising a channel encoder 214 and a parallel-to-serial converter 216. The serial data 220 is written to a compact disk 222. A similar process is used to decode data from the CD. Data 224 from the CD is input into a decoder 230 comprising a serial-to-parallel converter 230 and a channel decoder 232. The data is decoded, input into an error corrector and descrambler 238, and output as audio data 240.
The encoder 212 is a uniquely decodable (or lossless) mapping of an unconstrained data stream into a constrained sequence. The current standard for encoding compact disk data is eight-to-fourteen modulation (EFM). Using EFM encoding, blocks of 8 data bits are translated into blocks of 14 bits, known as channel bits. EFM uses a lookup table which assigns an unambiguous 14-bit codeword to each 8-bit data word. By choosing 14-bit words whose bit patterns satisfy the (2, 10) constraint, high data density can be achieved. Three additional bits, called merge bits, are inserted between the 14-bit codewords. These three bits are selected to ensure that the (2, 10) constraint is maintained across codeword boundaries and also to control the low frequency or DC content of the bit stream. The addition of these three merge bits makes the effective rate of this coding scheme 8:17 (not 8:14).
The standard for encoding DVD data is the EFMPlus scheme. (See, for example, K. A. S. Immink, "EFMPlus: The coding format of the multimedia compact disc," IEEE Transactions on Consumer Electronics 41 (1995), pp. 491-497.) Using EFMPlus encoding, blocks of 8 data bits are translated into 16 bits by a four-state finite-state machine that uses a look-up table of size 1,376. By judiciously selecting the codewords in the table and by keeping track of the states, the (2, 10)-RLL constraint is maintained, along with control of the DC content of the output bit stream.
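The two rates can be compared directly. As a worked arithmetic check that uses only the block lengths stated above:

$$R_{\mathrm{EFM}} = \frac{8}{14+3} = \frac{8}{17} \approx 0.4706, \qquad R_{\mathrm{EFMPlus}} = \frac{8}{16} = 0.5, \qquad \frac{R_{\mathrm{EFMPlus}}}{R_{\mathrm{EFM}}} = \frac{17}{16} = 1.0625.$$

Under these figures, EFMPlus carries about 6 percent more data bits per channel bit than EFM.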
Demands for higher data density are increasing with the advent of multimedia, graphics-intensive computer applications and high-quality digital video programming. European Patent Application 96307738.3, Ron M. Roth, entitled "Method and Apparatus for Generating Runlength-limited Coding with DC Control", published May 2, 1997, as EP 0 771 078 A2, describes a lossless coding scheme that maps unconstrained binary sequences into sequences that obey the (d,k)-RLL constraint while offering a degree of DC control. The lossless coding scheme provides a method and apparatus for encoding and decoding binary data which increases information density relative to EFM coding and minimizes the overall DC component of the output constrained sequences. Further, the coding scheme attempts to minimize the memory required for the encoding and decoding tables. Memory size is decreased compared to the EFM and EFMPlus coding schemes. Specifically, in the (2, 10)-RLL case, the table size is only 546 codewords.
In Roth, the channel encoder is a state machine which uses a single "overlapping" table for all states rather than using multiple tables. Recognizing that a subset of codewords in a first state $x_i$ is identical to a subset of codewords in a second state $x_j$, the overlapping encoding table uses identical addresses for the subset of identical codewords in the first and second states. Thus addresses for more than one state may point to a single codeword. A number of input bytes can be encoded into two different codewords which have different parity of ones, thus allowing for DC control. Decoding is carried out in a state-independent manner.
The encoder is a finite-state machine, consisting of four or more states, that maps input blocks to codewords. The encoder design is based on a method of choosing codewords and their sequence using state splitting, state merging and state deletion techniques, such that a single table may be constructed for mapping unconstrained binary sequences into sequences that obey a (d,k) runlength constraint (here with d=2 and k=10 or 12) at a fixed rate (either 8:16 or 8:15). The encoder can achieve DC offset control by choosing between output codewords with opposite "parity."
The main building block of the encoder is a table of codewords that serves all states. It has a simple addressing scheme for selection of a codeword or its opposite parity codeword, which simplifies the address circuitry. Encoding is carried out by prefixing the input block with a fixed number of bits that depend on the current encoder state, and by comparing the input block to thresholds which are specified, as part of the encoder structure, by fixed threshold tables. The result is an address into the table from which the current encoded codeword is taken. Assuming random input, the probability of being at any given state is independent of the previous state, which allows advantage to be taken of the statistical randomness of the data.
The encoder features DC control by allowing a number of input blocks to have two possible encoded codewords. The parity of the number of 1's differs between the two possible codewords, so the respective NRZI sequences end with different polarities, thus allowing the reversal of the polarity of subsequent codewords. The ability to replace codewords with codewords of opposite parity allows control of the accumulating DC offset. Because the "final bits" or "final run" of all codewords and their opposite parity codewords are matched, subsequent encoding is not affected by which one is chosen (a codeword or its opposite parity mate). This facilitates using "look ahead" to optimize DC control. In those cases where DC control is possible, the address of the alternate codeword is obtained by adding a fixed number to the computed address.
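The effect of choosing between a codeword and its opposite parity mate can be sketched as follows. This is a simplified greedy selection, not the look-ahead optimization mentioned in the text, and it is not Roth's actual procedure. The two 16-bit words `CW_ODD` and `CW_EVEN` are hypothetical, made up so that they share the same final run of two zeros and differ in parity. All function names are illustrative.

```python
def nrzi_contribution(codeword, level):
    """Return (sum of NRZI symbols, final level) for a codeword entered at `level`."""
    total = 0
    for bit in codeword:
        if bit == "1":
            level = -level  # each one reverses the polarity
        total += level
    return total, level


def pick_codeword(candidates, rds, level):
    """Greedy choice: the candidate that leaves |running digital sum| smallest."""
    best = min(candidates, key=lambda cw: abs(rds + nrzi_contribution(cw, level)[0]))
    delta, new_level = nrzi_contribution(best, level)
    return best, rds + delta, new_level


# Hypothetical pair with matched final runs: three ones (odd) vs four ones (even).
CW_ODD = "0001000000100100"
CW_EVEN = "0001001000100100"

print(pick_codeword([CW_ODD, CW_EVEN], rds=3, level=1))
print(pick_codeword([CW_ODD, CW_EVEN], rds=-3, level=1))
```

The odd-parity word ends at the opposite polarity from the one at which it was entered, whereas the even-parity word ends at the same polarity, which is what lets the choice steer the accumulated sum back toward zero.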
By using a single "overlapping" table for all states rather than using multiple tables, the Roth encoder and decoder system has many advantages over prior systems. However, the single table used by Roth is specific to a particular coding scheme. In some cases, one may need to encode or decode binary sequences using more than one possible coding scheme. One approach would be to provide a separate encoder or decoder for each scheme. The duplication required with such an approach increases the cost, size, and similar attributes of the encoder or decoder system.
What is needed is a method and apparatus for encoding and decoding binary data that permits encoding and decoding using multiple coding schemes, but that employs single encoding and decoding tables.