This invention relates to a system for encoding binary data words into codewords that satisfy prescribed constraints for transmission and thereafter for decoding the codewords into the original binary data words. In particular, this invention relates to a system of encoding and decoding data which increases information density, minimizes the overall DC component of the transmitted digital code, and minimizes the memory required for the coding table.
In digital transmission systems and in magnetic and optical recording/playback systems, the information to be transmitted or recorded is presented as a bit stream of ones and zeros. In optical and magnetic recording systems, the bit stream written into the device must satisfy certain constraints. A common family of constraints is the (d,k) runlength-limited (RLL) constraints, which specify that each run of zeros between consecutive ones in the bit stream must have a length of at least d and no more than k for the prescribed parameters d and k. Compact disks currently use a code with the constraint (d,k)=(2,10). An example of a sequence satisfying the (2,10) constraint is . . . 00010000000000100100000100 . . . in which the first four runlengths are 3, 10, 2 and 5.
Magnetic recording standards include the (1,7)-RLL constraint and the (1,3)-RLL constraint.
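As an illustration of the (d,k)-RLL definition above, the following is a minimal sketch of a runlength check. The function names `zero_runs` and `satisfies_rll` are hypothetical and do not come from this description. The sketch also assumes that the leading and trailing runs of a finite fragment may be parts of longer runs, so they are tested against k only.

```python
def zero_runs(bits: str) -> list[int]:
    """Lengths of the zero runs that end in a one (a trailing run is left out)."""
    return [len(run) for run in bits.split("1")[:-1]]


def satisfies_rll(bits: str, d: int, k: int) -> bool:
    """Check a finite fragment against the (d,k)-RLL constraint.

    Assumption: the first and last runs may be truncated pieces of longer
    runs, so they are only required not to exceed k.
    """
    pieces = [len(run) for run in bits.split("1")]
    if len(pieces) == 1:  # the fragment contains no one at all
        return pieces[0] <= k
    leading, *interior, trailing = pieces
    return (leading <= k and trailing <= k
            and all(d <= r <= k for r in interior))


# Rebuild the example sequence from its stated runlengths 3, 10, 2 and 5,
# followed by the two trailing zeros shown in the text.
sample = "".join("0" * r + "1" for r in (3, 10, 2, 5)) + "00"

print(zero_runs(sample))             # runlengths of the example
print(satisfies_rll(sample, 2, 10))  # compact disk constraint
print(satisfies_rll(sample, 1, 7))   # fails: the run of 10 exceeds k=7
```

The same check applies unchanged to the (1,7) and (1,3) magnetic recording constraints by changing the arguments d and k.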
The set of all sequences satisfying a given (d,k)-RLL constraint can be described by reading the labels off the paths in the labeled directed graph shown in FIG. 1. The parameter k is imposed to guarantee sufficient sign changes in the recorded waveform, which are required for clock synchronization during read-back. The parameter d is required to prevent inter-symbol interference.
Another type of constraint requires controlling the low frequency or DC content of the input data stream. DC control is used in optical recording to avoid problems such as interference with the servo system and to allow filtering of noise resulting from fingerprints. Information channels are not normally responsive to direct current, and any DC component of the transmitted or recorded signal is likely to be lost. Thus, the DC component of the sequence of symbols should be kept as close to zero as possible, preferably at zero. This can be achieved by requiring the existence of a positive integer B such that any recorded sequence $w_1 w_2 \ldots w_l$, now regarded over the symbol alphabet {+1,-1}, will satisfy the inequality

$$\left| \sum_{h=i}^{j} w_h \right| \le B$$

for every $1 \le i \le j \le l$. Sequences that obey these conditions are said to satisfy the B-charge constraint. The larger the value of B, the less reduction there will be in the DC component.
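The B-charge constraint can be checked without enumerating every window. A minimal sketch follows, assuming the constraint bounds the absolute value of every contiguous partial sum by B; the function names `min_charge_bound` and `satisfies_charge` are hypothetical. Every window sum is a difference of two prefix sums, so the largest absolute window sum equals the spread between the largest and smallest prefix sum.

```python
from itertools import accumulate


def min_charge_bound(w: list[int]) -> int:
    """Smallest B such that every contiguous window of w sums to within [-B, B].

    w is a sequence over the alphabet {+1, -1}. The prefix list starts with 0
    so that windows beginning at the first symbol are covered.
    """
    prefix = [0, *accumulate(w)]
    return max(prefix) - min(prefix)


def satisfies_charge(w: list[int], B: int) -> bool:
    """True if w satisfies the B-charge constraint."""
    return min_charge_bound(w) <= B


print(min_charge_bound([+1, -1, +1, -1]))          # alternating symbols
print(min_charge_bound([+1, +1, -1, -1, -1, +1]))  # window of three -1 symbols
print(satisfies_charge([+1, +1, +1, -1], 2))       # three +1 in a row exceed B=2
```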
However, in certain applications, the charge constraint can be relaxed, thus allowing higher coding rates. In such applications, the DC control may be achieved by using a coding scheme that allows a certain percentage of symbols (on the average) to reverse the polarity of subsequent symbols. Alternatively, DC control may be achieved by allowing a certain percentage of symbols on average to have alternate codewords with a DC component which is lower or of opposite polarity.
DC control and (d,k)-RLL constraints can be combined. In such schemes, the constraint consists of binary sequences $z_1 z_2 z_3 \ldots z_l$ that satisfy the (d,k)-RLL constraint and whose respective NRZI sequences

$$(-1)^{z_1} \;\; (-1)^{z_1+z_2} \;\; (-1)^{z_1+z_2+z_3} \;\; \ldots$$

have a controlled DC component.
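The NRZI mapping above can be sketched in a few lines. The function name `nrzi` is hypothetical. The sketch follows the formula directly: the t-th output symbol is -1 raised to the sum of the first t input bits, so each one in the input reverses the polarity and each zero holds it.

```python
from itertools import accumulate


def nrzi(z: list[int]) -> list[int]:
    """Map bits z_1 z_2 ... to symbols w_t = (-1)^(z_1 + ... + z_t)."""
    return [(-1) ** (s % 2) for s in accumulate(z)]


z = [0, 0, 1, 0, 0, 1, 0]  # a short fragment with runs of two zeros
w = nrzi(z)
print(w)       # polarity flips at each one in z
print(sum(w))  # net DC contribution of this fragment
```

The DC component of the recorded waveform is then governed by the running sums of the output symbols, which is the quantity that a DC control scheme keeps small.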
FIG. 2 shows a functional block diagram of a conventional encoding/decoding system 200. In a typical example of audio data recorded onto a CD, analog audio data from the left and right speakers 202a, 202b of a stereo system are converted into an 8-bit signal, which is input into a data scrambler and error correction code generator whose output 210 is transmitted to an encoder 212 comprising a channel encoder 214 and a parallel-to-serial converter 216. The serial data 220 is written to a compact disk 222. A similar process is used to decode data from the CD. Data 224 from the CD is input into a decoder 230 comprising a serial-to-parallel converter 230 and a channel decoder 232. The data is decoded, input into an error corrector and descrambler 238, and output as audio data 240.
The encoder 212 is a uniquely-decodable (or lossless) mapping of an unconstrained data stream into a constrained sequence. The current standard for encoding compact disk data is eight-to-fourteen modulation (EFM). Using EFM encoding, blocks of 8 data bits are translated into blocks of 14 bits, known as channel bits. EFM uses a lookup table which assigns an unambiguous codeword having a length of 14 bits to each 8-bit data word. High data density can be achieved by choosing 14-bit words whose bit patterns satisfy the (2,10) constraint. Three additional bits, called merge bits, are inserted between the 14-bit codewords. These three bits are selected to ensure that the (2,10) constraint is maintained across codeword boundaries and also to control the low frequency or DC content of the bit stream. The addition of these three merge bits makes the effective rate of this coding scheme 8:17 (not 8:14).
Demands for higher data density are increasing with the advent of multimedia, graphics-intensive computer applications and high-quality digital video programming. A proposal described in the article "EFMPlus: The Coding Format of the MultiMedia Compact Disc", Proc. 16th Symp. on Inform. Theory in the Benelux, Nieuwerkerk a/d Yssel, May 18-19, 1995, describes an encoding/decoding system which increases data density compared to EFM coding. In the system proposed in the EFMPlus article, both the encoder and decoder for constrained data take the form of a finite-state machine. A rate p:q finite-state encoder accepts an input block of p bits and generates a q-bit codeword that depends on the input block and the current state of the encoder. The sequences obtained by concatenating the generated q-bit codewords satisfy the constraint. In optical storage devices, the p-bit input block is typically taken to be an 8-bit byte so that it matches the unit size used in the error-correction scheme.
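To make the notion of a rate p:q finite-state encoder concrete, the following sketch uses a much smaller code than EFMPlus, whose tables are not reproduced in this description. It implements the well-known MFM (Miller) code, viewed as a rate 1:2, two-state encoder for the (1,3)-RLL constraint, where the state is the previous data bit. The function names and the assumed initial state of 0 are illustrative choices only.

```python
from itertools import product


def mfm_encode(data, state=0):
    """Rate 1:2 two-state encoder; `state` is the previous data bit.

    Each data bit yields a 2-bit codeword (clock, bit). The clock bit is 1
    only when both the previous and the current data bit are 0.
    """
    out = []
    for bit in data:
        clock = 1 if (state == 0 and bit == 0) else 0
        out += [clock, bit]
        state = bit  # next state depends on the current input
    return out


def mfm_decode(channel):
    """The data bit is the second bit of every 2-bit codeword."""
    return channel[1::2]


def interior_runs(channel):
    """Lengths of the zero runs between consecutive ones."""
    ones = [i for i, b in enumerate(channel) if b == 1]
    return [b - a - 1 for a, b in zip(ones, ones[1:])]


# Exhaustive check over all 8-bit inputs: every run between ones has
# length 1 to 3, and decoding recovers the input.
for data in product([0, 1], repeat=8):
    code = mfm_encode(data)
    assert all(1 <= r <= 3 for r in interior_runs(code))
    assert mfm_decode(code) == list(data)

print(mfm_encode([1, 0, 1]))
```

A table-driven encoder such as the one described in the EFMPlus article follows the same pattern, with the arithmetic rule replaced by a lookup indexed by the current state and the input block.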
The proposed EFMPlus scheme is a rate 8:16 finite-state encoder for the (2,10)-RLL constraint which increases data density compared to the EFM scheme. The encoder is, however, a more complex four-state encoder, with each state requiring 256+88 sixteen-bit codewords. (The 88 codewords are alternate codewords which are used to control the DC content.)
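The figures quoted above can be compared with a short calculation. This is a rough sketch only: the table size counts raw codeword bits for four states of 256+88 sixteen-bit codewords and assumes that no entries are shared between states and that no next-state information is stored, neither of which is stated in this description.

```python
from fractions import Fraction

# Effective code rates: data bits per channel bit.
efm_rate = Fraction(8, 14 + 3)  # 14-bit codeword plus 3 merge bits
efmplus_rate = Fraction(8, 16)

# Relative gain in data density of rate 8:16 over rate 8:17.
density_gain = efmplus_rate / efm_rate - 1

# Raw size of the EFMPlus encoding table under the stated assumptions.
states, entries_per_state, bits_per_entry = 4, 256 + 88, 16
table_bits = states * entries_per_state * bits_per_entry

print(f"EFM rate      {float(efm_rate):.4f}")
print(f"EFMPlus rate  {float(efmplus_rate):.4f}")
print(f"density gain  {float(density_gain):.2%}")
print(f"table size    {table_bits} bits ({table_bits // 8} bytes)")
```

The gain of one sixteenth in density comes at the cost of a table of this order, which is the trade-off that motivates a scheme with smaller coding tables.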
A method and apparatus of encoding and decoding binary data which increases information density, minimizes the overall DC component of the transmitted digital code, and minimizes the memory required for the encoding and decoding tables is needed.