1. Field of the Invention
This invention relates to digital data representation and, more particularly, to coding schemes that can be used to represent sequences of binary symbols for data storage and data transmission systems.
2. Description of the Related Art
The design of communication and storage systems requires a choice of modulation and coding. In conventional storage and communication systems, the most commonly used modulation scheme is called Non Return to Zero (NRZ). Systems based on NRZ modulation use special codes to represent data. The most commonly used codes are called Run Length Limited (RLL) codes.
A. NRZ (Non Return to Zero) Modulation
In data storage systems and data communication systems, data is stored or transmitted through the modulation of some physical quantity. In the context of data storage in magnetic media, the modulated quantity is the polarity of bar magnets that are created on the magnetic media by a write head. For data storage in recordable optical media, the quantity that is modulated to store data is the disc reflectivity. In the context of fiber optic modulation, the quantity that is modulated in order to transmit data is the intensity of a laser beam. In the context of wireless communication, the quantity that is modulated to transmit data is the frequency, amplitude, or phase of a radio frequency (RF) signal. In wire line communication, the physical quantity that is modulated to transmit data is the voltage of a signal.
For purposes of storage and communication in the above contexts, different modulation schemes can be used. The most commonly used modulation scheme is called Non Return to Zero (NRZ). The salient feature of NRZ is that the signal is binary (has two states) and state transitions can occur only at regular periods in time. Thus, the time between any two transitions is always an integer multiple of some time constant, as is illustrated in FIG. 1a. The modulated signal waveform 102 in FIG. 1a is shown with the horizontal axis 104 representing time and the vertical axis 106 representing the modulated physical quantity. NRZ signals can be used to represent binary data using two different conventions. In the first convention, one of the two states corresponds to a logical zero and the other corresponds to a logical one. This is illustrated in FIG. 1b. In the second convention, a state transition is used to represent a logical one and the absence of a state transition is used to represent a logical zero. This is illustrated in FIG. 1c. It should be noted that all binary signals can be uniquely represented by NRZ signals using either convention. In the rest of this document, the latter convention will be used.
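By way of illustration only, the two NRZ conventions described above can be sketched in Python as follows. The function names, the use of 0 and 1 to denote the two signal states, and the assumed initial state of 0 are hypothetical choices made for this sketch and are not taken from the figures.

```python
def nrz_level(bits):
    """First convention: each bit selects one of the two signal states directly."""
    return [1 if b else 0 for b in bits]


def nrz_transition(bits, initial=0):
    """Second convention: a one toggles the state, a zero leaves it unchanged."""
    state = initial
    levels = []
    for b in bits:
        if b:
            state ^= 1  # a logical one is a state transition
        levels.append(state)
    return levels


def decode_transition(levels, initial=0):
    """Recover the bits by comparing each state with the previous one."""
    previous = initial
    bits = []
    for s in levels:
        bits.append(1 if s != previous else 0)
        previous = s
    return bits
```

Under the second convention, decoding depends only on whether the state changed from one clock period to the next, so the original bits are recovered regardless of which of the two states the signal happened to start in, provided the decoder is given the same initial state.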
B. RLL (Run Length Limited) Codes
Probably the most commonly used RLL coding scheme is referred to as the (d, k) coding scheme, described further below. As noted above, coding schemes are needed for communication and storage systems that use NRZ modulation, which are based on clocked circuits. In order to map the modulated quantity to binary data, one needs to sample the modulated quantity at regular periodic intervals in time. An internal clock determines these sampling points. Due to practical limitations, the internal clock usually has some error (in clock period). The clock error causes the difference between the points in time at which the signal is to be sampled and the points in time at which the signal is actually sampled to increase with time. This phenomenon is referred to as clock drift. This problem is typically accommodated by means of a phase-locked loop that resynchronizes the internal clock with the NRZ modulated signal every time there is a transition in the modulated quantity.
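The growth of the sampling error with time can be made concrete with a simplified model, sketched below in Python. The model is an assumption made for illustration: the clock period is taken to be off by a constant relative error, the signal is taken to be sampled at the center of each clock period, and a sample is considered unreliable once the accumulated offset exceeds half a period. The function names and the half-period tolerance are hypothetical.

```python
from fractions import Fraction


def drift_after(n_periods, rel_error):
    """Accumulated sampling offset, in units of the nominal clock period."""
    return n_periods * abs(rel_error)


def periods_until_missample(rel_error, tolerance=Fraction(1, 2)):
    """Smallest number of periods after which the offset exceeds the tolerance.

    Returns None for a perfect clock, which never drifts in this model.
    """
    rel_error = abs(Fraction(rel_error))
    if rel_error == 0:
        return None
    # Largest whole number of periods within tolerance, plus one more.
    return int(tolerance // rel_error) + 1
```

In this model, a clock period error of one percent is tolerable for 50 periods without resynchronization and fails on the 51st, which suggests why a transition must occur within some bounded number of clock periods.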
To ensure proper functioning of the clock circuit, constraints are placed on the binary sequence in order to ensure that there will be at least one transition within some fixed period of time. Such constraints can be characterized as limiting the maximum number of zeros (represented by “k” in the (d, k) coding scheme) between any two adjacent ones. Other engineering constraints may also force a limitation on the minimum number of zeros (represented by “d”) between two adjacent ones. The RLL codes are one coding mechanism for mapping arbitrary binary sequences to longer sequences that satisfy the constraints mentioned above in a unique and efficient manner. Because the mapping (encoding) is unique, an inverse of the code is used for decoding. In this way, the original binary data can be recovered.
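The (d, k) constraint described above can be checked mechanically. The following Python sketch is illustrative only; the function name is hypothetical, and the sketch assumes that only runs of zeros lying between two adjacent ones are subject to the constraint, so that leading and trailing zeros are ignored.

```python
def satisfies_dk(bits, d, k):
    """Return True if every run of zeros between adjacent ones has length d..k."""
    one_positions = [i for i, b in enumerate(bits) if b == 1]
    for left, right in zip(one_positions, one_positions[1:]):
        zeros_between = right - left - 1
        if zeros_between < d or zeros_between > k:
            return False
    return True
```

For example, `satisfies_dk([1, 0, 0, 1], 2, 7)` holds, whereas a sequence containing two adjacent ones, or two ones separated by eight zeros, violates the (2, 7) constraint.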
One scheme used in the context of magnetic data storage is referred to as a (2, 7) RLL code, because the coding scheme ensures that the one bits in an unconstrained (input) signal are mapped to a sequence in which adjacent one bits are separated by at least two, but no more than seven, consecutive zero bits. Thus, the set of permissible time intervals from one signal transition to the next for the (2, 7) code is the code symbol set S of consecutive integer multiples of a clock period, where S is specified by Table 1:
TABLE 1
S = {3, 4, 5, 6, 7, 8}
In other words, the minimum number of time intervals from one transition to the next is three, which occurs where data is encoded as the sequence “1001”, and the maximum number of time intervals permitted is eight, which occurs where data is encoded as “100000001”. A data device employing a (2, 7) code will map the unconstrained (not yet encoded) bits of a data packet to a bit sequence that is constrained in accordance with the (2, 7) code.
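The relationship between the (d, k) parameters and the symbol set S can be sketched in Python as follows. The function names are hypothetical, and the constrained sequence is assumed to be given as a string of the characters "0" and "1" in which each "1" marks a signal transition.

```python
def symbol_set(d, k):
    """Permissible transition-to-transition intervals, in clock periods."""
    # d zeros between ones gives an interval of d + 1; k zeros gives k + 1.
    return set(range(d + 1, k + 2))


def transition_intervals(sequence):
    """Number of clock periods between each pair of adjacent transitions."""
    one_positions = [i for i, ch in enumerate(sequence) if ch == "1"]
    return [right - left for left, right in zip(one_positions, one_positions[1:])]
```

With d = 2 and k = 7, `symbol_set` yields the integers 3 through 8, and `transition_intervals` applied to "1001" and "100000001" yields the single intervals 3 and 8, respectively, which are the minimum and maximum members of S.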
All the factors mentioned above influence the design of data transmission systems like optical and wireless transceivers and data storage systems like magnetic and optical disk drives. Most RLL codes map a block of data bits into a fixed number of constrained data bits, such as mapping all combinations of three unconstrained bits into blocks of five bits, comprising a fixed-length-to-fixed-length code. Other known coding schemes provide a variable length to fixed block code, or block to variable length code.
The RLL coding schemes typically under-utilize available bandwidth in most data channels in use today. The RLL coding scheme recognizes that if two consecutive 1's in an input (unconstrained) sequence are encoded into a constrained sequence in which the 1's are too close together, the two 1's will merge and will be read or interpreted as a single one bit rather than two.
The conventional specification of RLL codes establishes the (d, k) parameters so as to stop coding the constrained sequence at the first sign of ambiguity. For example, the (2, 7) RLL code is established in view of the fact that, in the data channel for which the (2, 7) RLL code is intended, seven consecutive zero bits can be resolved successfully, but eight bits cannot. That is, the conventional (2, 7) RLL code is specified in terms of a resolution ambiguity parameter that permits consecutive 0's in the constrained sequence only until the first ambiguity in resolving the zero bits is reached. This is the k value in the (d, k) scheme. The (2, 7) RLL code and similar coding schemes cannot take advantage of the fact that, even if eight consecutive zero bits cannot be resolved successfully, it might be possible that nine or ten or other numbers of consecutive bits greater than eight could be resolved successfully. Thus, there is additional unused capacity in the data channel that cannot be exploited by the conventional coding schemes.
From the discussion above, it should be apparent that there is a need for coding schemes that more fully utilize capacity in a data channel. The present invention satisfies this need.