The present invention relates generally to a method of and apparatus for providing encoding and decoding for either modulation coding or data compression for the storage and communication of digital data. Particularly, the present invention relates to a method of and apparatus for providing modulation coding of digital data for use with magnetic and optical recording and optical fiber communication and to a method of and apparatus for providing data compression for storage of text information in natural or programming languages.
The present invention provides a practical way in which to approach Shannon's theoretical rate limit for constraints on encoded data for modulation coding and for constraints on data to be encoded by data compression, where such constraints are defined by any state transition table. In the case of modulation coding, Shannon's theoretical limit is the capacity of his noiseless constrained channel. See: C.E. Shannon, "Mathematical Theory of Communications," University of Illinois Press, 1963. The present invention provides a particularly effective method and apparatus for modulation coding for any run-length-limiting (d,k) constraints, or any (d,k;c) constraints, as well as for any (d,k) or (d,k;c) constraints used with a synchronization word. The present invention may also be used with simple post modulation error-control codes incorporated into the state transition table of the invention itself.
The use of modulation codes in magnetic and optical data storage systems is widespread and the advantages of modulation coding are well known. For example, in U. S. Pat. No. 3,689,899, a run-length limited variable length code is disclosed.
In magnetic and optical binary recording, a 1 is usually represented by a transition between the up and down states of the recording waveform, while a 0 is represented by no change or transition in the waveform. However, an unconstrained sequence of 1's and 0's is not desirable in practice. For example, physical factors, such as the separation between successive transitions on the recording medium and the gap-width of the magnetic recording head or the wavelength of light in optical recording, determine the minimum allowed number of zeroes (d) between successive ones. On the other hand, a long sequence of uninterrupted 0's results in a loss of synchronization when the data is self-clocking. This synchronization problem determines the maximum number of zeroes (k) between successive ones. Modulation codes satisfying these run length limiting constraints are known as run-length-limited (RLL) modulation (d,k) codes.
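The (d,k) condition described above can be checked mechanically. The following is a minimal sketch only; the function name `satisfies_dk` is hypothetical, and it assumes that the minimum d applies only to runs of zeroes lying between two ones, while the maximum k applies to every run, including leading and trailing runs.

```python
def satisfies_dk(bits, d, k):
    """Return True if the bit sequence obeys the (d,k) run-length limits.

    Assumption: d is enforced only between successive ones; k is enforced
    on every run of zeroes, including runs at the start or end.
    """
    run = 0            # zeroes seen since the last one (or since the start)
    seen_one = False   # becomes True once the first one has been read
    for b in bits:
        if b == 0:
            run += 1
            if run > k:
                return False   # too many zeroes: clock recovery would suffer
        else:
            if seen_one and run < d:
                return False   # ones too close together for the medium
            seen_one = True
            run = 0
    return True


if __name__ == "__main__":
    print(satisfies_dk([1, 0, 0, 1, 0, 0, 0, 1], d=2, k=7))
    print(satisfies_dk([1, 0, 1], d=2, k=7))
```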
Also, an imbalance between the up and down states of the recording waveform may cause a significant charge accumulation in the electronic circuitry, forcing the system beyond acceptable levels of operation. Therefore, the implementation of such run-length-limited codes can require, as an additional constraint, a maximum allowed absolute value (c) of the difference between the number of zeroes and ones in any sequence of bits to be recorded. Run-length-limited (RLL) codes satisfying such additional constraints are known as (d,k;c) codes.
The objective of any modulation coding is to create a one-to-one correspondence between sequences of user data, which are usually streams of binary digits, and constrained binary sequences. The nature of the constraints imposed upon the modulation signal must be determined by the system designer and is dependent upon the particular characteristics of the system under consideration.
If a synchronization word is used, then additional constraints are placed on the modulation code such that the synchronization word cannot occur in successive bits of data unless those bits constitute the synchronization word in its intended location.
If there are no constraints on the minimum run-length of 0's, that is, if d equals 0, then each modulation bit will occupy one interval of length Δ, where Δ is the minimum allowed distance between transitions. Since the number of modulation bits is always greater than the number of data bits, the overall recording density will become less than 1 data bit per Δ. If, on the other hand, d is not equal to 0, then d+1 modulation bits can be packed in every interval of length Δ.
When the code parameters are properly chosen, the higher density of modulation bits can translate into a higher density recording of data bits. In that manner, the overall density can become greater than 1 data bit per Δ. However, the cost of such an increase in capacity is the reduced period of time available to each modulation bit. If t is defined as the time that the read/write head dwells on an interval of length Δ, then the time window available to each modulation bit will be t/(d+1).
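The trade-off just described reduces to simple arithmetic: with d+1 modulation bits per minimum transition interval and a code rate of N/L data bits per modulation bit, the data density is (N/L)(d+1) data bits per interval, while the time window shrinks to t/(d+1). The sketch below illustrates this; the function names are hypothetical, and the rates used are those listed in Table 1.

```python
from fractions import Fraction


def data_density(rate, d):
    """Data bits per minimum transition interval for a code of the given rate."""
    return rate * (d + 1)


def detection_window(t, d):
    """Time available to each modulation bit when the head dwells t per interval."""
    return Fraction(t) / (d + 1)


if __name__ == "__main__":
    # (2,7) code at rate 1/2: more than one data bit per interval,
    # but each modulation bit gets only a third of the dwell time.
    print(data_density(Fraction(1, 2), d=2), detection_window(1, d=2))
    # (0,3) code at rate 8/9: less than one data bit per interval.
    print(data_density(Fraction(8, 9), d=0), detection_window(1, d=0))
```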
Any of the sets of constraints discussed above, or all of them together, and any other modulation constraints can be represented by a corresponding state transition table. A state transition table is an |S| × |B| matrix, where |S| is the number of states of a finite automaton and |B| is the number of possible transition symbols. Each element z_{s,b} of that matrix is either the state to which the automaton goes by the transition symbol b from the state s or is a symbol signifying that the transition described by the symbol b is forbidden; that is, the transition by symbol b from state s is not permitted by the modulation constraints. FIG. 1 shows an example of such a state transition table for (2,5) RLL constraints where a state is equal to the number of zeroes after the last "1" bit.
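A state transition table of this kind for (d,k) constraints can be generated directly from the convention that a state is the number of zeroes after the last "1" bit. The following is an illustrative sketch, not a reproduction of FIG. 1; the names `dk_transition_table` and `FORBIDDEN` are hypothetical, and the use of `None` as the forbidden-transition symbol is an assumption.

```python
FORBIDDEN = None  # assumed marker for a transition the constraints do not permit


def dk_transition_table(d, k):
    """Build the (k+1) x 2 table for (d,k) constraints.

    Row s is the state "s zeroes since the last one"; column b is the
    transition symbol (0 or 1). Each entry is the next state or FORBIDDEN.
    """
    table = []
    for s in range(k + 1):
        on_zero = s + 1 if s < k else FORBIDDEN   # a further zero lengthens the run
        on_one = 0 if s >= d else FORBIDDEN       # a one needs at least d zeroes before it
        table.append([on_zero, on_one])
    return table


if __name__ == "__main__":
    for state, row in enumerate(dk_transition_table(2, 5)):
        print(state, row)
```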
The constrained noiseless channel discussed above has been defined by Shannon as any set of constraints on encoded data (which need not necessarily be binary) characterized by any state transition diagram; the notion of constraints defined by such a transition diagram is equivalent to the notion of constraints defined by the state transition table mentioned above. Shannon defined the capacity of such a channel as the theoretical limit of the rate N/L of coding for those constraints, for an information source block of length N and a codeword of length L → ∞. He also proposed a way in which to calculate that capacity for any state transition diagram. However, prior to the present invention, general methods of modulation coding with nonexponential complexity and a rate approaching that capacity have not been known.
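For a table in which distinct symbols from a state lead to distinct next states, as in the (d,k) case, the capacity equals the base-2 logarithm of the largest eigenvalue of the table's adjacency matrix. The sketch below estimates that eigenvalue by plain power iteration. It is one standard numerical approach, not necessarily the calculation Shannon described; the function names are hypothetical, and the iteration is assumed to converge, which holds for (d,k) tables with d < k.

```python
import math


def dk_table(d, k):
    """(d,k) table: state = zeroes since the last 1; None marks a forbidden symbol."""
    return [[s + 1 if s < k else None, 0 if s >= d else None]
            for s in range(k + 1)]


def capacity(table, iterations=2000):
    """Estimate log2 of the largest eigenvalue of the table's adjacency matrix."""
    v = [1.0] * len(table)
    growth = 1.0
    for _ in range(iterations):
        # w[s] = sum of v over the states reachable from s by one allowed symbol
        w = [sum(v[t] for t in row if t is not None) for row in table]
        total = sum(w)
        growth = total / sum(v)        # growth factor per added code symbol
        v = [x / total for x in w]     # renormalize to avoid overflow
    return math.log2(growth)


if __name__ == "__main__":
    for d, k in [(0, 3), (1, 3), (2, 7)]:
        print((d, k), round(capacity(dk_table(d, k)), 3))
```

The values obtained this way can be compared against the capacity column of Table 1.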
The present invention provides a method of and apparatus for encoding and decoding with only linear complexity with a rate approaching Shannon's capacity for any constrained noiseless channel; that is, for modulation constraints defined by any state transition table or by any state transition function.
While run-length-limited (d,k) codes and (d,k;c) codes in general are known, they are known only for a few (d,k;c) combinations. Some such codes, and probably the most practical ones, are characterized in Table 1. See the Chapter by Arvind M. Patel in C. Denis Mee and Eric D. Daniel's "Magnetic Recording," Vol. II, McGraw-Hill Book Company, 1988, p. 247.
TABLE 1
Rates of some known run-length-limited (d,k) and (d,k;c) codes in comparison to Shannon's information capacities of noiseless channels with corresponding run-length-limiting (d,k) or (d,k;c) constraints.

 d     k     c     Capacity   ≥   Rate
 ---------------------------------------
 0     3     --    0.947      >   8/9
 1     3     --    0.552      >   1/2
 1     3     3     0.500      =   1/2
 1     6     --    0.699      >   2/3
 1     7     --    0.679      >   2/3
 1     7     10    0.668      >   2/3
 2     7     --    0.517      >   1/2
 2     7     8     0.501      >   1/2
 2     8     7     0.503      >   1/2
 3     7     --    0.406      >   2/5
 4     9     --    0.362      >   1/3
 5     17    --    0.337      >   1/3
 ---------------------------------------
Furthermore, only a small number of known (d,k) and (d,k;c) codes have been used commercially. For example, the effective (2,7) RLL code is known and is extensively utilized, but effective (2,7;c) RLL codes are not known. Very few of those utilized (d,k) combinations are optimal. In addition, these known codes from Table 1 are of very different natures. For example, the (2,7) RLL code is defined by a small table of variable-to-variable length coding (but with a fixed rate), whereas the (0,3) RLL code is defined by a large table of fixed-to-fixed length coding. See U.S. Pat. No. 3,689,899, referenced above, and Patel, A.M., "Improved Encoder and Decoder for a Byte-Oriented (0,3) 8/9 Code," IBM Technical Disclosures, Vol.28, 546 (1985). A (1,7) RLL code is defined by two mappings. See Horiguchi, T. and K. Morita, "An Optimization of Modulation Codes in Digital Recording," IEEE Trans. on Magn., MAG-12, 740 (1976).
The absence of a systematic and effective way of finding modulation codes has been a major obstacle to the development of systems of magnetic and optical recording with higher storage densities. That obstacle is overcome by the present invention which, in contrast with the prior art, provides an efficient way of producing modulation encoding for modulation constraints defined by any state transition table with the maximum possible rate for the given modulation requirements. In particular, the present invention provides such modulation encoding and decoding for any (d,k) and (d,k;c) constraints, any (d,k) and (d,k;c) constraints with an arbitrary synchronization word, or for both of those cases using simple error-correcting codes incorporated in the state transition table. The application of the present invention is similar for use in both magnetic and optical recording and in optical fiber communication.
If modulation encoding is used as decoding and modulation decoding is used as encoding, then the method and apparatus of the present invention provide effective data compression for the general case of information source data constraints defined by an arbitrary state transition table, instead of modulation coding for modulation constraints defined by the same state transition table. While in the past general methods of effective data compression for information source constraints have not been known, the present invention discloses general methods of effective data compression for information source constraints defined by any state transition table.
For the objectives described above the present invention utilizes the first known algorithms both for enumeration of any set of words described by constraints defined by any state transition table and for inverse mapping. An enumeration algorithm forms the main part of the modulation decoding algorithm and of the data compression encoding algorithm. Conversely, the inverse mapping algorithm forms the main part of the modulation encoding algorithm and of the data compression decoding algorithm. Algorithms which provide either enumeration or inverse mapping for any given state transition table (or for a broad class of such tables) and a given initial state were not known prior to this invention. Particularly, algorithms which provide such functions for a class of state transition tables corresponding to (d,k) codes were not known prior to the instant invention.
According to one main method of this invention, after the state transition table corresponding to a given set of constraints has been constructed, the block length L of the modulation codewords is determined. The present invention can handle any choice of L, although longer codewords are preferred since they yield a rate which is arbitrarily close to Shannon's noiseless channel capacity for the given set of constraints. The main encoding and decoding algorithms of the present invention are based upon enumeration and mapping blocks of user data of fixed length N to modulation codewords of fixed length L.
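The general idea of enumeration and inverse mapping over a state transition table can be illustrated with the textbook enumerative (ranking and unranking) technique. The sketch below is offered under that assumption only and is not presented as the specific algorithms of the invention. All function names are hypothetical, lexicographic ordering is an arbitrary choice, the initial state is supplied by the caller, and the sketch handles a single codeword without addressing constraints across codeword boundaries.

```python
def dk_table(d, k):
    """(d,k) table: state = zeroes since the last 1; None marks a forbidden symbol."""
    return [[s + 1 if s < k else None, 0 if s >= d else None]
            for s in range(k + 1)]


def build_counts(table, L):
    """counts[n][s] = number of admissible length-n sequences starting in state s."""
    counts = [[1] * len(table)]
    for _ in range(L):
        prev = counts[-1]
        counts.append([sum(prev[t] for t in row if t is not None) for row in table])
    return counts


def encode(index, table, counts, L, state):
    """Inverse mapping: integer index -> the index-th admissible codeword."""
    bits = []
    for remaining in range(L - 1, -1, -1):
        for symbol, nxt in enumerate(table[state]):
            if nxt is None:
                continue                       # symbol forbidden in this state
            if index < counts[remaining][nxt]:
                bits.append(symbol)
                state = nxt
                break
            index -= counts[remaining][nxt]    # skip all words with this symbol here
        else:
            raise ValueError("index out of range")
    return bits


def decode(bits, table, counts, state):
    """Enumeration: admissible codeword -> its index in lexicographic order."""
    index, L = 0, len(bits)
    for i, symbol in enumerate(bits):
        for smaller in range(symbol):
            nxt = table[state][smaller]
            if nxt is not None:
                index += counts[L - i - 1][nxt]
        state = table[state][symbol]
        if state is None:
            raise ValueError("codeword violates the constraints")
    return index


if __name__ == "__main__":
    table, L, start = dk_table(2, 7), 16, 2
    counts = build_counts(table, L)
    total = counts[L][start]
    N = total.bit_length() - 1   # largest N with 2**N <= total
    word = encode(5, table, counts, L, start)
    print(N, total, word, decode(word, table, counts, start))
```

In this sketch any user block of N bits, read as an integer below 2**N, maps to a distinct codeword, and both directions use only table lookups, comparisons, additions and subtractions.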
The instant invention provides many advantages over known methods for encoding and decoding information on optical and magnetic disks. For example, information rates using the present invention, and hence storage densities, are arbitrarily close to capacity (the theoretical rate limit approached by growing lengths of codewords) for practically any set of constraints on allowed encoded sequences. These constraints can be entered as input in the form of a state transition table. Such closeness to capacity is achieved for all sets of constraints using the invention disclosed herein. Any new or additional set of constraints is entered as a new or additional input in the form of a state transition table without changing the method of the present invention.
If synchronization properties are required, then they can be easily and independently introduced with a negligible decrease in the information rate. That can be accomplished by adding a set of new constraints which correspond to the addition of a synchronization word.
Run-length-limiting and practically any other constraints that define modulation codes without error-correcting properties (or with simple error-correcting properties) can be described by state transition tables of size 2|S|, where |S| denotes the number of states. For a (d,k) code, |S| = k+1. An array of size L × |S| × 2, where L is the codeword length, is produced from this table a single time for all codings and decodings corresponding to this table. It is an advantage of the present invention that this is almost the only memory used by the invention for coding and decoding and that it can be stored in ROM. In contrast, the memory required by a lookup code table is much larger, being of the order of 2^L. In addition, the method and apparatus of the present invention use only a few operations per bit of data, both for encoding and decoding, and do not require multiplication or division.
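The difference in memory can be made concrete by counting entries. The sketch below compares the number of array entries with the order of magnitude stated for a lookup code table; the function names are hypothetical, and the width of each entry in bits is deliberately ignored.

```python
def array_entries(L, k):
    """Entries in the L x |S| x 2 array for a (d,k) code, where |S| = k + 1."""
    return L * (k + 1) * 2


def lookup_table_order(L):
    """Order of magnitude of a lookup code table for codewords of length L."""
    return 2 ** L


if __name__ == "__main__":
    L, k = 64, 7
    print(array_entries(L, k), "array entries versus a lookup table of order",
          lookup_table_order(L))
```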
In order to implement the synchronization properties of the present invention, an additional state transition table of size 2|S'|, determined by the properties of a desired synchronization word, can be implemented. That will produce an array of size L × |S| × |S'| × 2, which again need only be produced once for all encodings and decodings that use that synchronization word. If the synchronization word is taken as a sequence of k+1 zeroes for (d,k) codes, then the (d,k) code with such synchronization word becomes a (d,k+1) code.