Data compression is an extremely useful tool for storing and transmitting large amounts of data. For example, the time required to transmit an image, such as a facsimile transmission of a document, is reduced drastically when compression is used to decrease the number of bits required to recreate the image.
In some compression systems, an input file or set of data is translated into a sequence of decisions under the direction of a decision model. Each decision has an associated likelihood, and based on this likelihood, an output code is generated and appended to the compressed file. To implement these encoding systems, compression systems have three parts: a decision model, a probability estimation method and a bit-stream generator. The decision model receives the input data and translates the data into a set of decisions which the compression system uses to encode the data. The decision model is typically referred to as a context model. The probability estimation method is a procedure for developing the probability estimate for the likelihood of each decision. The bit-stream generator performs the final bit-stream encoding to generate the output code which is the compressed data set or compressed file. Compression can effectively occur in either or both the decision model and the bit generator.
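The three-part split described above can be sketched as follows. This is a minimal illustration, not any particular prior-art system: the context choice (the previous bit), the count-based estimator, and all names (`context_model`, `CountEstimator`, `ideal_code_length`) are assumptions made for the example. The bit-stream generator is replaced by a stand-in that sums the ideal cost, -log2(p), of each decision.

```python
from collections import defaultdict
from math import log2


def context_model(bits):
    """Decision model: pair each bit with a context (assumed here: the previous bit)."""
    prev = 0
    for bit in bits:
        yield prev, bit
        prev = bit


class CountEstimator:
    """Probability estimation: adaptive per-context counts, each starting at 1."""

    def __init__(self):
        self.counts = defaultdict(lambda: [1, 1])  # [zeros seen, ones seen]

    def probability(self, context, bit):
        zeros, ones = self.counts[context]
        return (ones if bit else zeros) / (zeros + ones)

    def update(self, context, bit):
        self.counts[context][bit] += 1


def ideal_code_length(bits):
    """Stand-in for the bit-stream generator: total ideal cost in bits."""
    estimator = CountEstimator()
    total = 0.0
    for context, bit in context_model(bits):
        total += -log2(estimator.probability(context, bit))
        estimator.update(context, bit)
    return total
```

For a highly predictable input such as 64 zero bits, the estimator quickly assigns the zero decision a high likelihood, so the ideal cost falls far below 64 bits.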
One compression technique widely employed is arithmetic coding. Arithmetic coding maps a string of data (i.e., a "message") to a code string in such a way that the original message can be recovered from the code string. For a discussion on arithmetic coding, see Glenn G. Langdon, Jr., "An Introduction to Arithmetic Coding," IBM Journal of Research and Development, vol. 28, no. 2 (March 1984). One desirable feature of some prior arithmetic coding systems is that compression is performed in a single sequential pass over the data without a fixed set of statistics to code the data. In this manner, arithmetic coding is adaptive.
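The mapping from a message to a code string can be illustrated with a small adaptive arithmetic coder. The sketch below is an assumption-laden toy, not the method of the cited reference: it uses exact rational arithmetic (`fractions.Fraction`) rather than the finite-precision registers of a practical coder, the code "string" is a single rational number, and the function names are hypothetical. Symbol counts start at 1 and are updated after each symbol, so no fixed statistics are needed and one sequential pass suffices.

```python
from fractions import Fraction


def _intervals(counts):
    """Give each symbol a [low, high) slice of [0, 1) proportional to its count."""
    total = sum(counts.values())
    slices, cum = {}, 0
    for sym in sorted(counts):
        slices[sym] = (Fraction(cum, total), Fraction(cum + counts[sym], total))
        cum += counts[sym]
    return slices


def encode(message, alphabet):
    counts = {s: 1 for s in alphabet}          # adaptive model, no prior statistics
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        lo, hi = _intervals(counts)[sym]
        low, width = low + width * lo, width * (hi - lo)   # narrow the interval
        counts[sym] += 1
    return low + width / 2                     # any point inside the final interval


def decode(code, length, alphabet):
    counts = {s: 1 for s in alphabet}          # decoder rebuilds the same model
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (code - low) / width
        for sym, (lo, hi) in _intervals(counts).items():
            if lo <= target < hi:
                break
        out.append(sym)
        low, width = low + width * lo, width * (hi - lo)
        counts[sym] += 1
    return "".join(out)
```

Because the decoder updates its counts in the same order as the encoder, `decode(encode(msg, alphabet), len(msg), alphabet)` recovers `msg` without any statistics being transmitted.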
A binary arithmetic coder is one type of arithmetic coding system. In a binary arithmetic coding system, the selection of a symbol from a set of symbols can be encoded as a sequence of binary decisions. An example of a binary arithmetic coder is the Q-coder developed by IBM of Armonk, N.Y.
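As an illustration of encoding a symbol selection as a sequence of binary decisions, the sketch below splits an n-bit symbol into bits, most significant first, and pairs each bit with a context formed from the bits already seen. The function name and the context numbering (a leading 1 followed by the prefix bits) are assumptions chosen for the example, not details of the Q-coder.

```python
def binarize(symbol, nbits):
    """Return [(context, bit), ...] for an nbits-wide symbol, MSB first."""
    decisions = []
    context = 1  # leading 1 marks how many prefix bits the context holds
    for shift in range(nbits - 1, -1, -1):
        bit = (symbol >> shift) & 1
        decisions.append((context, bit))
        context = (context << 1) | bit  # extend the prefix with the new bit
    return decisions
```

For example, `binarize(5, 3)` returns `[(1, 1), (3, 0), (6, 1)]`: three binary decisions, each of which a binary coder would code separately.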
Finite state machine (FSM) coders have been used in the prior art to provide efficient entropy coding for single bits with an associated probability estimate. Some of these FSM coders have been implemented as look-up tables (LUTs). For example, see U.S. Pat. Nos. 5,272,478 and 5,363,099. For an example of a finite state machine that performs channel modulation and error correction with entropy coding, see U.S. Pat. No. 5,475,388.
Generally, finite state machines using LUTs are not fast for multi-bit symbols. For instance, if a number between 0 and 7 inclusive is to be coded, the number must be separated into a minimum of three bits, thereby requiring three separate table look-ups. The cumulative effect of the three separate look-ups is to slow down the encoding process. What is needed is a way to avoid multiple table look-ups while encoding multi-bit symbols.
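The look-up count can be made concrete with a toy table-driven state machine. Everything in the sketch is hypothetical: the four-state `BIT_TABLE` has made-up entries and is not a real entropy coder, and composing it into a per-symbol table is shown only as one conceivable way to trade table size for look-ups, not as the approach of any cited patent.

```python
from itertools import product

# Hypothetical per-bit table: (state, bit) -> (next_state, emitted_bits).
# Entries are invented for illustration only.
BIT_TABLE = {
    (s, b): ((2 * s + b) % 4, "1" if (s + b) % 2 else "")
    for s, b in product(range(4), (0, 1))
}


def code_bitwise(state, symbol, nbits=3):
    """Code one symbol a bit at a time; returns (state, output, look-ups used)."""
    out, lookups = "", 0
    for shift in range(nbits - 1, -1, -1):
        state, emitted = BIT_TABLE[(state, (symbol >> shift) & 1)]
        out += emitted
        lookups += 1
    return state, out, lookups


# Compose the per-bit transitions once, ahead of time, into a per-symbol table.
SYMBOL_TABLE = {
    (s, sym): code_bitwise(s, sym)[:2]
    for s, sym in product(range(4), range(8))
}


def code_symbolwise(state, symbol):
    """Code one 3-bit symbol with a single look-up."""
    next_state, out = SYMBOL_TABLE[(state, symbol)]
    return next_state, out, 1
```

Both functions produce the same state and output for every input, but the bitwise version performs three look-ups per symbol while the symbolwise version performs one, at the cost of a table with 32 entries instead of 8.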
Huffman coding provides for m-ary coding in which a multi-bit symbol is encoded and/or decoded as a unit. Huffman coding creates variable-length codes, each of which is an integral (non-fractional) number of bits. In other words, there is no time when the encoder contains information which affects some of the bits that have yet to be output. Symbols with higher probabilities get shorter codes.
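The integral-length property can be checked with a short sketch of the standard Huffman construction. The function names and the example probabilities are assumptions for illustration; the code computes only code lengths, not the codewords themselves.

```python
import heapq
from math import log2


def huffman_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    # Heap entries: (probability, tie_breaker, symbols in this subtree)
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    tie = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1  # each merge adds one bit to every symbol beneath it
        heapq.heappush(heap, (p1 + p2, tie, syms1 + syms2))
        tie += 1
    return lengths


def entropy(probs):
    """Ideal average cost in bits per symbol."""
    return -sum(p * log2(p) for p in probs.values())


def average_length(probs):
    """Average Huffman code length in bits per symbol."""
    lengths = huffman_lengths(probs)
    return sum(probs[s] * lengths[s] for s in probs)
```

With probabilities 0.5, 0.25, 0.125 and 0.125 the lengths are 1, 2, 3 and 3 bits, and the more probable symbols get the shorter codes. With a skewed two-symbol source such as 0.9 and 0.1, each symbol still costs a whole bit even though the entropy is below half a bit per symbol, which is the cost of restricting codes to an integral number of bits.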
Data encoding and decoding are very time-intensive operations. In many systems, the probability estimation is performed using a table. When both the probability estimation and the entropy encoding are implemented as LUTs, separate look-ups are required and cannot be performed in parallel. It is desirable to avoid separate table look-ups where possible, to reduce the time that the prior art requires to perform probability estimation and entropy coding.
In the prior art, entropy coding has typically either been fast but without the best compression (for instance, Huffman coding) or been fully adaptive but slower (for instance, arithmetic coding). It is desirable to speed up such operations while remaining adaptive.
The present invention provides increased speed for entropy coding using a finite state machine coder that is capable of accommodating n-bit inputs. The present invention further provides coding of m-ary symbols, as Huffman coding does, except with a non-integral (fractional) number of bits. The present invention also provides a single table look-up for both probability estimation and bit generation rather than two separate look-ups.