Today, data compression is widely used, particularly for storing and transmitting large amounts of data. Many different data compression techniques exist in the prior art. Compression techniques can be divided into two broad categories, lossy coding and lossless coding. Lossy coding involves coding that results in the loss of information, such that there is no guarantee of perfect reconstruction of the original data. In lossless compression, all the information is retained and the data is compressed in a manner which allows for perfect reconstruction.
In lossless compression, input symbols are converted to output codewords. If the compression is successful, the codewords are represented in fewer bits than the input symbols. Lossless coding methods include dictionary methods of coding (e.g., Lempel-Ziv), run length encoding, enumerative coding and entropy coding.
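As an illustration of converting input symbols to codewords with perfect reconstruction, the following is a minimal sketch of run length encoding, one of the lossless methods named above. The function names and the (symbol, run length) pair representation are assumptions chosen for this example and are not taken from any particular prior art system.

```python
from itertools import groupby

def rle_encode(symbols):
    """Collapse each run of identical symbols into a (symbol, run_length) pair."""
    return [(sym, len(list(run))) for sym, run in groupby(symbols)]

def rle_decode(pairs):
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(sym * count for sym, count in pairs)

data = "0000000011100000"
pairs = rle_encode(data)          # three pairs describe sixteen input symbols
assert rle_decode(pairs) == data  # lossless: the input is reconstructed exactly
```

Whether the pairs actually occupy fewer bits than the input depends on how the run lengths are themselves represented, which this sketch leaves open.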
Entropy coding consists of any method of lossless coding which attempts to compress data close to the entropy limit using known or estimated symbol probabilities. Entropy codes include Huffman codes, arithmetic codes and binary entropy codes. Binary entropy coders are lossless coders which act only on binary (yes/no) decisions, often expressed as the most probable symbol (MPS) and the least probable symbol (LPS). Examples of binary entropy coders include IBM's Q-coder and a coder referred to as the B-coder. For more information on the B-coder, see U.S. Pat. No. 5,272,478, entitled "Method and Apparatus for Entropy Coding", (J. D. Allen), issued Dec. 21, 1993, and assigned to the corporate assignee of the present invention. See also M. J. Gormish and J. D. Allen, "Finite State Machine Binary Entropy Coding," abstract in Proc. Data Compression Conference, 30 Mar. 1993, Snowbird, Utah, pg. 449. The B-coder is a binary entropy coder which uses a finite state machine for compression.
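To make the entropy limit concrete for binary decisions, the following sketch computes the Shannon entropy of a single decision from the probability of the MPS, and from it a lower bound on the coded size of a sequence. The function names are assumptions for this example, and the bound uses the empirical probability of the sequence under a memoryless model, which is a simplification of the context-dependent estimates used by the coders described here.

```python
import math

def binary_entropy(p_mps):
    """Entropy, in bits per decision, when the MPS occurs with probability p_mps."""
    if p_mps in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    p_lps = 1.0 - p_mps
    return -(p_mps * math.log2(p_mps) + p_lps * math.log2(p_lps))

def entropy_limit_bits(decisions):
    """Lower bound, in bits, for coding a list of 0/1 decisions (memoryless model)."""
    n = len(decisions)
    ones = sum(decisions)
    p_mps = max(ones, n - ones) / n  # empirical probability of the more frequent symbol
    return n * binary_entropy(p_mps)
```

Under these assumptions, a decision whose MPS has probability 0.5 costs one full bit, while a highly skewed decision costs well under one bit, which is the saving a binary entropy coder tries to capture.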
FIG. 1 shows a block diagram of a prior art compression and decompression system using a binary entropy coder. For coding, data is input into context model (CM) 101. CM 101 translates the input data into a set or sequence of binary decisions and provides the context bin for each decision. Both the sequence of binary decisions and their associated context bins are output from CM 101 to the probability estimation module (PEM) 102. PEM 102 receives each context bin and generates a probability estimate for each binary decision. The actual probability estimate is typically represented by a class, referred to as PClass. Each PClass is used for a range of probabilities. PEM 102 also determines whether the binary decision (result) is or is not in its most probable state (i.e., whether the decision corresponds to the MPS). The bit-stream generator (BG) module 103 receives as inputs the probability estimate (i.e., the PClass) and the determination of whether or not the binary decision was likely. In response, BG module 103 produces a compressed data stream, outputting zero or more bits, to represent the original input data.
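The coding path through the context model and the probability estimation module can be sketched as follows. This is an illustrative toy, not the structure of FIG. 1 itself: the context bin (the previous bit), the count-based estimator, and the PClass boundaries in PCLASS_UPPER_BOUNDS are all assumptions made for this example. The bit-stream generator is omitted; the sketch only produces the (PClass, likely/unlikely) pairs that such a generator would consume.

```python
# Hypothetical PClass ranges: class i covers estimates up to PCLASS_UPPER_BOUNDS[i].
PCLASS_UPPER_BOUNDS = [0.6, 0.75, 0.9, 1.0]

class ProbabilityEstimator:
    """Toy PEM: keeps per-context counts of 0s and 1s (an assumed estimation rule)."""

    def __init__(self, num_contexts):
        self.counts = [[1, 1] for _ in range(num_contexts)]

    def estimate(self, context):
        """Return (mps, pclass) for the given context bin."""
        c0, c1 = self.counts[context]
        mps = 1 if c1 > c0 else 0
        p_mps = max(c0, c1) / (c0 + c1)
        pclass = next(i for i, ub in enumerate(PCLASS_UPPER_BOUNDS) if p_mps <= ub)
        return mps, pclass

    def update(self, context, bit):
        self.counts[context][bit] += 1

def model_decisions(bits):
    """Produce the (pclass, is_likely) pairs a bit-stream generator would receive."""
    pem = ProbabilityEstimator(num_contexts=2)
    context = 0  # assumed context bin: the previous bit, starting at 0
    pairs = []
    for bit in bits:
        mps, pclass = pem.estimate(context)
        pairs.append((pclass, bit == mps))  # was the decision in its likely state?
        pem.update(context, bit)
        context = bit
    return pairs
```

As a run of identical bits continues, the estimate for that context moves into a higher PClass, which is the information a bit-stream generator would use to spend fewer output bits on likely decisions.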
For decoding, CM 104 provides a context bin to PEM 105, and PEM 105 provides the probability class (PClass) to BG module 106 based on the context bin. BG module 106 is coupled to receive the probability class. In response to the probability class and the compressed data, BG module 106 returns a bit representing whether the binary decision (i.e., the event) is in its most probable state. PEM 105 receives the bit, updates the probability estimate based on the received bit, and returns the result to CM 104. CM 104 receives the returned bit and uses the returned bit to generate the original data and update the context bin for the next binary decision.
One problem with decoders using binary entropy codes, such as IBM's Q-coder and the B-coder, is that they are slow, even when implemented in hardware. Their operation requires a single large, slow feedback loop. To restate the decoding process, the context model uses past decoded data to produce a context. The probability estimation module uses the context to produce a probability class. The bit-stream generator uses the probability class and the compressed data to determine if the next bit is the likely or unlikely result. The probability estimation module uses the likely/unlikely result to produce a result bit (and to update the probability estimate for the context). The result bit is used by the context model to update its history of past data. All of these steps are required for decoding a single bit. Because the context model must wait for the result bit to update its history before it can provide the next context, the decoding of the next bit must wait. It is desirable to avoid having to wait for the feedback loop to be completed before decoding the next bit. In other words, it is desirable to decode more than one bit or codeword at a time in order to increase the speed at which compressed data is decoded.
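The serial dependency in this loop can be seen in a small sketch. The context definition (the previous decoded bit) and the count-based estimator are assumptions for this example, and the bit-stream generator is replaced by a stand-in that simply supplies one likely/unlikely flag per decision rather than parsing real compressed data.

```python
def decode(likely_flags):
    """Decode bits one at a time; each iteration is one trip around the feedback loop."""
    counts = {0: [1, 1], 1: [1, 1]}  # assumed PEM state: per-context counts of 0s and 1s
    context = 0                      # assumed context: the previous decoded bit
    decoded = []
    for is_likely in likely_flags:   # stand-in for the bit-stream generator output
        c0, c1 = counts[context]
        mps = 1 if c1 > c0 else 0              # PEM: current MPS for this context
        bit = mps if is_likely else 1 - mps    # PEM: likely/unlikely -> result bit
        counts[context][bit] += 1              # PEM: update the estimate
        decoded.append(bit)                    # CM: record the bit in its history
        context = bit                          # CM: next context depends on this bit
    return decoded
```

The last statement in the loop body is the bottleneck described above: the context for the next decision is not known until the current bit has been fully decoded, so the iterations cannot be overlapped in this arrangement.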
Another problem with decoders using binary entropy codes is that variable length data must be processed. In most systems, the codewords to be decoded have variable lengths. Alternatively, other systems encode variable length symbols (uncoded data). When processing the variable length data, it is necessary to shift the data at the bit level in order to provide the correct next data for the decoding or encoding operation. These bit level manipulations on the data stream can require costly and/or slow hardware and software. Furthermore, prior art systems require this shifting to be done in time critical feedback loops that limit the performance of the decoder. It would be advantageous to be able to allow the bit level manipulation of the data stream to be performed either at the encoder or the decoder for applications where only one of the two operations is cost and/or speed critical. It would also be advantageous to remove the bit level manipulation of the data stream from time critical feedback loops, so that parallelization could be used to increase speed.
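The bit level shifting required by variable length codewords can be illustrated with a small sketch. The prefix code in CODEBOOK and the function names are hypothetical and chosen only for this example; they do not correspond to the codes of any particular coder discussed here.

```python
# Hypothetical variable length prefix code (assumed for illustration only).
CODEBOOK = {"0": "A", "10": "B", "110": "C", "111": "D"}

def read_bit(data, bit_pos):
    """Extract one bit from a byte string, most significant bit first."""
    byte = data[bit_pos >> 3]                    # which byte holds the bit
    return (byte >> (7 - (bit_pos & 7))) & 1     # shift and mask within that byte

def decode_codewords(data, num_symbols):
    """Decode num_symbols codewords; return the symbols and the bits consumed."""
    symbols = []
    bit_pos = 0
    for _ in range(num_symbols):
        codeword = ""
        while codeword not in CODEBOOK:          # grow the codeword one bit at a time
            codeword += str(read_bit(data, bit_pos))
            bit_pos += 1
        symbols.append(CODEBOOK[codeword])
    return symbols, bit_pos

# Bits 0 10 110 111 0, padded with zeros to two bytes: 01011011 10000000.
symbols, used = decode_codewords(bytes([0x5B, 0x80]), 5)
assert symbols == ["A", "B", "C", "D", "A"] and used == 10
```

Because codewords do not align to byte boundaries, the starting bit position of each codeword is known only after the previous codeword has been decoded, so the shift amount is itself part of the serial dependency.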
The present invention provides a lossless compression and decompression system. The present invention also provides a simple decoder which provides fast decoding. The present invention also provides a decoding system which decodes data in parallel and in a pipelined manner. The present invention also provides for decoding a binary entropy coded data stream without having to perform the bit level manipulations of the prior art in the critical feedback loops. The present invention also provides interleaving data for use by multiple coders (lossless or lossy) without using overhead for markers.