Data compression is a useful tool for storing and transmitting large amounts of data. For example, the time required to transmit an image, such as a facsimile transmission of a document, is reduced when compression is used to decrease the number of bits required to recreate the image.
Many different data compression techniques exist in the prior art. Compression techniques can be divided into two broad categories: lossy coding and lossless coding. Lossy coding results in the loss of information, such that there is no guarantee of perfect reconstruction of the original data. The goal of lossy compression is that any changes to the original data be made in such a way that they are neither objectionable nor detectable. In lossless compression, all the information is retained and the data is compressed in a manner that allows for perfect reconstruction.
Golomb coding is a lossless coding technique. In Golomb coding, Less Probable Symbols (LPSs) are represented by a `0` followed by the number of preceding More Probable Symbols (MPSs) in binary. MPSs are grouped together and represented by a `1`. The number of MPSs represented together by a single `1` varies between the different Golomb codes, which are identified by a Golomb parameter, and is related to the probability of an LPS. This is an optimal entropy coding of Bernoulli (or other stationary) source data.
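The scheme just described can be sketched as a minimal run-length encoder. The function name, the `mps` parameter, and the omission of end-of-data handling are assumptions for illustration; the Golomb parameter `k` gives an MPS group size of `m = 2**k`.

```python
def golomb_encode(bits, k, mps=1):
    """Run-length encode `bits` with Golomb parameter k (group size m = 2**k).

    A full group of m MPSs is emitted as '1'; an LPS ends the current run
    and is emitted as '0' followed by the run length in k bits.
    (End-of-data flushing is omitted for brevity.)
    """
    m = 1 << k
    out = []
    run = 0  # consecutive MPSs seen since the last codeword
    for b in bits:
        if b == mps:
            run += 1
            if run == m:                       # a full group of m MPSs
                out.append('1')
                run = 0
        else:                                  # an LPS arrived early
            out.append('0' + format(run, '0%db' % k))
            run = 0
    return out
```

For example, with `k = 2` (groups of four MPSs), the input `1,1,1,1,1,1,0,0` produces the codewords `'1'`, `'010'`, `'000'`: a full group, then an LPS after two MPSs, then an immediate LPS.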
Since optimal entropy codes are incompressible, it follows that the `0`s and `1`s in the Golomb code output should occur equally often. The initial bit of each codeword in the Golomb output (referred to herein as the "Stone-age symbol") maps an LPS to `0` and an MPS to `1`, so an excess of `0`s indicates that the LPS is more probable than expected, and vice versa.
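This observation suggests a simple check: tally the initial bit of each codeword and compare the counts. The helper below is hypothetical (it does not appear in any referenced patent) and merely illustrates the idea.

```python
def initial_bit_balance(codewords):
    """Count codeword-initial '0's and '1's.

    Roughly equal counts suggest the current Golomb parameter matches
    the source statistics; an excess of initial '0's suggests the LPS
    is more probable than the code assumes, and vice versa.
    """
    zeros = sum(1 for cw in codewords if cw[0] == '0')
    return zeros, len(codewords) - zeros
```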
Adapting between different codes has been taught in the prior art. One example is U.S. Pat. No. 5,381,145, entitled "Method and Apparatus for Parallel Decoding and Encoding of Data" (Allen et al.), issued Jan. 10, 1995, and assigned to the corporate assignee of the present invention. Allen discloses a probability estimation table and bit stream generator for an R coder. An R coder uses R codes to code data. R codes are adaptive codes which include Golomb runlength codes. The probability estimation state table includes a state counter and a code associated with each of the separate states in the table. The coder is initially in state zero, and after each codeword is processed, the state counter is incremented or decremented depending on the first bit of the codeword. A codeword beginning with `0` increases the magnitude of the state counter, while a codeword beginning with `1` decreases it. Therefore, every codeword causes a change to be made by the state counter.
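The state-counter update described above might be sketched as follows. This is a rough illustration only: the number of states is an assumption, and the mapping from each state to its associated code is patent-specific and omitted.

```python
class ProbabilityEstimator:
    """Sketch of the state-counter adaptation described for the R coder.

    Each state in the estimation table is associated with a particular
    code; that mapping is patent-specific and not reproduced here.
    """

    def __init__(self, n_states=16):
        self.n_states = n_states
        self.state = 0            # the coder is initially in state zero

    def update(self, codeword):
        # Every codeword changes the state counter: a codeword beginning
        # with '0' increases its magnitude, while a codeword beginning
        # with '1' decreases it (clamped to the table bounds).
        if codeword[0] == '0':
            self.state = min(self.n_states - 1, self.state + 1)
        else:
            self.state = max(0, self.state - 1)
```

Because the step size here is a fixed plus or minus one, the estimator ramps between codes at a fixed speed, which is precisely the limitation discussed next.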
Generally, one of the disadvantages of this arrangement is that the speed with which the Golomb parameter ramps is fixed, although alternative embodiments would allow the speed to be downloaded based on experience with training sets. This is disadvantageous in that the coder may not be able to adopt a more optimal code as early in the coding process as the data would allow, resulting in worse compression. Also, the optimal speed may vary often over time, particularly due to differences in the information being coded (e.g., color facsimile statistics will differ from those of text, portraits, and backgrounds).
Allen does state that the coder may start in any one of the states, and that the increments and decrements need not be made by a fixed number. Instead, the probability estimate can be incremented by a variable amount according to the amount of data already encountered or the amount of change in the data (stability). However, the speed at which the coder switches between different codes is not based on whether use of the entropy codes is optimal.
The present invention provides for adaptive coding with an adaptation speed that is neither fixed nor downloaded, but rather is itself set adaptively, based on whether use of the entropy codes is optimal. This adaptive quality may allow for retuning in real time to new optimal speeds for a variety of data types.