1. Field of the Invention
The present invention relates to signal processing and, in particular, to computer-implemented processes and apparatuses for signal processing with hybrid variable-length and entropy encoding.
2. Description of the Related Art
This invention relates to signal processing, which is often utilized to compress signals into an encoded bitstream. The signals may represent, for example, video frames, or other types of data such as ASCII text strings. The portion of an encoded bitstream representing compressed data may be stored in a mass storage device in its compressed format in order to conserve storage space. When the compressed data is later retrieved, it may be decompressed and, for example, displayed on a monitor or printed on a printer device.
When the data to be compressed or encoded comprise a set of symbols, the symbols are often not completely randomly distributed. Rather, there is often a statistical probability distribution among the symbols constituting the symbol set. If the symbol set from which up to n possible symbols are drawn is denoted by S, then S={S.sub.0, S.sub.1, . . . , S.sub.n-1 }, where S.sub.i represents the symbol having the probability p.sub.i. In this usage, it is assumed that these probabilities are ordered in monotonically decreasing fashion so that, for example, S.sub.0 is the most probable symbol of S and S.sub.n-1 is the least probable symbol of S. It is also assumed in such usages that the occurrences of these symbols are uncorrelated, i.e. they come from a memoryless source.
There are currently two major methods of encoding these symbols: entropy encoding and variable-length coding (VLC). Entropy encoding methods are methods that approach the theoretical entropy limit in efficiency, such as arithmetic encoding. Arithmetic encoding approaches optimal encoding efficiency, but is complex to implement. VLC encoders such as Huffman encoders are simpler to implement but less efficient in some cases. Because of this simplicity, VLC is often used, and it usually works very well. In VLC techniques, at least one bit is needed in the bitstream to represent each symbol, whereas only fractions of bits may be required in some instances to represent some symbols in entropy encoding techniques. Such encoding techniques are described in the paper Ian H. Witten, Radford M. Neal, and John G. Cleary, "Arithmetic Coding for Data Compression," Communications of the ACM, Vol. 30, No. 6, pp. 520-540 (June 1987), the entirety of which is incorporated herein by reference. Further such techniques are described in William B. Pennebaker & Joan L. Mitchell, JPEG: Still Image Data Compression Standard (New York: Van Nostrand Reinhold, 1993), chapters 8 and 11-12 of which are incorporated herein by reference.
As explained above, in typical cases VLC encoding works very well, and produces an encoded bitstream that is nearly as small, within a few percent, as the encoded bitstream that would be produced by utilizing the more complex technique of entropy encoding. However, when one of the symbols of a symbol set (i.e. S.sub.0) is dominant, e.g. the probability of S.sub.0 is much larger than 0.5, then VLC is significantly less efficient than entropy encoding. That is, in this case, the average number of bits per symbol required in the encoded bitstream is significantly higher for VLC than for entropy encoding. The reason for this result lies in the fact that in VLC, at least one bit is needed to represent even the most dominant symbol S.sub.0, whereas only a fraction of a bit is needed to represent the dominant symbol in entropy encoding when its probability p.sub.0 is greater than 0.5.
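The fractional-bit argument above can be illustrated with a minimal Python sketch. It assumes that the ideal cost of a symbol with probability p is its information content, -log2(p) bits, and compares that cost with the one-bit minimum of a VLC codeword. The names ideal_bits, vlc_excess_bits, and VLC_MIN_BITS are hypothetical and chosen only for this illustration.

```python
import math

VLC_MIN_BITS = 1  # assumption from the text: a VLC codeword is at least one bit long


def ideal_bits(p):
    """Information content, in bits, of a symbol with probability p (0 < p <= 1)."""
    return -math.log2(p)


def vlc_excess_bits(p):
    """Lower bound on the bits a VLC spends beyond the ideal cost of a symbol."""
    return max(0.0, VLC_MIN_BITS - ideal_bits(p))


if __name__ == "__main__":
    for p in (0.5, 0.75, 0.9, 0.99):
        print(f"p={p}: ideal={ideal_bits(p):.3f} bits, "
              f"VLC excess>={vlc_excess_bits(p):.3f} bits")
```

Under these assumptions the excess is zero at p = 0.5 and grows as the dominant symbol becomes more probable, which is the inefficiency the passage describes.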
For example, assume that a symbol set S is approximately distributed in accordance with a geometric series of probability, of the form p.sub.i =(1-r)*r.sup.i, where r=0.1, so that p.sub.0 =0.9, p.sub.1 =0.09, p.sub.2 =0.009, p.sub.3 =0.0009, . . . where p.sub.i is the probability of symbol S.sub.i of S. In this case, the average number of bits per symbol for a Huffman type of VLC scheme is 1.111, whereas that for entropy encoding is only 0.521. (However, where no symbol is dominant, Huffman and entropy encoding produce more similar results. For example, assume that S is approximately distributed in accordance with another geometric series of probability, so that p.sub.0 =0.1, p.sub.1 =0.09, p.sub.2 =0.081, p.sub.3 =0.0729, . . . . In this case the average number of bits per symbol for Huffman encoding is 4.725, and that for entropy encoding is 4.690.)
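The figures quoted above can be checked approximately with the following Python sketch. It is not taken from the source: it assumes the infinite geometric series is truncated once a probability falls below a small tolerance and then renormalized, and it obtains the Huffman average length by summing the weights of the merged nodes rather than by building explicit codewords. The function names and the tolerance tol are hypothetical.

```python
import heapq
import math


def geometric_probs(r, tol=1e-15):
    """Return p_i = (1 - r) * r**i for i = 0, 1, ... while p_i >= tol, renormalized."""
    probs = []
    p = 1.0 - r
    while p >= tol:
        probs.append(p)
        p *= r
    total = sum(probs)
    return [q / total for q in probs]


def entropy_bits(probs):
    """Entropy of the distribution in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs)


def huffman_average_length(probs):
    """Average Huffman codeword length in bits per symbol.

    Each merge of the two least probable nodes adds one bit to every symbol
    beneath the merged node, so the sum of merged weights is the average length.
    """
    heap = list(probs)
    heapq.heapify(heap)
    average = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        average += merged
        heapq.heappush(heap, merged)
    return average


if __name__ == "__main__":
    for r in (0.1, 0.9):  # p_0 = 0.9 (dominant symbol) and p_0 = 0.1 (no dominant symbol)
        probs = geometric_probs(r)
        print(f"r={r}: Huffman={huffman_average_length(probs):.3f}, "
              f"entropy={entropy_bits(probs):.3f}")
```

With r = 0.1 the sketch should give values close to 1.111 and 0.521 bits per symbol, and with r = 0.9 values close to 4.725 and 4.690, matching the example to within the truncation error.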
There is thus a need for methods and apparatuses for processing signals that retain the simplicity of VLC techniques while minimizing the VLC encoding inefficiency.