The present invention relates to entropy encoding and decoding and may be used in applications such as, for example, video and audio compression.
Entropy coding, in general, can be considered as the most generic form of lossless data compression. Lossless compression aims to represent discrete data with fewer bits than needed for the original data representation but without any loss of information. Discrete data can be given in the form of text, graphics, images, video, audio, speech, facsimile, medical data, meteorological data, financial data, or any other form of digital data.
In entropy coding, the specific high-level characteristics of the underlying discrete data source are often neglected. Consequently, any data source is considered to be given as a sequence of source symbols that takes values in a given m-ary alphabet and that is characterized by a corresponding (discrete) probability distribution {p1, . . . , pm}. In these abstract settings, the lower bound of any entropy coding method in terms of expected codeword length in bits per symbol is given by the entropy
\[
H = -\sum_{i=1}^{m} p_i \log_2 p_i . \qquad \text{(A1)}
\]
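As a small illustration of equation (A1), the following sketch computes the entropy of a given probability distribution in bits per symbol. The function name `entropy_bits` and the tolerance used for the sum check are assumptions made for this example only. Terms with zero probability are skipped, following the usual convention that they contribute nothing to the sum.

```python
import math


def entropy_bits(probs):
    """Entropy H = -sum(p_i * log2(p_i)) in bits per symbol, as in (A1)."""
    # Assumed sanity check: the values must form a probability distribution.
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Zero-probability symbols are skipped (0 * log2(0) is taken as 0).
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    print(entropy_bits([0.5, 0.5]))         # uniform binary source
    print(entropy_bits([0.9, 0.05, 0.05]))  # skewed ternary source
```

For a uniform binary source the result is exactly 1 bit per symbol; a skewed distribution yields a smaller value, which is the lower bound any entropy code for that source can approach.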
Huffman codes and arithmetic codes are well-known examples of practical codes capable of approximating the entropy limit (in a certain sense). For a fixed probability distribution, Huffman codes are relatively easy to construct. The most attractive property of Huffman codes is that their implementation can be efficiently realized by the use of variable-length code (VLC) tables. However, when dealing with time-varying source statistics, i.e., changing symbol probabilities, adapting the Huffman code and its corresponding VLC tables is quite demanding, both in terms of algorithmic complexity and in terms of implementation costs. Also, when there is a dominant alphabet value with pk>0.5, the redundancy of the corresponding Huffman code (without using any alphabet extension such as run length coding) may be quite substantial, because every codeword occupies at least one bit even though the dominant value carries less than one bit of information. Another shortcoming of Huffman codes is that higher-order probability modeling may require multiple sets of VLC tables. Arithmetic coding, on the other hand, while being substantially more complex than VLC, offers the advantage of a more consistent and adequate handling of adaptive and higher-order probability modeling as well as of highly skewed probability distributions. This characteristic results from the fact that arithmetic coding provides a mechanism, at least conceptually, to map any given value of a probability estimate in a more or less direct way to a portion of the resulting codeword. Being provided with such an interface, arithmetic coding allows for a clean separation between the tasks of probability modeling and probability estimation, on the one hand, and the actual entropy coding, i.e., the mapping of symbols to codewords, on the other hand.
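The redundancy of a Huffman code for a source with a dominant alphabet value can be made concrete with a short sketch. The code below builds Huffman codeword lengths with a standard priority-queue construction and compares the expected codeword length with the entropy. The function names `huffman_lengths` and `redundancy` and the example distribution {0.9, 0.05, 0.05} are assumptions chosen for illustration; they are not taken from the text above.

```python
import heapq
import math


def huffman_lengths(probs):
    """Return the Huffman codeword length (in bits) for each symbol index."""
    if len(probs) == 1:
        return [1]  # a single symbol still needs one bit per occurrence
    # Heap entries: (probability, unique tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees makes every symbol in them one bit longer.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


def redundancy(probs):
    """Expected Huffman codeword length minus entropy, in bits per symbol."""
    lengths = huffman_lengths(probs)
    expected = sum(p * n for p, n in zip(probs, lengths))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return expected - entropy


if __name__ == "__main__":
    skewed = [0.9, 0.05, 0.05]  # dominant value with probability > 0.5
    print(huffman_lengths(skewed), redundancy(skewed))
```

For this assumed distribution the codeword lengths are 1, 2 and 2 bits, giving an expected length of 1.1 bits per symbol against an entropy of roughly 0.57 bits per symbol, i.e., a redundancy of about 0.53 bits per symbol. For a uniform four-symbol source the same construction has no redundancy.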
An alternative to arithmetic coding and VLC coding is PIPE coding. In PIPE coding, the unit interval is partitioned into a small set of disjoint probability intervals so that the coding process can be pipelined along the probability estimates of random symbol variables. According to this partitioning, an input sequence of discrete source symbols with arbitrary alphabet sizes may be mapped to a sequence of alphabet symbols, and each of the alphabet symbols is assigned to one particular probability interval which is, in turn, encoded by a dedicated entropy encoding process. With each of the intervals being represented by a fixed probability, the probability interval partitioning entropy (PIPE) coding process may be based on the design and application of simple variable-to-variable length codes. The probability modeling can either be fixed or adaptive. However, while PIPE coding is significantly less complex than arithmetic coding, it still has a higher complexity than VLC coding.
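The assignment of symbols to probability intervals described above can be sketched as follows. This is only an illustration under stated assumptions, not the PIPE scheme itself: it assumes binary alphabet symbols, each accompanied by a probability estimate for its less probable value (so that only the range (0, 0.5] needs to be covered), and it uses four hypothetical interval bounds in `BOUNDS`. The per-interval entropy encoders are omitted; the sketch only shows how symbols are routed into one partial sequence per interval.

```python
from bisect import bisect_left

# Hypothetical upper bounds of four disjoint intervals covering (0, 0.5]:
# (0, 0.05], (0.05, 0.15], (0.15, 0.30], (0.30, 0.50]
BOUNDS = [0.05, 0.15, 0.30, 0.50]


def interval_index(p, bounds=BOUNDS):
    """Index of the probability interval that contains the estimate p."""
    if not 0.0 < p <= bounds[-1]:
        raise ValueError("probability estimate outside the partitioned range")
    # bisect_left returns the first interval whose upper bound is >= p.
    return bisect_left(bounds, p)


def route(symbols_with_estimates, bounds=BOUNDS):
    """Split (symbol, estimate) pairs into one partial sequence per interval."""
    partial_sequences = [[] for _ in bounds]
    for symbol, p in symbols_with_estimates:
        partial_sequences[interval_index(p, bounds)].append(symbol)
    return partial_sequences


if __name__ == "__main__":
    stream = [(1, 0.02), (0, 0.40), (1, 0.03), (0, 0.20)]
    print(route(stream))
```

Each partial sequence would then be handed to the encoder dedicated to its interval, which can treat all symbols in it as if they had the single fixed probability representing that interval.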
Therefore, it would be favorable to have an entropy coding scheme at hand which achieves a better tradeoff between coding complexity on the one hand and compression efficiency on the other hand, even when compared to PIPE coding, which already combines advantages of both arithmetic coding and VLC coding.