The invention relates to the field of encoding/decoding a symbol and, more specifically, a method for encoding a symbol comprising a plurality of values, a method for decoding a symbol comprising a plurality of values and being encoded by one or more codewords, a method for transmitting a symbol from a transmitter to a receiver, a computer program for performing the method in accordance with the invention, an encoder, a decoder and a system for transmitting a symbol from a transmitter to a receiver. More specifically, embodiments of the invention relate to a new entropy encoding/decoding method which is based on Huffman coding and uses multi-dimensional codewords to take advantage of statistical dependencies between neighboring symbols and to adapt the codeword length better to symbol probabilities.
Various methods for coding signals are known in the art, for example for coding audio and video signals or for coding processes in a telecommunication environment. Corresponding decoding approaches are also known. For example, in the field of audio coding, AAC/MP3 uses modified (or stacked) Huffman codes according to Henke, Robert, “Simulation eines Audiocodierverfahrens für professionelle Anwendungen”, Diplomarbeit, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen 1992, Brandenburg, Karlheinz, Henke, Robert, “Near-Lossless Coding of High Quality Digital Audio: First Results”, ICASSP-93, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Apr. 27-30, 1993, pages 193-196, and EP 0 393 526 A.
Huffman codes are used to encode the quantized spectral coefficients. Spectral coefficients can be obtained from a time domain signal by means of a filter bank or transformation. In state-of-the-art audio coding, an MDCT is typically used as the transformation (MDCT=modified discrete cosine transformation). For quantization, a scalar quantizer is typically used. In case Huffman codes are used to encode quantized spectral values, one or more quantized spectral values are referred to as a symbol. Symbols mapped to Huffman codes are restricted in the range of values to a largest absolute value (LAV), as is described by Huffman, D. A., “A Method for the Construction of Minimum-Redundancy Codes”, Proceedings of the IRE, September 1952, vol. 40, issue 9, pages 1098-1101. For example, in AAC coding, in case a symbol exceeds the LAV, the symbol is not mapped to a single codeword but to a sequence of two codewords. One of the codewords is the so-called “escape sequence”, which signals the presence of an additional codeword. The second codeword is the so-called “terminating codeword”. On the decoder side, the symbol can only be decoded using all of the codewords from the sequence, namely the escape codeword and the terminating codeword. The terminating codeword is typically run-length coded using a modified Golomb code and signals the difference between the largest absolute value and the value of the coded symbol. The dimensionality of symbols is restricted to a maximum of four, i.e. a maximum of four neighboring spectral coefficients are combined for one symbol. Thus, the dimensionality of a symbol indicates the number of values which are combined into one symbol, for which a codeword is then determined for transmission to a decoder. The escape mechanism is used per spectral coefficient, not per symbol, i.e. in case one spectral coefficient exceeds the LAV and the rest of the spectral coefficients do not, an escape mechanism is used only for the spectral coefficient exceeding the LAV.
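The per-coefficient escape mechanism described above may be illustrated by the following minimal Python sketch. It is not taken from any standard: the function names `split_escape` and `merge_escape`, the default LAV of 16, and the convention that a magnitude equal to the LAV doubles as the escape flag are assumptions made only for illustration. The actual Huffman and Golomb codeword assignment is omitted.

```python
def split_escape(coeffs, lav=16):
    """Split a symbol (tuple of quantized coefficients) into the part that
    fits the codebook and per-coefficient escape remainders.

    Assumption for this sketch: a magnitude equal to `lav` acts as the
    escape flag, and the remainder is the difference between the actual
    magnitude and `lav`.
    """
    clipped, remainders = [], []
    for c in coeffs:
        mag = abs(c)
        if mag >= lav:
            # Only this coefficient is escaped; the others are untouched.
            clipped.append(lav if c > 0 else -lav)
            remainders.append(mag - lav)
        else:
            clipped.append(c)
    return tuple(clipped), remainders


def merge_escape(clipped, remainders, lav=16):
    """Inverse of split_escape: rebuild the original coefficients."""
    rem = iter(remainders)
    out = []
    for v in clipped:
        if abs(v) == lav:
            extra = next(rem)
            out.append(v + extra if v > 0 else v - extra)
        else:
            out.append(v)
    return out


symbol = (3, -20, 0, 16)
clipped, remainders = split_escape(symbol)
restored = merge_escape(clipped, remainders)
```

In this sketch the four-dimensional symbol `(3, -20, 0, 16)` yields one codebook entry plus two remainders, one for each coefficient that reaches the LAV, which mirrors the statement that escaping is applied per coefficient rather than per symbol.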
In the field of video coding, in accordance with the ITU-T video coding specification ITU-T H.263 (01/2005), a combination of a one-dimensional Huffman coding (VLC=Variable Length Coding) and an escape mechanism is used. This mechanism is used to encode quantized DCT (DCT=discrete cosine transformation) coefficients in a similar manner as is done in audio coding approaches. In the field of telecommunications, the ITU-T telefax specification (ITU-T Rec. T.4 (07/2003)) describes the use of modified Huffman codes, i.e. run-lengths are encoded using Huffman coding. In case a run-length exceeds the LAV, a so-called “mark-up-code” is transmitted. By means of these mark-up-codes, integer multiples of 64 can be represented. For run-lengths greater than 63, the next smaller mark-up-code is transmitted. The difference from the original run-length is sent as the terminating codeword.
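The decomposition of a run-length into a multiple of 64 and a terminating remainder may be sketched as follows. The function name `split_run_length` is hypothetical, the actual codeword tables of the telefax specification are not reproduced, and any upper limit on the largest representable multiple of 64 is deliberately not modeled.

```python
def split_run_length(run):
    """Split a run-length into (multiple_of_64, terminating_value).

    A first value of 0 means that no code for a multiple of 64 is needed,
    i.e. the run-length is at most 63 and only the terminating codeword
    would be sent.
    """
    if run < 0:
        raise ValueError("run-length must be non-negative")
    multiples, terminating = divmod(run, 64)
    return multiples * 64, terminating


short_run = split_run_length(63)    # fits without a multiple-of-64 code
long_run = split_run_length(200)    # next smaller multiple of 64 is 192
```

For a run-length of 200 the sketch returns 192 as the next smaller multiple of 64 and 8 as the difference carried by the terminating codeword.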
The above-described approaches of conventional technology based on Huffman coding restrict the dimensionality and the range of values for the symbol to keep memory requirements low. In addition, it is necessary to keep the Huffman codebooks or codeword tables small so that the codewords have a length which does not exceed a predefined limit, so that transmission of the codewords can be done in accordance with preset conditions. In case single values exceed the range of values, escape mechanisms are used for these single symbols.
By restricting the symbol dimensionality the codeword lengths are, in general, not optimal.
For binary Huffman coding, only symbol probabilities p of (1/2)^n can be encoded optimally using Huffman codes, since the resulting codeword length l is restricted to an integer value. If H(p) is the symbol entropy, the following restriction applies: H(p) ≤ l < H(p) + 1. The negative effects of this restriction can be alleviated by increasing the symbol dimensionality to N: 1/N·H(p) ≤ l < H(p) + 1/N. However, especially for low data rates, multi-dimensional symbols having a probability of more than 0.5 may occur, and for such symbols the optimal symbol dimensionality would then be, for example, 16. However, a 16-dimensional table with four values per sub-symbol would need memory to store 4^16 = 4294967296 = 2^32 codewords and codeword lengths, which would have a big impact on memory requirements. Also, the codeword length would exceed an acceptable range for many of the codewords.
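The effect of the symbol dimensionality on the average codeword length per value may be illustrated by the following Python sketch. It assumes statistically independent values with a hypothetical distribution of 0.9 and 0.1, builds a Huffman code over all N-dimensional combinations, and compares the resulting bits per value with the entropy. The helper names `huffman_lengths`, `bits_per_value`, `entropy` and `codebook_entries` are chosen for illustration only.

```python
import heapq
import itertools
import math


def huffman_lengths(probs):
    """Return the Huffman codeword length for each entry of `probs`."""
    if len(probs) == 1:
        return [1]
    # Heap items: (probability, unique tie-breaker, covered symbol indices).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def entropy(p_values):
    """Entropy in bits of a single value."""
    return -sum(p * math.log2(p) for p in p_values if p > 0)


def bits_per_value(p_values, n):
    """Average Huffman code length per value for n-dimensional symbols,
    assuming the n values of a symbol are statistically independent."""
    block_probs = [math.prod(c) for c in itertools.product(p_values, repeat=n)]
    lengths = huffman_lengths(block_probs)
    return sum(p * l for p, l in zip(block_probs, lengths)) / n


def codebook_entries(values_per_subsymbol, n):
    """Number of table entries for an n-dimensional codebook."""
    return values_per_subsymbol ** n


p_values = [0.9, 0.1]  # hypothetical skewed source
h = entropy(p_values)
rates = {n: bits_per_value(p_values, n) for n in (1, 2, 3, 4)}
```

With one value per symbol the code cannot go below 1 bit per value although the entropy is below 0.5 bit, and the rate approaches the entropy as N grows. The price is the table size, which `codebook_entries(4, 16)` evaluates to 4294967296 for the 16-dimensional case mentioned above.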
Multi-symbol code words are beneficial if the symbols to be coded have statistical dependencies. Such statistical dependencies may result, e.g., from the characteristics of the frequency transform and the analysis window used.
For two statistically independent symbols, the conditional probability that b follows a is P(a|b)=P(a)·P(b), resulting in an optimal code length L(a|b)=L(a)+L(b), which is the sum of the optimal code word lengths of the single symbols, whereas for statistically dependent symbols the conditional probability will be different. For example, if there is a high probability that symbol b follows symbol a, then the conditional probability P(a|b)>P(a)·P(b) will be higher than for the statistically independent case and, accordingly, the optimal code length L(a|b)<L(a)+L(b) will be shorter than the sum of the two independent optimal code word lengths L(a) and L(b).
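The relation between dependency and code length may be illustrated numerically with the ideal code length of −log2 P bits for an event of probability P. The following Python sketch uses a hypothetical joint distribution over pairs of the symbols a and b in which b frequently follows a; the probability of the pair, written P(a|b) in the text above, is compared with the product P(a)·P(b). All names and numbers are assumptions for illustration.

```python
import math


def ideal_length(p):
    """Ideal (non-integer) code length in bits for probability p."""
    return -math.log2(p)


# Hypothetical joint distribution of (first symbol, second symbol).
joint = {
    ("a", "a"): 0.1,
    ("a", "b"): 0.4,
    ("b", "a"): 0.4,
    ("b", "b"): 0.1,
}

# Marginal probabilities of "a" in first and "b" in second position.
p_a = sum(p for (first, _), p in joint.items() if first == "a")
p_b = sum(p for (_, second), p in joint.items() if second == "b")

pair_probability = joint[("a", "b")]      # probability that b follows a
independent_product = p_a * p_b           # value under independence

joint_length = ideal_length(pair_probability)
separate_length = ideal_length(p_a) + ideal_length(p_b)
```

Here the pair probability of 0.4 exceeds the product of the marginals, so `joint_length` is smaller than `separate_length`, which corresponds to L(a|b)<L(a)+L(b) for statistically dependent symbols.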
The higher the dimensionality of the code book used, the higher the order of dependent probability P(a|b|c| . . . ) that can be captured.