This invention relates to entropy encoding and decoding of data and may be applied e.g. in applications for compression of video data, images and/or sound data.
Entropy encoding is a form of lossless data compression. Lossless compression attempts to represent discrete data in fewer bits than the source data occupies, without any loss of information. Lossless compression is also the final step in lossy encoding of multimedia data: a source image, video data or sound data is first converted by lossy methods, and the data thus obtained is then additionally compressed by entropy compression.
Suppose that there is a bit string, and that for every bit thereof a probability $p_i$ of the bit being in a certain state may be calculated. From the theory of information encoding it follows that the optimal method of encoding the state of the current bit is to encode this state by a bit code with a length $L$ equal to the information entropy of the given event [1]:

$$L = -\log_2 p_i \tag{A1}$$
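Formula (A1) can be illustrated by a minimal sketch. The function names `ideal_code_length` and `ideal_total_length` are hypothetical and chosen for illustration only; the sketch merely evaluates (A1) and does not perform any actual encoding.

```python
import math


def ideal_code_length(p: float) -> float:
    """Ideal code length in bits for an event of probability p, per (A1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p)


def ideal_total_length(bits, p_one) -> float:
    """Sum of ideal code lengths over a bit string.

    bits  -- sequence of 0/1 values
    p_one -- sequence of predicted probabilities that each bit equals 1
    """
    total = 0.0
    for bit, p1 in zip(bits, p_one):
        # Probability that the predictor assigned to the state actually seen.
        p = p1 if bit == 1 else 1.0 - p1
        total += ideal_code_length(p)
    return total
```

A state predicted with probability 0.5 costs one bit, and a state predicted with probability 0.25 costs two bits, so a more accurate prediction of the state that actually occurs directly shortens the code.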
Various methods of such encoding exist, of which arithmetic encoding [4] and range encoding [3] are considered the most effective. Thus, the task of optimal encoding may be reduced to the task of optimally predicting the probability that a coded bit will be in a given state.
In the process of lossy compression, source data is transformed by a plurality of methods, such as, for example, discrete cosine transform, Fourier transform, motion compensation between frames, switching of block sizes, and data quantization [5]. Also, the SIF-transformation (steerable interpolation and filtration), as described in [6], may be used for data compression. For the obtained data to be decoded correctly, said data should be written into an output data stream in a strictly defined way. This method of transferring data to the output data stream is uniquely set by the compression method and is called the syntax of the stream. An individual number transferred to the output data stream is called a syntactic element [5].
FIG. 3 shows a typical layout diagram of a known video encoder. Source data is transformed by several different lossy methods. As a result, heterogeneous data is received at an input of a syntactic encoder (305), said heterogeneous data being, e.g., encoder control data (301), quantized transformation results (302), intraframe prediction data (303), and motion vector data (304). The syntactic encoder (305) transforms said heterogeneous data into a single data stream of syntactic elements (306), which is transferred to one or more entropy coders (307) that perform entropy encoding and transfer the compressed data to an output bitstream (308).
FIG. 4 shows a typical decoder of compressed data. An input bitstream (401) is transferred to an input of one or more entropy decoders (402) that transfer decoded data (403) to an input of a syntactic decoder (404). The syntactic decoder selects heterogeneous data from the input bitstream (401), such as decoder control data (405), intraframe transformation data (406), quantized transformation results (407), motion vector data (408). The heterogeneous decoded data thus obtained is used for further decoding of video data.
Syntactic elements may have various numbers of states, as set in the data stream syntax. If the number of states of a current syntactic element is more than 2, then, prior to entropy encoding, said current syntactic element should be transformed into a binary string. This transformation process is called binarization, and it may be performed by a plurality of methods [1], [2], [5], [13]. The efficiency of lossless compression and the computational complexity of the resulting compression method depend directly on the binarization methods used.
One of the binarization methods is unary encoding [1] (see FIG. 1), wherein a positive integer (101) is binarized by a direct method (corresponding to (102)) or an inverse method (corresponding to (103)) into a respective bit string (102, 103) whose number of bits is equal to the value of said integer. This method makes it possible to fully identify dependencies in encoded data, but slows down operation of a codec.
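Unary binarization can be sketched as follows. The exact bit patterns of FIG. 1 are not reproduced in the text, so the sketch assumes the common convention in which the direct form is a run of ones terminated by a zero and the inverse form is a run of zeros terminated by a one; the function names are hypothetical.

```python
def unary_encode(n: int, inverse: bool = False) -> str:
    """Binarize a positive integer n into exactly n bits.

    Assumed convention (not taken from FIG. 1):
      direct : (n - 1) ones followed by a terminating zero
      inverse: (n - 1) zeros followed by a terminating one
    """
    if n < 1:
        raise ValueError("unary code is defined here for positive integers")
    body, stop = ("0", "1") if inverse else ("1", "0")
    return body * (n - 1) + stop


def unary_decode(bits: str, inverse: bool = False) -> int:
    """Recover the integer from the position of the terminating bit."""
    stop = "1" if inverse else "0"
    return bits.index(stop) + 1
```

Because the code length grows linearly with the value, large values produce many binary decisions, which is consistent with the slowdown noted above.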
Another method is binarization with the use of a universal code [1], such as the exponential-Golomb code (exp-Golomb code) [1], wherein a source data stream that may comprise a zero value (104) is transformed into a bit string (105) of variable length. Another possible universal code is the Elias gamma code [1], which may transform natural numbers (104) into a bit string (106) of variable length. The advantages of binarization with the use of universal codes are ease of software implementation and rather high operating efficiency of the codec; however, such binarization may not be optimal for some data types.
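The two universal codes mentioned above can be sketched as follows, assuming the standard definitions: the Elias gamma code of a natural number n is the binary form of n preceded by as many zeros as that binary form has bits minus one, and the order-0 exp-Golomb code of a non-negative integer n is the Elias gamma code of n + 1. The function names are hypothetical, and the bit strings of FIG. 1 are not reproduced here.

```python
def elias_gamma_encode(n: int) -> str:
    """Elias gamma code of a natural number n >= 1."""
    if n < 1:
        raise ValueError("Elias gamma code requires n >= 1")
    binary = bin(n)[2:]                      # binary form without '0b' prefix
    return "0" * (len(binary) - 1) + binary  # zero prefix announces the length


def exp_golomb_encode(n: int) -> str:
    """Order-0 exp-Golomb code of a non-negative integer n >= 0."""
    if n < 0:
        raise ValueError("exp-Golomb code requires n >= 0")
    return elias_gamma_encode(n + 1)         # shift by one so zero is encodable


def elias_gamma_decode(bits: str) -> int:
    """Decode a single Elias gamma codeword from the start of bits."""
    zeros = len(bits) - len(bits.lstrip("0"))     # length of the zero prefix
    return int(bits[zeros:2 * zeros + 1], 2)      # next zeros + 1 bits hold n


def exp_golomb_decode(bits: str) -> int:
    """Decode a single order-0 exp-Golomb codeword from the start of bits."""
    return elias_gamma_decode(bits) - 1
```

The code length grows only logarithmically with the value, which is one reason such codes are fast compared with unary binarization.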
There exists a method of quasi-optimal binarization of syntactic elements which significantly accelerates an entropy codec for a preselected typical data set (FIG. 2). According to this method, a probability (202) is calculated for every state of a syntactic element (201) with the use of the typical data set. Then, a Huffman tree (203) is built [1] for all possible states of the encoded syntactic element, and variable-length codes (204) [1] are calculated, which uniquely set the rules for traversing the tree thus built. The disadvantages of this method are the complexity of its implementation and its non-optimality when encoding data whose statistical distribution differs from that of the typical data set.
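The construction of variable-length codes from state probabilities can be sketched with the textbook Huffman algorithm. The function name `huffman_codes` and the example probabilities below are hypothetical and are not taken from FIG. 2; the sketch only shows the general technique.

```python
import heapq
from itertools import count


def huffman_codes(probabilities: dict) -> dict:
    """Build a prefix-free variable-length code from state probabilities."""
    if not probabilities:
        raise ValueError("at least one state is required")
    if len(probabilities) == 1:
        # Degenerate tree: a single state still needs one bit here.
        return {next(iter(probabilities)): "0"}
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {state: ""})
            for state, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees into one node.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical probabilities measured on a typical data set.
codes = huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
```

With these example probabilities the most probable state receives a one-bit code and the two least probable states receive three-bit codes. If the real data follows a different distribution, the same fixed codes remain in use, which illustrates the non-optimality noted above.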
Usually, a set of syntactic elements to be encoded with the use of entropy compression exhibits complex interdependencies. If these dependencies are taken into account optimally, then the probability of every state of the syntactic element in the formula (A1) may be predicted more accurately, and an increase in compression efficiency may thereby be achieved. Furthermore, the binarization method used directly influences the quality of probability prediction and the overall compression efficiency.
A method for improving the probability prediction is known, wherein the probability of occurrence of a certain state is predicted depending on the values of symbols decoded so far. This methodology is called context statistical modeling [7], [8]. A context model stores in its cells a probability distribution of a coded state depending on some number called the “context” (FIGS. 5, 6). The number of cells in the context model should not be less than the number of possible context states.
When encoding, one cell is selected from a context model (502) with the use of a value of context data (501). A probability stored in a selected cell (503) is used for encoding of a current bit value in input data (504) by an entropy coder (505). Coded data as output data (506) is transferred to an output data stream. After encoding, the probability in the selected cell is updated by a model updater (507) with due regard to the coded data bit value [7], [8], [9].
When decoding, one cell is selected from a context model (602) with the use of a value of context data (601). A probability stored in a selected cell (603) is used for decoding data obtained from input data (604) by an entropy decoder (605). Decoded data as output data (606) is transferred to an output data stream. After decoding, the probability in the selected cell is updated by a model updater (607) with due regard to the decoded bit value [7], [8], [9].
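The cell selection and model update steps described for FIGS. 5 and 6 can be sketched as follows. This is a minimal sketch under stated assumptions: the class `ContextModel`, the function `estimated_cost`, the exponential update rule and its rate of 1/16 are all hypothetical, and the entropy coder itself is replaced by the ideal code length of formula (A1) rather than an actual arithmetic or range coder.

```python
import math


class ContextModel:
    """One cell per context; each cell stores an estimate of P(bit == 1)."""

    def __init__(self, num_contexts: int, rate: float = 1 / 16):
        self.p_one = [0.5] * num_contexts  # start with no knowledge
        self.rate = rate                   # assumed adaptation speed

    def probability(self, context: int, bit: int) -> float:
        """Probability the selected cell assigns to the given bit value."""
        p1 = self.p_one[context]
        return p1 if bit else 1.0 - p1

    def update(self, context: int, bit: int) -> None:
        """Move the selected cell's estimate toward the bit just coded."""
        self.p_one[context] += self.rate * (bit - self.p_one[context])


def estimated_cost(bits, context_of, num_contexts: int) -> float:
    """Ideal coded size in bits when each bit is coded with its cell.

    context_of(history) must depend only on already coded bits, so that
    a decoder can compute the same context and keep its model in step.
    """
    model = ContextModel(num_contexts)
    total = 0.0
    for i, bit in enumerate(bits):
        ctx = context_of(bits[:i])                        # select a cell
        total += -math.log2(model.probability(ctx, bit))  # cost per (A1)
        model.update(ctx, bit)                            # adapt the cell
    return total


# Usage: alternating bits, coded without context and with the
# previous bit as context.
data = [0, 1] * 100
no_context = estimated_cost(data, lambda history: 0, 1)
prev_bit = estimated_cost(data, lambda history: history[-1] if history else 0, 2)
```

For the alternating input in the usage lines, the model that uses the previous bit as context yields a much smaller estimated cost than the single-cell model, because each cell quickly learns a skewed distribution. Since the update uses only the bit just coded, a decoder applying the same update after decoding each bit keeps an identical model.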
The context may be calculated by various methods from earlier decoded data and the location of the decoded bit in the data stream [9], [10], [11], [12], [13], [14], [15]. In order to improve compression efficiency, the context should be calculated from the data that is most correlated with the coded state of the syntactic element. Generally, the larger the context, the higher the accuracy of the probability estimate. However, if a context model is too large and the amount of data is small, the statistics for each cell of the context model do not have enough time to accumulate, and the quality of prediction decreases. Therefore, there is a clear relationship between the optimal size of the context and the amount of coded data. To improve compression efficiency, adaptive switching between context models of different sizes may be used [16], [17].
The closest analogs to the present invention are the technical solutions described in [16], [17]. According to said known technical solutions, switching between context models of different sizes may be performed either when the encoded data stream reaches a certain bit rate, or when an advantage of changing over to a larger context model has been calculated. Switching of the context models when the encoded data stream has reached a certain bit rate is inefficient, since the individual statistics of the coded contexts are not taken into account. The optimal threshold for such switching differs from cell to cell and depends on the statistics accumulated in the context models. In the improved variant of the solutions according to [16], [17], the advantage of using context models of various sizes must be checked continuously during encoding and decoding. This requires additional computational operations both when encoding and when decoding. Furthermore, such a check can be performed only on earlier encoded data rather than on the currently encoded data (since the value of the latter is not known to the decoder), which does not always lead to optimal solutions.
According to the present invention, the criteria for switching between the context models are entirely different from those used in [16], [17], which makes it possible to switch the type of context model used individually for every context and thereby to reduce computational complexity for both the encoder and the decoder.