This application is a divisional of U.S. patent application Ser. No. 12/324,810, filed Nov. 26, 2008, which is a divisional of U.S. patent application Ser. No. 10/769,403, filed Jan. 30, 2004, now U.S. Pat. No. 7,599,435, all of which are incorporated herein in their entirety by this reference thereto.
I. Technical Field of the Invention
The present invention is related to video frame coding and, in particular, to an arithmetic coding scheme using context assignment based on neighboring syntax elements.
II. Description of the Prior Art
Entropy coders map an input bit stream of binarizations of data values to an output bit stream, the output bit stream being compressed relative to the input bit stream, i.e., consisting of fewer bits than the input bit stream. This data compression is achieved by exploiting the redundancy in the information contained in the input bit stream.
Entropy coding is used in video coding applications. Natural camera-view video signals show non-stationary statistical behavior. The statistics of these signals largely depend on the video content and the acquisition process. Traditional concepts of video coding that rely on mapping from the video signal to a bit stream of variable length-coded syntax elements exploit some of the non-stationary characteristics but certainly not all of them. Moreover, higher-order statistical dependencies on a syntax element level are mostly neglected in existing video coding schemes. Designing an entropy coding scheme for a video coder by taking into consideration these typically observed statistical properties, however, offers significant improvements in coding efficiency.
Entropy coding in today's hybrid block-based video coding standards such as MPEG-2 and MPEG-4 is generally based on fixed tables of variable length codes (VLC). For coding the residual data in these video coding standards, a block of transform coefficient levels is first mapped into a one-dimensional list using an inverse scanning pattern. This list of transform coefficient levels is then coded using a combination of run-length and variable length coding. The set of fixed VLC tables does not allow an adaptation to the actual symbol statistics, which may vary over space and time as well as for different source material and coding conditions. Finally, since there is a fixed assignment of VLC tables and syntax elements, existing inter-symbol redundancies cannot be exploited within these coding schemes.
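The run-length step described above may be illustrated by the following minimal sketch. It assumes that the block has already been scanned into a one-dimensional list and that each non-zero level is represented together with the number of zeros preceding it; the function name run_level_pairs and the omission of trailing zeros are assumptions made for illustration only and are not taken from any particular standard.

```python
def run_level_pairs(levels):
    """Convert a scanned list of coefficient levels into (run, level) pairs.

    run   = number of zero levels preceding a non-zero level
    level = the non-zero level itself
    Trailing zeros are dropped (assumption: an end-of-block marker would
    signal them in a real coder).
    """
    pairs = []
    run = 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


scanned = [7, 0, 0, -2, 1, 0, 0, 0]
print(run_level_pairs(scanned))  # [(0, 7), (2, -2), (0, 1)]
```

In a scheme with fixed VLC tables, each such pair would then be looked up in a table that does not change with the actual symbol statistics.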
It is known that this deficiency of Huffman codes can be resolved by arithmetic codes. In arithmetic codes, each symbol is associated with a respective probability value, the probability values for all symbols defining a probability estimation. A code word is coded in an arithmetic code bit stream by dividing an actual probability interval, on the basis of the probability estimation, into several sub-intervals, each sub-interval being associated with a possible symbol, and reducing the actual probability interval to the sub-interval associated with the symbol of the data value to be coded. The arithmetic code defines the resulting interval limits or some probability value inside the resulting probability interval.
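The interval subdivision described above may be sketched as follows. The sketch uses exact fractions and a fixed probability estimation to keep the arithmetic visible; the function name encode_interval and the example alphabet are assumptions for illustration, and practical arithmetic coders use finite-precision integer arithmetic with renormalization instead.

```python
from fractions import Fraction


def encode_interval(symbols, probs):
    """Reduce the interval [low, low + width) once per symbol.

    probs maps each symbol to its probability (a Fraction); the order of
    the mapping fixes the order of the sub-intervals.
    """
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        # Sum the probabilities of all symbols ordered before s.
        offset = Fraction(0)
        for sym, p in probs.items():
            if sym == s:
                break
            offset += p
        else:
            raise ValueError(f"unknown symbol: {s!r}")
        low += width * offset   # move to the start of the sub-interval of s
        width *= probs[s]       # shrink to the size of that sub-interval
    return low, low + width


probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
print(encode_interval("ab", probs))  # (Fraction(1, 4), Fraction(3, 8))
```

The width of the final interval equals the product of the probabilities of the coded symbols, here 1/8, so that any value inside the interval can be identified with about three bits.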
As may be clear from the above, the compression effectiveness of an arithmetic coder strongly depends on the probability estimation as well as on the symbols on which the probability estimation is defined.
A special kind of context-based adaptive binary arithmetic coding, called CABAC, is employed in the H.264/AVC video coding standard. There is an option to use macroblock adaptive frame/field (MBAFF) coding for interlaced video sources. Macroblocks are units into which the pixel samples of a video frame are grouped. The macroblocks, in turn, are grouped into macroblock pairs. Each macroblock pair assumes a certain area of the video frame or picture. Furthermore, several macroblocks are grouped into slices. Slices that are coded in MBAFF coding mode can contain both macroblocks coded in frame mode and macroblocks coded in field mode. When coded in frame mode, a macroblock pair is spatially sub-divided into a top and a bottom macroblock, the top and the bottom macroblock each comprising both pixel samples captured at a first time instant and pixel samples captured at a second time instant different from the first time instant. When coded in field mode, the pixel samples of a macroblock pair are distributed to the top and the bottom macroblock of the macroblock pair in accordance with their capture time.
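The two ways of distributing the pixel samples of a macroblock pair may be sketched as follows. The sketch assumes that a macroblock pair covers 32 rows of samples, that rows with even index were captured at the first time instant and rows with odd index at the second, and that the first-captured rows are assigned to the top macroblock in field mode; the function name split_macroblock_pair is hypothetical.

```python
def split_macroblock_pair(rows, field_mode):
    """Distribute the 32 sample rows of a macroblock pair to (top, bottom).

    Assumption: even-indexed rows belong to the first capture time instant,
    odd-indexed rows to the second.
    """
    if len(rows) != 32:
        raise ValueError("a macroblock pair is assumed to span 32 rows")
    if field_mode:
        # Field mode: rows are separated by capture time.
        return rows[0::2], rows[1::2]
    # Frame mode: spatial split; each half mixes both capture times.
    return rows[:16], rows[16:]


rows = list(range(32))  # row indices stand in for rows of pixel samples
top, bottom = split_macroblock_pair(rows, field_mode=True)
print(top[:4], bottom[:4])  # [0, 2, 4, 6] [1, 3, 5, 7]
top, bottom = split_macroblock_pair(rows, field_mode=False)
print(top[:4], bottom[:4])  # [0, 1, 2, 3] [16, 17, 18, 19]
```

In frame mode each of the two macroblocks contains rows of both capture times, whereas in field mode each macroblock contains rows of a single capture time only.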
The introduction of MBAFF coding to the preceding stage as an alternative to PAFF (picture adaptive frame/field) coding, where the decision between frame and field coding is made for each frame as a whole, was motivated by the fact that, if a frame consists of mixed regions where some regions are moving and others are not, it is typically more efficient to code the non-moving regions in frame mode and the moving regions in field mode.
As mentioned above, in the H.264/AVC video coding standard, there is an option to use macroblock adaptive frame/field (MBAFF) coding for interlaced video sources. As follows from the above considerations, in MBAFF, the pixel samples in a respective macroblock pair are distributed in different ways to the top and bottom macroblock, depending on whether the macroblock pair is frame or field coded. Thus, on the one hand, when MBAFF mode is active, the neighborhood between pixel samples of neighboring macroblock pairs is somewhat more complicated than in the case of PAFF coding mode.
On the other hand, the CABAC entropy coding scheme tries to exploit statistical redundancies between the values of syntax elements of neighboring blocks. That is, for the coding of the individual binary decisions, i.e., bins, of several syntax elements, context variables are assigned depending on the values of syntax elements of neighboring blocks located to the left of and above the current block. In this document, the term “block” is used as a collective term that can represent 4×4 luma or chroma blocks used for transform coding, 8×8 luma blocks used for specifying the coded block pattern, macroblocks, or macroblock or sub-macroblock partitions used for motion description.
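The assignment of a context variable depending on the left and the above neighboring block may be sketched as follows. The sketch assumes a binary syntax element, selects one of three contexts by adding the values of the two neighbors, and treats an unavailable neighbor as zero; the names context_increment and AdaptiveBinModel are hypothetical, and the simple frequency counts stand in for the probability state machine of an actual CABAC implementation.

```python
def context_increment(left_flag, above_flag):
    """Select a context (0, 1 or 2) from the left and above neighbors.

    None marks an unavailable neighbor and is treated as 0 (assumption).
    """
    return int(bool(left_flag)) + int(bool(above_flag))


class AdaptiveBinModel:
    """Probability estimate for one context, adapted by counting bins."""

    def __init__(self):
        self.counts = [1, 1]  # initial counts for bin values 0 and 1

    def p_one(self):
        return self.counts[1] / sum(self.counts)

    def update(self, bin_value):
        self.counts[bin_value] += 1


models = [AdaptiveBinModel() for _ in range(3)]  # one model per context
ctx = context_increment(left_flag=1, above_flag=None)
for bin_value in (1, 1, 1):
    models[ctx].update(bin_value)
print(ctx, models[ctx].p_one(), models[0].p_one())  # 1 0.8 0.5
```

Only the model selected by the neighboring values adapts, so that the probability estimation used for a bin reflects the statistics observed under the same neighborhood condition.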
In the case of macroblock adaptive frame/field coding, the neighborhoods that are used for CABAC are not clear, since field and frame macroblocks can be mixed inside the picture or slice. In the solution to this problem that was included in older versions of H.264/AVC, each macroblock pair was considered as a frame macroblock pair for the purpose of context modeling in CABAC. However, with this concept, the coding efficiency could be degraded, since choosing neighboring blocks that do not adjoin the current block affects the adaptation of the conditional probability models.