1. Field of the Invention
The present invention relates to a system and method for context adaptive entropy decoding of transform coefficients in compressed video images.
2. Description of the Prior Art
There is an increasing reliance on video data in rich media applications running on devices or systems such as personal computers, wireless devices, surveillance systems, video conferencing systems and set-top boxes. Video data compression systems play a key role in increasing the efficiency of video data transmission. Video data is compressed or coded for transmission by taking advantage of the spatial redundancies within a given frame and the temporal redundancies between successive frames. Intraframe compression operates on single frames independently of other frames to exploit spatial redundancies within the frame, whereas interframe compression exploits both spatial and temporal redundancies.
Video compression systems exploit temporal redundancies using interframe prediction coding. Interframe coding is based on predicting the current source frame using the previously coded frame, and coding only the prediction error between the source frame and the predicted frame. The prediction process is approximate in that it assumes the motion is uniform across all pixels of each motion estimation block in each frame. It is noted that intercoding can be done for both uni-directional and bidirectional prediction. Transmission efficiencies are realized in intercoding by transmitting the prediction error, as the amount of information present in the prediction error is generally less than that in the actual pixel values. The resulting prediction residuals from intercoding are processed through a frequency domain transform and a quantizer that sets the values of the transform coefficients to discrete values within a pre-specified range. Further compression of the video information is realized by entropy coding the resulting quantized transform coefficients before transmission or storage of the encoded bit stream. The entropy coder represents the resulting information from the quantizer, the motion vector information, and other encoder information using short code words for the information with the highest probability of occurrence, and long code words for the information with the lowest probability of occurrence. This general approach of coding the most probable information with short code words and the least probable information with long code words is referred to as Variable Length Coding.
Since the video data is transmitted or stored in the form of a compressed bitstream, a decoder is needed to decode the bitstream to reconstruct the video data. First, the decoder performs entropy (variable-length) decoding of the quantized coefficients, then performs inverse quantization and inverse transform operations to form the image difference pixel values. Finally, the image difference values are added to the image prediction pixel values to form the final reconstructed image pixel values.
As an example of entropy coding and decoding of transform coefficients, consider the case of entropy coding/decoding specified in the H.264 video coding standard. In H.264, entropy decoding of coefficients is done on a 4×4 block basis. As an illustrative example, consider the following 4×4 block of quantized transform coefficients at the encoder.
The first step in the encoding process for the above 4×4 quantized transform coefficients is to apply a zigzag scan to the above quantized transform coefficient block to produce a sequence of coefficients. The zigzag scan is performed according to the following diagram:
The resulting series of coefficients is then 6, 0, 5, 0, −4, 0, 0, 0, 3, 0, −1, 0, 0, 0, 1, 0. The coefficients are typically grouped into (Run_before, Coefficient_level) pairs, where Run_before is the number of consecutive zero coefficients preceding a non-zero coefficient in the resulting zigzag order from low frequency coefficients to high frequency coefficients, and Coefficient_level is the value of the non-zero coefficient. The resulting (Run_before, Coefficient_level) pairs are then (0,6), (1,5), (1,−4), (3,3), (1,−1) and (3,1). In H.264, the Run_before information is separated from the Coefficient_level information and each is placed in a separate sequence. The resulting Run_before and Coefficient_level sequences are then:
Run_before: 0, 1, 1, 3, 1, 3
Coefficient_level: 6, 5, −4, 3, −1, 1
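The grouping of the zigzag series into (Run_before, Coefficient_level) pairs can be sketched as follows. This is an illustrative sketch only; the function name `run_level_pairs` is hypothetical and is not taken from the H.264 standard. The input is the zigzag series from the example above.

```python
def run_level_pairs(zigzag):
    """Group a zigzag-ordered coefficient series into (run_before, level) pairs."""
    pairs = []
    run = 0  # zeros seen since the last non-zero coefficient
    for coeff in zigzag:
        if coeff == 0:
            run += 1
        else:
            pairs.append((run, coeff))
            run = 0
    # Zeros after the last non-zero coefficient are not part of any pair.
    return pairs


series = [6, 0, 5, 0, -4, 0, 0, 0, 3, 0, -1, 0, 0, 0, 1, 0]
pairs = run_level_pairs(series)
run_before = [run for run, _ in pairs]
coefficient_level = [level for _, level in pairs]
print(pairs)
print("Run_before:", run_before)
print("Coefficient_level:", coefficient_level)
```

Splitting the pairs into two lists mirrors the separation of the Run_before and Coefficient_level information into two sequences.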
The second step in the coding process is to encode the Run_before information and the Coefficient_level information to produce the bit stream corresponding to the original quantized 4×4 transform coefficient data.
At the decoder side, the first step in the entropy decoding process for the coded quantized transform coefficient data is to decode the bitstream generated by the encoder to produce the Run_before information and the Coefficient_level information. For the example discussed above, this first step in the decoding process results in the following two sequences:
Run_before: 0, 1, 1, 3, 1, 3
Coefficient_level: 6, 5, −4, 3, −1, 1
The second step in the decoding process is to use the zigzag scan order described above to recover the 4×4 block of quantized transform coefficients based on the above two sequences.
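This recovery step can be sketched as follows, under stated assumptions. The scan table `ZIGZAG_4X4` is the conventional 4×4 zigzag order expressed as raster indices; it is assumed here because the scan diagram is not reproduced in the text, and the function name `rebuild_block` is hypothetical. The resulting block is therefore only what this assumed scan yields.

```python
# Assumed scan: the conventional 4x4 zigzag order, given as raster indices
# (row * 4 + column). This table is an assumption, not taken from the text.
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]


def rebuild_block(levels, runs):
    """Rebuild a 4x4 block from scan-ordered levels and run_before values."""
    series = []
    for run, level in zip(runs, levels):
        series.extend([0] * run)  # zeros preceding this coefficient
        series.append(level)
    series.extend([0] * (16 - len(series)))  # zeros after the last level
    flat = [0] * 16
    for scan_pos, raster_idx in enumerate(ZIGZAG_4X4):
        flat[raster_idx] = series[scan_pos]
    return [flat[r * 4:(r + 1) * 4] for r in range(4)]


block = rebuild_block(levels=[6, 5, -4, 3, -1, 1], runs=[0, 1, 1, 3, 1, 3])
for row in block:
    print(row)
```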
When the video data is transmitted at medium to high bit rates, the bits used to represent Run_before and Coefficient_levels dominate the compressed bit stream. It is therefore desirable to compress the Run_before and Coefficient_level information in the most efficient way. In a typical variable-length decoding system, each Run_before symbol and Coefficient_level symbol would be associated with a unique variable-length codeword such that frequently-occurring symbols have shorter codewords and rarely-occurring symbols have longer codewords. However, since different types of video content and different bit rates usually lead to different statistics of the 4×4 transform coefficient data, a fixed mapping between Run_before and Coefficient_level symbols and variable length codewords may not always provide optimal entropy compression. To solve this problem, context-adaptive variable length coding (CAVLC) schemes were developed so that the entropy coding process can adapt to different data statistics and consistently produce good entropy compression.
One of the known prior art CAVLC methods is described in a document “Committee Draft” by the Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG. The H.264 standard specifies the use of context-adaptive variable length coding (CAVLC) to entropy decode the quantized transform coefficient information. Briefly, the method decodes the Coefficient_levels and Run_before using multiple variable-length decoding tables, where a table is selected to decode each symbol based on the context of previously decoded symbols. One important aspect of the method is that the Coefficient_level and Run_before sequences are decoded in backward order, i.e. from coefficients corresponding to high frequencies to coefficients corresponding to low frequencies. For the example described above, the original scan order (from low frequency to high frequency) is given by
Coefficient_level: 6, 5, −4, 3, −1, 1
Run_before: 0, 1, 1, 3, 1, 3
whereas the CAVLC bitstream's order is given by:
Coefficient_level: 1, −1, 3, −4, 5, 6
Run_before: 3, 1, 3, 1, 1, 0
It was observed that, in the CAVLC ordering, the Coefficient_level sequence often begins with a number of coefficients whose absolute value is equal to 1. Consecutive coefficients with an absolute value of 1, starting with the first coefficient in the sequence, are called trailing ones (T1s). At most 3 trailing ones are considered. The presence of the T1s in the Coefficient_level sequence is used to further enhance the compression efficiency in the CAVLC method, as described in the sections below.
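As an illustration, the reversal into CAVLC order and the counting of trailing ones can be sketched as follows. The function name `count_trailing_ones` and its `max_t1s` parameter are hypothetical; the cap of 3 follows the text above.

```python
def count_trailing_ones(cavlc_levels, max_t1s=3):
    """Count leading +/-1 levels in CAVLC (high-to-low frequency) order."""
    count = 0
    for level in cavlc_levels:
        if abs(level) != 1 or count == max_t1s:
            break
        count += 1
    return count


scan_order_levels = [6, 5, -4, 3, -1, 1]   # low to high frequency
cavlc_levels = scan_order_levels[::-1]     # high to low frequency
t1s = count_trailing_ones(cavlc_levels)
print(cavlc_levels, t1s)
```

For the example sequence, the CAVLC-ordered levels begin with 1, −1 followed by 3, so two trailing ones are counted.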
CAVLC decoding of transform coefficients is based on 4 main steps. In the first step, the total number of non-zero coefficients and the number of trailing ones (T1s) are decoded from the bit stream, where the number of trailing ones indicates the number of consecutive Coefficient_levels with absolute values of one at the end of the Coefficient_level sequence in the original scan order (that is, at the start of the sequence in CAVLC order), counted within the last three Coefficient_levels.
In the second step, the sign bits of the trailing ones (up to 3) are decoded using 1 bit each. The sign bits are enough to decode the Coefficient_levels of the trailing ones.
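A minimal sketch of this step is given below. The mapping of a 0 bit to +1 and a 1 bit to −1 is an assumption made for illustration, since the text does not state the polarity; the function name `trailing_one_levels` is hypothetical.

```python
def trailing_one_levels(sign_bits):
    """Map one sign bit per trailing one to a level of +1 or -1.

    Assumption (not stated in the text): bit 0 means +1, bit 1 means -1.
    """
    if len(sign_bits) > 3:
        raise ValueError("at most 3 trailing ones are signalled")
    return [-1 if bit else 1 for bit in sign_bits]


levels = trailing_one_levels([0, 1])
print(levels)
```

Because the magnitude of every trailing one is known to be 1, a single bit per coefficient fully determines its Coefficient_level.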
In the third step, the rest of the Coefficient_levels are decoded using 5 different VLC tables with names Lev-VLC0, Lev-VLC1, Lev-VLC2, Lev-VLC3, and Lev-VLC4, and an adaptive table selection scheme. Each x in the tables below can take the value of either 0 or 1.
The third step is called the Coefficient_level decoding process. For the first Coefficient_level in the Coefficient_level sequence, a Lev-VLC table is selected based on the block type (inter-coded or intra-coded), quantization parameter (QP), and total number of non-zero coefficients. For the rest of the Coefficient_levels, a table is selected to decode each Coefficient_level based on the block type, the quantization parameter, and the value of the previously decoded Coefficient_level. The exact algorithm is as follows:
If block is Inter-coded or (Intra-coded with QP >= 21):
    Decode the first coefficient after trailing ones with Lev-VLC0 table.
    Decode the next coefficient with Lev-VLC1.
    if |previous Coefficient_level| > 3
        Increase Lev-VLCN by one (up to Lev-VLC2)
If block is Intra-coded with QP < 21:
    if (number of coefficients > 10)
        Decode the first coefficient after trailing ones with Lev-VLC1 table.
        Decode the next coefficient with Lev-VLC2 table.
    else
        Decode the first coefficient after trailing ones with Lev-VLC0 table.
        Decode the next coefficient with the Lev-VLC1 table.
    if current table is Lev-VLC1 and |decoded Coefficient_level| > 3
        use Lev-VLC2 for next Coefficient_level
    if current table is >= Lev-VLC2 and |decoded Coefficient_level| > 5
        Increase Lev-VLCN by one (up to Lev-VLC4)
In other words, the most recently decoded Coefficient_level is used to predict what the next coefficient level may be and the most appropriate VLC table is selected based on the prediction. When decoding the first Coefficient_level (after trailing ones) and the number of trailing ones is less than three, the decoded Coefficient_level is the received level plus one.
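The prior-art table selection can be sketched as follows. This is one reading of the algorithm above, not a definitive implementation: it assumes that the first two coefficients after the trailing ones use fixed tables, that adaptation starts from the second coefficient onward, and that the two adaptation rules of the intra, QP < 21 case apply to both of its sub-cases. The function name `select_level_tables` and its parameters are hypothetical, and the "plus one" adjustment of the first level is omitted.

```python
def select_level_tables(levels, intra, qp, num_coeffs):
    """Return the Lev-VLC table index used for each level.

    `levels` are the decoded Coefficient_levels after the trailing ones,
    in CAVLC (high-to-low frequency) order.
    """
    if not intra or qp >= 21:
        first, cap = 0, 2      # inter, or intra with QP >= 21
    else:
        first = 1 if num_coeffs > 10 else 0
        cap = 4                # intra with QP < 21
    tables = []
    table = first
    for i, level in enumerate(levels):
        tables.append(table)
        if i == 0:
            table = first + 1  # the second coefficient uses the next table
        elif cap == 2:
            if abs(level) > 3:
                table = min(table + 1, 2)
        elif table == 1 and abs(level) > 3:
            table = 2
        elif table >= 2 and abs(level) > 5:
            table = min(table + 1, 4)
    return tables


# Levels after the two trailing ones in the running example: 3, -4, 5, 6
inter_tables = select_level_tables([3, -4, 5, 6], intra=False, qp=28, num_coeffs=6)
intra_tables = select_level_tables([3, 7, 8, 2], intra=True, qp=10, num_coeffs=12)
print(inter_tables, intra_tables)
```

The sketch makes the three logic paths visible: the choice of `first` and `cap` depends on the block mode, QP, and coefficient count, all of which must be known before any level is decoded.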
In the fourth step, first the sum of Run_before is decoded, then multiple tables are used to decode each Run_before.
The major disadvantage of the existing method is that its complexity is high. Notice that there are two discontinuities (19-bit and 28-bit escape code sequences) in each of the Lev-VLC tables. The two discontinuities correspond to conditional execution branching and create complexity for both software and hardware implementations. Furthermore, depending on the current block coding mode, quantization parameter, and total number of coefficients, three separate logic paths or circuits (Intercoded blocks and Intracoded blocks with QP >= 21; Intracoded blocks with QP < 21 and more than 10 nonzero coefficients; Intracoded blocks with QP < 21 and the number of nonzero coefficients less than or equal to 10) are required to implement the table selection process. The discontinuities in the Lev-VLC tables and the multiple logic paths in the table selection process introduce a relatively large number of conditional instructions or branches that can significantly reduce the amount of parallelism in a typical processor or circuit. When there are many coefficients to be decoded (at medium to high bit rates), this can cause a significant slowdown in the speed of a decoder. For most DSP platforms, it is important that there be minimal or no branches inside the entropy decoding loop so that a software pipelining schedule can be utilized to exploit the parallel processing power of the DSPs.
Further, existing context adaptive variable length coding compression systems also select the decoding table for the first coefficient level after the trailing ones based on whether the current block is Inter mode or Intra mode, as well as what quantization parameter was used. Both of these parameters are external to the entropy decoding module, and therefore introduce inefficient data dependencies and increased data loading times. Further, the existing systems use different processing of Inter mode blocks and Intra mode blocks, which can increase code size and function set-up time, further impacting processing speed and memory requirements.
It is an object of the present invention to provide an entropy decoding system and method to obviate or mitigate some of the above-presented disadvantages.
According to the present invention there is provided a Context Adaptive Variable Length Coding (CAVLC) system and method to decode Coefficient_level information corresponding to quantized transform coefficients. The system and method include complexity-reduction improvements in the Coefficient_level decoding process, such as:
1. Simplified and extended the range of Lev-VLC tables. Specifically, the number of Lev-VLC tables is extended from 5 to 7 and only 1 escape code (28-bit escape code) is used for tables Lev-VLC1 to Lev-VLC6; and
2. Simplified and improved table selection process. The table selection for the first Coefficient_level depends only on the number of non-zero coefficients and the number of trailing ones, which are local variables within the CAVLC module. The table selection process for subsequent Coefficient_levels has been re-designed in such a way that the same logic path can be used to select the Lev-VLC table for the next coefficient regardless of block modes and quantization parameters.
According to a further aspect of the present invention there is provided a Context-Adaptive Variable Length Coding (CAVLC) system for decoding quantized transform coefficient levels. The system comprises: an input for a bitstream including context-adaptive variable-length-encoded Run_before and Coefficient_level data corresponding to quantized transform coefficients; an entropy decoding section for decoding the Run_before and Coefficient_level data; and a plurality of decoding tables used by the entropy decoding section for decoding the data, wherein at least two of the decoding tables have a single escape sequence and are generated by a common function.
According to a further aspect of the present invention there is provided a Context-Adaptive Variable Length Coding (CAVLC) method for decoding quantized transform coefficient levels. The method comprises the steps of: receiving a bitstream including context-adaptive variable-length-encoded Run_before and Coefficient_level data corresponding to quantized transform coefficients; accessing a plurality of decoding tables for decoding the data, wherein at least two of the decoding tables have a single escape sequence and are generated by a common function; and selecting one of the plurality of tables for decoding the Run_before and Coefficient_level data.
According to a still further aspect of the present invention there is provided a Context-Adaptive Variable Length Coding (CAVLC) system for decoding quantized transform coefficient levels. The system comprises: an input for a bitstream including context-adaptive variable-length-encoded Run_before and Coefficient_level data corresponding to quantized transform coefficients; an entropy decoding section for decoding the Run_before and Coefficient_level data; and a plurality of decoding tables used by the entropy decoding section for decoding Coefficient_levels, wherein at least two of the decoding tables have a single escape sequence and are generated by a common function; wherein selection from the plurality of decoding tables for the first Coefficient_level is determined solely by local variables representing a total number of non-zero coefficients and a number of trailing ones in the sequence of Coefficient_levels, and selection from the plurality of decoding tables for subsequent Coefficient_levels is determined solely by a previously decoded Coefficient_level and an experimentally pre-determined table.