Many decoders receive, and many encoders provide, encoded data for blocks of an image. Typically, the image is divided into blocks and each of the blocks is encoded in some manner, such as using a discrete cosine transform (DCT), and provided to the decoder. A block may denote a rectangular region in an image and consists of pixels; for example, a 16×16 block is a region 16 pixels in width by 16 pixels in height. The decoder receives the encoded blocks and decodes each of the blocks in some manner, such as using an inverse DCT.
Video coding standards, such as MPEG-4 part 10 (H.264), compress video data for transmission over a channel with limited bandwidth and/or limited storage capacity. These video coding standards include multiple coding stages such as intra prediction, transform from spatial domain to frequency domain, inverse transform from frequency domain to spatial domain, quantization, entropy coding, motion estimation, and motion compensation, in order to more effectively encode and decode frames.
The Joint Collaborative Team on Video Coding (JCT-VC) of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) Study Group 16 (SG16) Working Party 3 (WP3) and International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Joint Technical Committee 1/Subcommittee 29/Working Group 11 (JTC1/SC29/WG11) has launched a standardization effort for a video coding standard called the High Efficiency Video Coding standard (HEVC). Similar to some prior video coding standards, HEVC uses block-based coding. An example of a known HEVC encoder is shown in FIG. 1. HEVC decoders are also known.
In HEVC, Context-Adaptive Binary Arithmetic Coding (CABAC) is used to compress Transformed and Quantized Coefficients (TQCs) without loss. The TQCs are determined at the encoder by processing image blocks with a forward transform to generate transform coefficients that are then quantized using an operation that maps multiple transform coefficient values to TQC values. The TQC values are then communicated to the decoder as coefficient level values, or level values, and the level value for each coefficient is then mapped to a transform coefficient value that is similar, but not necessarily identical, to the transform coefficient value computed at the encoder. The CABAC-based encoding and/or decoding technique is generally context adaptive, which refers to (i) adaptively coding symbols based on the values of previous symbols encoded and/or decoded in the past, and (ii) the context, which identifies the set of symbols encoded and/or decoded in the past that is used for adaptation. The past symbols may be located in spatially and/or temporally adjacent blocks. In many cases, the context is based upon symbol values of neighboring blocks.
As mentioned above, CABAC may be used to compress TQCs without loss. By way of background, TQCs may be from different block sizes according to transform sizes (e.g., 4×4, 8×8, 16×16, 32×32, 16×32). Two-dimensional (2D) TQCs may be converted into a one-dimensional (1D) array before entropy coding, for example, CABAC. In an example, 2D arrayed TQCs in a 4×4 block may be arranged as illustrated in Table (1).
TABLE 1
4       0       1       0
3       2       −1      . . .
−3      0       . . .   . . .
0       . . .   . . .   . . .
When converting the 2D TQCs into a 1D array, the block may be scanned in a diagonal zig-zag fashion. Continuing with the example, the 2D arrayed TQCs illustrated in Table (1) may be converted into 1D arrayed TQCs [4, 0, 3, −3, 2, 1, 0, −1, 0, 0, . . . ] by scanning the first row and first column, first row and second column, second row and first column, third row and first column, second row and second column, first row and third column, first row and fourth column, second row and third column, third row and second column, fourth row and first column and so on.
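The diagonal zig-zag scan just described can be sketched in Python. The helper `zigzag_scan` below is hypothetical (HEVC's actual scans are table-driven and applied per sub-block), but it reproduces the position order listed above for the 4×4 example.

```python
def zigzag_scan(block):
    """Diagonal zig-zag scan of an N x N block into a 1D list.

    Illustrative sketch only: positions are visited diagonal by
    diagonal, alternating the direction of travel along each diagonal,
    matching the position order described in the text.
    """
    n = len(block)
    positions = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],                       # diagonal index
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))
    return [block[r][c] for r, c in positions]

# The 4x4 example of Table (1)
block = [[4, 0, 1, 0],
         [3, 2, -1, 0],
         [-3, 0, 0, 0],
         [0, 0, 0, 0]]
scanned = zigzag_scan(block)
```

Applying the scan to the example block yields the 1D array [4, 0, 3, −3, 2, 1, 0, −1, 0, 0, . . . ] quoted above.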
The 1D array of TQCs is represented by a sequence of Syntax Elements (SEs) in CABAC. An example of the sequence of SEs for the example 1D array of TQCs is shown in FIG. 2. The SEs represent the following parameters: Last position X/Y, Significance Map, and the attributes Greater than 1, Greater than 2, Sign Information, and Absolute-3. Last position X/Y represents the position (X/Y) of the last non-zero coefficient in the corresponding block. The significance map indicates, for each coefficient, whether the coefficient level is non-zero. Greater than 1 indicates whether the coefficient amplitude (absolute coefficient level) is larger than one for each non-zero coefficient (i.e., each coefficient with significance flag equal to 1). Greater than 2 indicates whether the coefficient amplitude is larger than two for each coefficient with amplitude larger than one (i.e., each coefficient with Greater than 1 flag equal to 1). Sign Information indicates the sign of each non-zero coefficient, and Absolute-3 represents the remaining amplitude (the absolute coefficient level minus three) for each coefficient with amplitude larger than two.
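As a rough illustration of how these parameters relate to the scanned coefficients, the hypothetical sketch below derives each SE from a 1D level array. It simplifies freely: real HEVC signals these per 4×4 sub-block in reverse scan order, codes the last position as X/Y coordinates rather than a scan index, and caps how many Greater than 1 / Greater than 2 flags are coded, all of which is omitted here.

```python
def syntax_elements(levels):
    """Derive simplified SEs from a 1D scanned coefficient array.

    Sketch under stated assumptions; not the normative HEVC syntax.
    """
    # Scan index of the last non-zero coefficient (a stand-in for
    # the X/Y last-position coordinates).
    last = max(i for i, v in enumerate(levels) if v != 0)
    sig = [int(v != 0) for v in levels[:last + 1]]
    nz = [v for v in levels[:last + 1] if v != 0]
    return {
        "last_pos": last,
        "significance_map": sig,
        "greater_than_1": [int(abs(v) > 1) for v in nz],
        "greater_than_2": [int(abs(v) > 2) for v in nz if abs(v) > 1],
        "signs": [int(v < 0) for v in nz],                  # 1 = negative
        "absolute_minus_3": [abs(v) - 3 for v in nz if abs(v) > 2],
    }

se = syntax_elements([4, 0, 3, -3, 2, 1, 0, -1, 0, 0])
```

For the example array, the significance map is [1, 0, 1, 1, 1, 1, 0, 1] and the Absolute-3 values are [1, 0, 0], one per coefficient with amplitude larger than two.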
In CABAC in HEVC, the representative SEs are coded. FIG. 3 shows the CABAC framework used for coding SEs. The CABAC coding technique includes coding symbols using stages. In the first stage, the CABAC uses a “binarizer” to map input symbols to a string of binary symbols, or “bins”. The input symbol may be a non-binary valued symbol that is binarized or otherwise converted into a string of binary (1 or 0) symbols prior to being coded into bits. The bins can be coded into bits using either a “bypass encoding engine” or a “regular encoding engine”.
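One simple binarizer of the kind the first stage uses is unary binarization; as a minimal sketch (HEVC actually combines several schemes, including truncated unary, fixed-length, and Exp-Golomb codes):

```python
def unary_binarize(symbol):
    """Unary binarization: symbol n -> n ones followed by a terminating 0.

    Illustrative only; one of several binarization schemes a CABAC
    binarizer may apply to a non-binary input symbol.
    """
    return [1] * symbol + [0]
```

For example, the symbol 3 maps to the bin string [1, 1, 1, 0], which is then passed to the bypass or regular encoding engine.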
For the regular encoding engine in CABAC, in the second stage a probability model is selected. The probability model is used to arithmetic encode one or more bins of the binarized input symbols. This model may be selected from a list of available probability models depending on the context, which is a function of recently encoded symbols. The probability model stores the probability of a bin being “1” or “0”. In the third stage, an arithmetic encoder encodes each bin according to the selected probability model. There are two sub-ranges for each bin, corresponding to a “0” and a “1”. The fourth stage involves updating the probability model. The selected probability model is updated based on the actual encoded bin value (e.g., if the bin value was a “1”, the frequency count of the “1” s is increased). The decoding technique for CABAC decoding reverses the process.
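Stages two through four can be sketched with a toy model. The `ContextModel` below adapts by counting observed bins (an assumption for illustration; HEVC's real models are finite-state probability tables), and `encode_bins` narrows an exact interval into "0" and "1" sub-ranges without the renormalization and bit output a real arithmetic coder performs.

```python
from fractions import Fraction

class ContextModel:
    """Toy adaptive probability model: Laplace-smoothed counts of 0s/1s."""
    def __init__(self):
        self.counts = [1, 1]

    def p_zero(self):
        return Fraction(self.counts[0], sum(self.counts))

    def update(self, b):
        # Fourth stage: adapt the model to the actual coded bin value.
        self.counts[b] += 1

def encode_bins(bins, ctx):
    """Stages 2-4 of the regular engine, simplified.

    The current interval is split into a '0' sub-range and a '1'
    sub-range sized by the model; the sub-range matching the bin is
    kept, then the model is updated. Any number in the final interval
    identifies the bin sequence.
    """
    low, width = Fraction(0), Fraction(1)
    for b in bins:
        split = width * ctx.p_zero()     # third stage: sub-range sizes
        if b == 0:
            width = split
        else:
            low, width = low + split, width - split
        ctx.update(b)
    return low, width
```

Encoding three "0" bins with a fresh model narrows the interval to [0, 1/4): each coded "0" raises the model's probability of "0", so later zeros cost the interval less, which is the adaptation the regular coding mode provides.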
For the bypass encoding engine in CABAC, the second stage involves conversion of bins to bits omitting the computationally expensive context estimation and probability update stages. The bypass encoding engine assumes an equal probability distribution for the input bins. The decoding technique for CABAC decoding reverses the process.
The CABAC encodes the symbols conceptually using two steps. In the first step, the CABAC performs a binarization of the input symbols to bins. In the second step, the CABAC performs a conversion of the bins to bits using either the bypass encoding engine or the regular encoding engine. The resulting encoded bit values are provided in the bitstream to a decoder.
The CABAC decodes the symbols conceptually using two steps. In the first step, the CABAC uses either the bypass decoding engine or the regular decoding engine to convert the input bits to bin values. In the second step, the CABAC performs de-binarization to recover the transmitted symbol value for the bin values. The recovered symbol may be non-binary in nature. The recovered symbol value is used in remaining aspects of the decoder.
As previously described, the encoding and/or decoding process of the CABAC includes at least two different modes of operation. In a first mode, generally referred to as a "regular coding mode", the probability model is updated based upon the actual coded bin value. The regular coding mode requires several sequential serial operations, with the associated computational complexity, and takes significant time to complete. In a second mode, generally referred to as a "bypass coding mode", the probability model is not updated based upon the actual coded bin value. In the second mode, there is no probability model (other than perhaps a fixed probability) for decoding the bins, and accordingly there is no need to update the probability model.
When utilizing CABAC coding in HEVC, throughput performance can differ depending on different factors such as, but not limited to: the total number of bins/pixels, the number of bypass bins/pixels, and the number of regular (or context) coded bins/pixels. Throughput is defined as the number of TQCs that can be decoded (or encoded) in a unit of time. Generally speaking, throughput for the case of high bit-rate encoding (low Quantization Parameter (QP) value) is significantly less than throughput in other cases. Therefore, encoding/decoding in high bit-rate cases may consume a significant amount of processing resources and/or may take a significant amount of time. The disclosure that follows solves this and other problems.
It is also known that CABAC can be used in a lossless coding mode to compress a residual sample. In one example, a residual sample is a value corresponding to a specific location in an image. Typically, a residual sample corresponds to the difference between a value corresponding to a specific location in an image and a prediction value corresponding to the same, specific location in the image. Alternatively, a residual sample is a value corresponding to a specific location in an image that has not been processed with a transformation operation, or that has been processed with a transformation operation that is not typically used to create TQCs. A residual sample can be from different block sizes according to its sample size (e.g., 4×4, 8×8, 16×16, 32×32, 16×32). A 2D residual sample block is first converted into a 1D array before entropy coding, similar to TQC encoding. In an example, 2D arrayed residual samples in a 4×4 block may be arranged as illustrated in Table (2).
TABLE 2
4       0       1       0
3       2       −1      . . .
−3      0       . . .   . . .
0       . . .   . . .   . . .
When converting the 2D residual sample into a 1D array, the block may be scanned in a diagonal zig-zag fashion. Continuing with the example, the 2D arrayed residual sample illustrated in Table (2) may be converted into 1D arrayed residual sample [4, 0, 3, −3, 2, 1, 0, −1, 0, 0, . . . ] by scanning the first row and first column, first row and second column, second row and first column, third row and first column, second row and second column, first row and third column, first row and fourth column, second row and third column, third row and second column, fourth row and first column and so on.
The 1D array of the residual sample is represented by a sequence of Syntax Elements (SEs) in CABAC. An example of a sequence of SEs for the example 1D array of the residual sample is shown in FIG. 2. The SEs represent the following parameters: Last position X/Y, Significance Map, and the attributes Greater than 1, Greater than 2, Sign Information, and Absolute-3.
In the lossless coding mode of CABAC in HEVC, the representative SEs are coded. The CABAC framework of FIG. 3 may be used for coding the SEs. The CABAC coding technique includes coding symbols using stages. In the first stage, the CABAC uses a “binarizer” to map input symbols to a string of binary symbols, or “bins”. The input symbol may be a non-binary valued symbol that is binarized or otherwise converted into a string of binary (1 or 0) symbols prior to being coded into bits. The bins can be coded into bits using the previously described “regular encoding engine”.
For the regular encoding engine in the lossless coding mode of CABAC, in the second stage a probability model (also known as a “context model” in the lossless encoding mode of CABAC) is selected. The context model is used to arithmetic encode one or more bins of the binarized input symbols. This context model may be selected from a list of available context models depending on the context, which is a function of recently encoded symbols. The context model stores the probability of a bin being “1” or “0”. In the third stage, an arithmetic encoder encodes each bin according to the selected context model. There are two sub-ranges for each bin, corresponding to a “0” and a “1”. The fourth stage involves updating the corresponding context model. The selected context model is updated based on the actual encoded bin value (e.g., if the bin value was a “1”, the frequency count of the “1” s is increased). The decoding technique for CABAC decoding reverses the process.
The number of context models used as described in the previous paragraph may be 184. Specifically: 36 context models used for Last position X/Y (18 context models for Last_position_X, 18 context models for Last_position_Y); 48 context models used for Significance Map (4×4 block: 9 luma, 6 chroma; 8×8 block: 11 luma, 11 chroma; 16×16 or 32×32 block: 7 luma, 4 chroma); and 100 context models used for the attributes Greater than 1, Greater than 2, Sign Information, and Absolute-3 (Greater_than_1 flag of luma: 30; Greater_than_1 flag of chroma: 20, Greater_than_2 flag of luma: 30; and Greater_than_2 flag of chroma: 20).
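The tally can be checked directly; the sub-totals below simply restate the counts from the preceding paragraph.

```python
# Context-model tally for the lossless coding mode described above.
last_pos = 18 + 18                        # Last_position_X + Last_position_Y
sig_map = (9 + 6) + (11 + 11) + (7 + 4)   # 4x4, 8x8, 16x16/32x32 (luma + chroma)
attributes = 30 + 20 + 30 + 20            # gt1 luma/chroma + gt2 luma/chroma
total = last_pos + sig_map + attributes   # 36 + 48 + 100 = 184
```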
(Coding Absolute-3 Coefficients of the Syntax Element)
Referring back to syntax element coding, a portion of the syntax element coding involves coding the Absolute-3 coefficients of the syntax element. Coding the Absolute-3 coefficients involves Golomb-Rice (GR) coding and 0th order Exponential-Golomb (EG0) coding, as will be explained in more detail below.
FIG. 4 illustrates the coding structure for Absolute-3 for HEVC.
By way of example, consider the Absolute-3 values shown in FIG. 4, namely, 81, 34, 6, 4, 0. Coding is in reverse relative to scanning order. The value "0" is converted using the Golomb-Rice (G-R) code table illustrated in FIG. 5, which shows five Variable-Length Code (VLC) tables (the columns labeled 0-4, where the column labeled 0 is VLC table 0, the column labeled 1 is VLC table 1, and so on through VLC table 4), each corresponding to a different Rice parameter value. Based on a current Rice parameter of zero (the Rice parameter is initialized to zero for the initial value of a sub-block in HEVC), VLC table 0 is activated. In VLC table 0, for the input value of "0", the codeword is "0". Therefore, the corresponding codeword value is "0".
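An untruncated Golomb-Rice codeword with Rice parameter k is the quotient value >> k written in unary (ones terminated by a 0), followed by the k low-order remainder bits. The sketch below reproduces the VLC table 0 codewords quoted in this example; note that the FIG. 5 tables themselves are truncated versions that fall back to an escape codeword beyond each table's range, which this sketch does not model.

```python
def rice_encode(value, k):
    """Untruncated Golomb-Rice codeword for `value` with Rice parameter k.

    Illustrative sketch only: quotient in unary ones, a 0 terminator,
    then the k low-order remainder bits. No truncation/escape handling.
    """
    q = value >> k                                       # unary quotient
    suffix = format(value & ((1 << k) - 1), "0{}b".format(k)) if k else ""
    return "1" * q + "0" + suffix
```

With k = 0, the value 0 yields "0" and the value 4 yields "11110", matching the table-0 codewords used in this example.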
Before proceeding to the next scanning position, there is a check for a Rice parameter update. A Rice parameter update table is illustrated in FIG. 6. Because the input symbol level is "0" and the current Rice parameter is zero, the lookup result is zero, the same as the current Rice parameter value, and hence there is no Rice parameter update. A Rice parameter update determines the Rice parameter value used to code the next value.
The next value “4” is converted using VLC table 0 of the G-R code table of FIG. 5 to codeword 11110. According to the update table illustrated in FIG. 6, the current Rice parameter is updated to one before converting the next value “6” to a codeword. Following the conversion of the value “6” to a codeword, the Rice parameter is updated to two according to FIG. 6.
Moving to scanning position two, it can be seen that the value "34" to convert is larger than the Rice code range corresponding to the current Rice parameter of two. Specifically, the five Rice parameter values shown in FIG. 5 have, respectively, the following Rice code ranges: 7, 14, 26, 46, 78. The range corresponds, for each Rice parameter, to the largest symbol value with a defined codeword not equal to 11111111. Again, the value "34" is larger than the corresponding range of 26. Therefore, according to HEVC, a codeword corresponding to the value 27 is selected using VLC table 2 of FIG. 5.
Also, EG0 coding is used to encode the residual value that is not represented by the corresponding codeword (11111111 for the value 27) from G-R coding. The input value of the residual, namely 8 (34−26), is used to select a prefix and suffix from the EG0 table of FIG. 7. Here, the selected prefix from the EG0 process is 1110 and the selected suffix from the EG0 process is 001. The Rice parameter is updated to four based on the value 26 (four is the lookup result for the largest value, 23) and according to FIG. 6. Both the codeword 11111111 (from G-R coding) and the codeword 1110001 (from EG0 coding) are used to represent the value "34". In an example, the codeword from G-R coding and the codeword from EG0 coding are concatenated to form a single codeword.
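The EG0 prefix/suffix construction in this example can be sketched as follows, assuming the ones-based prefix form implied by FIG. 7: for a value v, with n the bit length of v+1, the codeword is n−1 ones, a 0, then the low n−1 bits of v+1.

```python
def eg0_encode(value):
    """0th-order Exp-Golomb codeword with a ones-based prefix.

    Sketch assuming the FIG. 7 form: (n-1) ones, a 0 terminator, then
    the low (n-1) bits of value+1, where n = bit length of value+1.
    """
    x = value + 1
    n = x.bit_length()
    return "1" * (n - 1) + "0" + format(x, "b")[1:]
```

For the residual 8, this gives the prefix 1110 and the suffix 001 quoted above; concatenating the G-R codeword 11111111 with this EG0 codeword then represents the value "34".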
When utilizing CABAC coding in HEVC, throughput performance can differ depending on different factors such as but not limited to the magnitude of the Absolute-3 values of a syntax element to be coded. Therefore, depending on these factors, coding may consume a significant amount of processing resources and/or may take a significant amount of time. The disclosure that follows solves this and other problems.