Arithmetic coding is known as an efficient data compression method and is widely used in coding standards such as JBIG, JPEG2000, H.264/AVC, and High-Efficiency Video Coding (HEVC). In the H.264/AVC JVT Test Model (JM) and the HEVC Test Model (HM), Context-Based Adaptive Binary Arithmetic Coding (CABAC) is adopted as the entropy coding tool for various syntax elements in the video coding system.
FIG. 1 illustrates an example of CABAC encoder 100 which includes three parts: Binarization 110, Context Modeling 120, and Binary Arithmetic Coding (BAC) 130. In the binarization step, each syntax element is uniquely mapped into a binary string (also called bin or bins in this disclosure). In the context modeling step, a probability model is selected for each bin. The corresponding probability model may depend on previously encoded syntax elements, bin indexes, side information, or any combination of the above. After the binarization and the context model assignment, a bin value along with its associated context model is provided to the binary arithmetic coding engine, i.e., the BAC 130 block in FIG. 1. The bin value can be coded in two coding modes depending on the syntax element and bin indexes, where one is the regular coding mode, and the other is the bypass mode. The bins corresponding to regular coding mode are referred to as regular bins and the bins corresponding to bypass coding mode are referred to as bypass bins in this disclosure. In the regular coding mode, the probability of the Most Probable Symbol (MPS) and the probability of the Least Probable Symbol (LPS) for BAC are derived from the associated context model. In the bypass coding mode, the probability of the MPS and the LPS are equal. In CABAC, the bypass mode is introduced to speed up the encoding process.
High-Efficiency Video Coding (HEVC) is a new international video coding standard that is being developed by the Joint Collaborative Team on Video Coding (JCT-VC). HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit for compression, termed Coding Unit (CU), is a 2N×2N square block, and each CU can be recursively split into four smaller CUs until a predefined minimum size is reached. Each CU contains one or several variable-block-sized Prediction Unit(s) (PUs) and Transform Unit(s) (TUs). For each PU, either intra-picture or inter-picture prediction is selected. Each TU is processed by a spatial block transformation and the transform coefficients for the TU are then quantized. The smallest TU size allowed for HEVC is 4×4.
In HEVC Test Model Version 5.0 (HM-5.0), the transform coefficients are coded TU by TU. For each TU, syntax elements last_significant_coeff_x and last_significant_coeff_y are transmitted to indicate the horizontal and vertical positions, respectively, of the last non-zero coefficient according to a selected scanning order. A TU larger than 4×4 is divided into multiple subsets. For an 8×8 TU, the 64 coefficients are divided into 4 subsets according to the diagonal scanning order through the entire 8×8 TU as shown in FIG. 2. Scanning through the transform coefficients converts the two-dimensional data into one-dimensional data. Each subset contains 16 consecutive coefficients of the diagonally scanned coefficients. For TUs larger than 8×8 (e.g., 16×16, 32×32) and non-square TUs (e.g., 16×4, 4×16, 32×8, 8×32), the TUs are divided into 4×4 sub-blocks. Each sub-block corresponds to a coefficient subset. For each sub-block (i.e., each subset), the significance map, which is represented by significant_coeff_flag[x,y], is coded first. Variable x is the horizontal position of the coefficient within the sub-block and ranges from 0 to (sub-block width − 1). Variable y is the vertical position of the coefficient within the sub-block and ranges from 0 to (sub-block height − 1). The flag significant_coeff_flag[x,y] indicates whether the corresponding coefficient of the TU is zero or non-zero. For convenience, the index [x,y] is omitted from significant_coeff_flag[x,y]. For each non-zero coefficient as indicated by significant_coeff_flag, the level and sign of the non-zero coefficient are represented by coeff_abs_level_greater1_flag, coeff_abs_level_greater2_flag, coeff_abs_level_minus3, and coeff_sign_flag.
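The level and sign representation described above can be illustrated with a minimal Python sketch. The function name level_syntax_elements is hypothetical, and the sketch assumes that each "greater" flag is present only when the previous one is set, that coeff_abs_level_minus3 carries the remaining magnitude, and that a sign flag of 1 denotes a negative coefficient. It ignores coding order and any per-subset limits an actual codec may impose.

```python
def level_syntax_elements(coeff):
    """Return the level/sign syntax elements for one non-zero coefficient.

    Illustrative only: the element names come from the text, the exact
    presence conditions are assumptions of this sketch.
    """
    if coeff == 0:
        raise ValueError("only non-zero coefficients carry level and sign")

    abs_level = abs(coeff)
    elements = {"coeff_abs_level_greater1_flag": int(abs_level > 1)}

    # Assumed: the greater2 flag is only needed when the magnitude exceeds 1.
    if abs_level > 1:
        elements["coeff_abs_level_greater2_flag"] = int(abs_level > 2)

    # Assumed: the remaining magnitude is sent only when it exceeds 2.
    if abs_level > 2:
        elements["coeff_abs_level_minus3"] = abs_level - 3

    # Assumed convention: 1 means negative, 0 means positive.
    elements["coeff_sign_flag"] = int(coeff < 0)
    return elements


if __name__ == "__main__":
    for value in (1, -2, 7):
        print(value, level_syntax_elements(value))
```

Under these assumptions, a coefficient of magnitude 1 needs only the greater1 flag and the sign flag, while larger magnitudes progressively add the remaining elements.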
In HM-5.0, if the TU size is equal to 16×16, 32×32, 16×4, 4×16, 32×8, or 8×32, one significant_coeffgroup_flag is coded for each sub-block prior to the coding of level and sign of the sub-block (e.g., the significant_coeff_flag, coeff_abs_level_greater1_flag, coeff_abs_level_greater2_flag, coeff_abs_level_minus3, and coeff_sign_flag). If significant_coeffgroup_flag is equal to 0, it indicates that the entire 4×4 sub-block is zero. Therefore, there is no need for any additional information to represent this sub-block. Accordingly, the coding of level and sign of the sub-block can be skipped. If significant_coeffgroup_flag is equal to 1, it indicates that at least one coefficient in the 4×4 sub-block is non-zero. The level and sign of each non-zero coefficient in the sub-block will be coded after the significant_coeffgroup_flag. The value of significant_coeffgroup_flag is inferred as 1 for the sub-block containing the DC term (i.e., the transform coefficient with the lowest spatial frequency).
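As an illustration of the sub-block flag semantics described above, the following Python sketch derives one flag per 4×4 sub-block from a TU given as a list of rows. The function name coeffgroup_flags and the tu[y][x] layout are assumptions made for this example and are not taken from HM-5.0.

```python
def coeffgroup_flags(tu, sub=4):
    """Map each sub-block index (sx, sy) to its significance group flag.

    tu is a list of rows, indexed tu[y][x]; its width and height are
    assumed to be multiples of `sub`.
    """
    height, width = len(tu), len(tu[0])
    flags = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            # The flag is 1 when at least one coefficient is non-zero.
            nonzero = any(
                tu[y][x] != 0
                for y in range(y0, y0 + sub)
                for x in range(x0, x0 + sub)
            )
            # The sub-block holding the DC term is inferred as 1.
            is_dc_block = (x0 == 0 and y0 == 0)
            flags[(x0 // sub, y0 // sub)] = 1 if is_dc_block else int(nonzero)
    return flags


if __name__ == "__main__":
    tu = [[0] * 16 for _ in range(16)]
    tu[9][5] = -3  # one non-zero coefficient at x=5, y=9
    print({k: v for k, v in coeffgroup_flags(tu).items() if v})
```

In this sketch, a sub-block whose flag is 0 would need no further significance, level, or sign information.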
In HM-5.0, significant_coeff_flag is coded in regular CABAC mode with context modeling. Different context selection methods are used for different TU sizes. For TUs with size of 4×4 or 8×8, the context selection is based on the position of the coefficient within the TU. FIG. 3 shows the position-based context selection map for a 4×4 TU and FIG. 4 shows the position-based context selection map for an 8×8 TU as adopted in HM-5.0. In FIG. 3, significance map 310 is used for the luma component and significance map 320 is used for the chroma component, where each number corresponds to a context selection. In FIG. 4, luma and chroma 8×8 TUs share the same significance map.
For other TU sizes, the neighboring-information-dependent context selection is adopted. FIGS. 5A and 5B illustrate examples of the neighboring-information-dependent context selection for luma and chroma components respectively. One context is used for the DC coefficient. For non-DC coefficients (i.e., AC coefficients), the context selection depends on the neighboring coefficients. For example, a group of neighboring coefficients including I, H, F, E, and B around a current coefficient X is used for the context selection. If none of the neighboring coefficients is non-zero, context #0 is used for coefficient X. If one or two of the neighboring coefficients are non-zero, context #1 is used for X. Otherwise, context #2 is used for coefficient X.
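The counting rule above can be sketched in Python as follows. The function name select_context is hypothetical, and the neighbor values are passed in directly because the geometric positions of I, H, F, E, and B are defined only in the figures.

```python
def select_context(neighbors):
    """Pick context #0, #1, or #2 from the neighboring coefficient values.

    `neighbors` holds the values (or significance flags) of the
    neighboring coefficients, e.g. those labeled I, H, F, E, and B.
    """
    nonzero_count = sum(1 for value in neighbors if value != 0)
    if nonzero_count == 0:
        return 0  # no non-zero neighbor
    if nonzero_count <= 2:
        return 1  # one or two non-zero neighbors
    return 2      # three or more non-zero neighbors


if __name__ == "__main__":
    print(select_context([0, 0, 0, 0, 0]))
    print(select_context([4, 0, -1, 0, 0]))
    print(select_context([1, 1, 1, 0, 2]))
```

This returns only the context index within a context set; the choice of context set is a separate decision.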
In the above neighboring-information-dependent context selection, the non-DC coefficients of the entire TU are divided into two regions (i.e., region-1 and region-2) for the luma component and one region (region-2) for the chroma component. Different regions will use different context sets. Each context set includes three contexts (i.e., context #0, #1, and #2). The area of region-1 for the luma component can be mathematically specified by the x-position and y-position of a coefficient X within the TU. As shown in FIG. 5A, if the sum of x-position and y-position of coefficient X is smaller than a threshold value and greater than 0, region-1 context set is selected for coefficient X. Otherwise, region-2 context set is selected. The threshold value can be determined based on the width and the height of the TU. For example, the threshold can be set to a quarter of the maximum value of the TU width and the TU height. Accordingly, in the case of TU sizes 32×32, 32×8 or 8×32, the threshold value can be set to 8.
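A minimal Python sketch of the region rule above is given below, using the example threshold of a quarter of the larger TU dimension. The function names and the string labels "DC", "region-1", and "region-2" are chosen for this illustration only.

```python
def region_threshold(width, height):
    """Example threshold: a quarter of the larger TU dimension."""
    return max(width, height) // 4


def select_context_set(x, y, width, height, is_luma=True):
    """Return the context set used for the coefficient at (x, y)."""
    if x == 0 and y == 0:
        return "DC"  # the DC coefficient has its own context
    if not is_luma:
        return "region-2"  # chroma uses a single region
    # Luma: region-1 when 0 < x + y < threshold, otherwise region-2.
    if x + y < region_threshold(width, height):
        return "region-1"
    return "region-2"


if __name__ == "__main__":
    for size in ((32, 32), (32, 8), (8, 32), (16, 16)):
        print(size, region_threshold(*size))
    print(select_context_set(3, 4, 32, 32))
    print(select_context_set(4, 4, 32, 32))
```

With this example threshold, the 32×32, 32×8, and 8×32 TUs all yield a threshold of 8, so the region depends only on the sum of the two coordinates.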
In HM-5.0, for TUs with sizes other than 4×4 and 8×8, the TUs will be divided into 4×4 sub-blocks for coefficient map coding. However, the criterion of region-1/region-2 context selection depends on the x-position and y-position of the transform coefficient. Therefore, some sub-blocks may cross the boundary between region-1 and region-2 and two context sets will be required for these sub-blocks. FIG. 6A illustrates an example where one 4×4 sub-block 610 (the center of the sub-block is indicated by a dot) for 16×16 TU 621, 16×4 TU 622, and 4×16 TU 623 will use two context sets for significant_coeff_flag coding. FIG. 6B illustrates an example where three 4×4 sub-blocks 631 to 633 for 32×32 TU 641, 32×8 TU 642, and 8×32 TU 643 will use two context sets for significant_coeff_flag coding. For sub-blocks 632 and 633, the sum of the x-position and y-position of coefficient X has to be calculated in order to determine whether coefficient X is in region-1 or region-2. For the sub-block containing the DC term, i.e., sub-block 631, the position of the DC term is known and all other coefficients in the sub-block belong to region-1. Therefore, significant_coeffgroup_flag can be inferred and there is no need to calculate the sum of x-position and y-position. For other sub-blocks, there is no need to calculate the sum of x-position and y-position of coefficient X since all coefficients of other sub-blocks are in region-2 and one context set for significant_coeff_flag coding is used.
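The boundary-crossing behavior described above can be checked with a short Python sketch. It assumes the example threshold of a quarter of the larger TU dimension and treats the DC coefficient as having its own context. The function names are hypothetical, and sub-blocks are identified by their (column, row) index.

```python
def context_sets_per_subblock(width, height, sub=4):
    """Map each sub-block index to the set of luma regions it touches."""
    threshold = max(width, height) // 4  # assumed example threshold
    result = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            regions = set()
            for y in range(y0, y0 + sub):
                for x in range(x0, x0 + sub):
                    if x == 0 and y == 0:
                        continue  # DC term uses its own context
                    regions.add("region-1" if x + y < threshold else "region-2")
            result[(x0 // sub, y0 // sub)] = regions
    return result


def mixed_subblocks(width, height):
    """List the sub-blocks whose coefficients fall in both regions."""
    per_block = context_sets_per_subblock(width, height)
    return sorted(idx for idx, regions in per_block.items() if len(regions) == 2)


if __name__ == "__main__":
    for size in ((16, 16), (16, 4), (4, 16), (32, 32), (32, 8), (8, 32)):
        print(size, mixed_subblocks(*size))
```

Under these assumptions, the 16×16, 16×4, and 4×16 TUs each have one sub-block that straddles the two regions, while the 32×32, 32×8, and 8×32 TUs each have two such sub-blocks next to a DC sub-block whose non-DC coefficients all lie in region-1.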
Therefore, it is desirable to simplify the context selection process, for example by eliminating the need to calculate the sum of the x-position and y-position of a coefficient or by eliminating other operations.