Field of the Invention
The present invention relates to video coding. In particular, the present invention relates to video coding techniques associated with loop filtering and processing across slice or tile boundaries.
Description of the Related Art
Motion estimation is an effective inter-frame coding technique to exploit temporal redundancy in video sequences. Motion-compensated inter-frame coding has been widely used in various international video coding standards. The motion estimation adopted in various coding standards is often a block-based technique, where motion information such as coding mode and motion vector is determined for each macroblock or similar block configuration. In addition, intra-coding is also adaptively applied, where the picture is processed without reference to any other picture. The inter-predicted or intra-predicted residues are usually further processed by transformation, quantization, and entropy coding to generate a compressed video bitstream. During the encoding process, coding artifacts are introduced, particularly in the quantization process. In order to alleviate the coding artifacts, additional processing can be applied to reconstructed video to enhance picture quality in newer coding systems. The additional processing is often configured in an in-loop operation so that the encoder and the decoder may derive the same reference pictures.
FIG. 1 illustrates an exemplary adaptive inter/intra video coding system incorporating an in-loop filtering process. For inter-prediction, Motion Estimation (ME)/Motion Compensation (MC) 112 is used to provide prediction data based on video data from another picture or pictures. Switch 114 selects Intra Prediction 110 or inter-prediction data from ME/MC 112, and the selected prediction data is supplied to Adder 116 to form prediction errors, also called prediction residues or residues. The prediction error is then processed by Transformation (T) 118 followed by Quantization (Q) 120. The transformed and quantized residues are then coded by Entropy Encoder 122 to form a video bitstream corresponding to the compressed video data. The bitstream associated with the transform coefficients is then packed with side information such as motion, mode, and other information associated with the image unit. The side information may also be processed by entropy coding to reduce required bandwidth. Accordingly, the side information data is also provided to Entropy Encoder 122, although the motion/mode paths to Entropy Encoder 122 are not shown in FIG. 1. When the inter-prediction mode is used, a previously reconstructed reference picture or pictures have to be used to form prediction residues. Therefore, a reconstruction loop is used to generate reconstructed pictures at the encoder end. Consequently, the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the processed residues. The processed residues are then added back to prediction data 136 by Reconstruction (REC) 128 to reconstruct the video data. The reconstructed video data may be stored in Reference Picture Buffer 134 and be used for prediction of other frames.
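The reconstruction loop described above can be illustrated with a minimal sketch. It is not the coding system of FIG. 1: the transformation is assumed to be an identity, the quantizer is a simple uniform scalar quantizer, and all function names (quantize, dequantize, reconstruct, encode_block) are hypothetical. It only shows why the encoder reconstructs from the quantized levels, so that the encoder and the decoder derive the same reference data.

```python
def quantize(residues, step):
    """Uniform scalar quantization, rounding half away from zero (illustrative only)."""
    return [(abs(r) + step // 2) // step * (1 if r >= 0 else -1) for r in residues]


def dequantize(levels, step):
    """Inverse quantization: scale each level back by the step size."""
    return [level * step for level in levels]


def reconstruct(levels, prediction, step):
    """Add the recovered residues back to the prediction data (the REC step)."""
    return [p + r for p, r in zip(prediction, dequantize(levels, step))]


def encode_block(source, prediction, step):
    """Form residues, quantize them, and run the encoder-side reconstruction loop.

    The transformation is assumed to be an identity here (an assumption of this sketch).
    """
    residues = [s - p for s, p in zip(source, prediction)]
    levels = quantize(residues, step)               # levels would be entropy coded
    recon = reconstruct(levels, prediction, step)   # same operation the decoder performs
    return levels, recon


source = [100, 104, 97, 90]
prediction = [98, 98, 98, 98]
levels, encoder_recon = encode_block(source, prediction, step=4)
decoder_recon = reconstruct(levels, prediction, step=4)
print(levels, encoder_recon, decoder_recon == encoder_recon)
```

The reconstructed samples differ from the source because of quantization, which is the impairment that the in-loop processing is intended to reduce.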
As shown in FIG. 1, incoming video data undergoes a series of processing in the encoding system. The reconstructed video data from REC 128 may be subject to various impairments due to the series of processing. Accordingly, various loop processing is applied to the reconstructed video data before the reconstructed video data is used as prediction data in order to improve video quality. In the High Efficiency Video Coding (HEVC) standard being developed, Deblocking Filter (DF) 130, Sample Adaptive Offset (SAO) 131 and Adaptive Loop Filter (ALF) 132 have been developed to enhance picture quality. The Deblocking Filter (DF) 130 is applied to boundary pixels, and the DF processing is dependent on the underlying pixel data and coding information associated with the corresponding blocks. No DF-specific side information needs to be incorporated in the video bitstream. On the other hand, the SAO and ALF processing are adaptive, where filter information such as filter parameters and filter type may be dynamically changed according to the underlying video data. Therefore, filter information associated with SAO and ALF is incorporated in the video bitstream so that a decoder can properly recover the required information. Accordingly, filter information from SAO and ALF is provided to Entropy Encoder 122 for incorporation into the bitstream. In FIG. 1, DF 130 is applied to the reconstructed video first; SAO 131 is then applied to DF-processed video; and ALF 132 is applied to SAO-processed video. However, the processing order among DF, SAO and ALF may be re-arranged. In the High Efficiency Video Coding (HEVC) standard being developed, the loop filtering process includes DF and SAO.
The coding process in HEVC is applied to each Largest Coding Unit (LCU). The LCU is adaptively partitioned into coding units (CUs) using a quadtree. Therefore, the LCU is also called coding tree block (CTB). In each leaf CU, DF is performed for each 8×8 block; in HEVC Test Model Version 5.0 (HM-5.0), the DF is applied to the 8×8 block boundaries. For each 8×8 block, horizontal filtering across vertical block boundaries is applied first, and then vertical filtering across horizontal block boundaries is applied.
Sample Adaptive Offset (SAO) 131 is also adopted in HM-5.0, as shown in FIG. 1. SAO is regarded as a special case of filtering where the processing only applies to one pixel. To apply SAO, a picture may be divided into multiple LCU-aligned regions. Each region can select one SAO type among two Band Offset (BO) types, four Edge Offset (EO) types, and no processing (OFF). For each to-be-processed (also called to-be-filtered) pixel, BO uses the pixel intensity to classify the pixel into a band. The pixel intensity range is equally divided into 32 bands, as shown in FIG. 2. Four consecutive bands are grouped together, where the starting band is indicated by sao_band_position. An exemplary 4-band group 200 is illustrated in FIG. 2. The first band position of this 4-band group is indicated by arrow 210. In EO, pixel classification is first done to classify pixels into different groups (also called categories or classes). The pixel classification for each pixel is based on a 3×3 window, as shown in FIG. 3, where four configurations corresponding to 0°, 90°, 135°, and 45° are used for classification. Upon classification of all pixels in a picture or a region, one offset is derived and transmitted for each group of pixels. In HM-5.0, SAO is applied to luma and chroma components, and each of the components is independently processed. Similar to BO, one offset is derived for all pixels of each category except for category 4 of EO, where category 4 is forced to use zero offset. Table 1 below lists the EO pixel classification, where “C” denotes the pixel to be classified.
TABLE 1
Category    Condition
0           C < two neighbors
1           C < one neighbor && C == one neighbor
2           C > one neighbor && C == one neighbor
3           C > two neighbors
4           None of the above
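The BO and EO classifications described above can be sketched as follows. This is an illustrative sketch rather than the HM-5.0 implementation. The function names and the neighbor offset table are hypothetical, 8-bit samples are assumed by default, the clipping of the result to the valid sample range is an assumption of this sketch, and the four-band group is not wrapped around the end of the intensity range. The EO categories follow the numbering of Table 1, and the neighbor pairs chosen for 135° and 45° are assumed to lie on the two diagonals of the 3×3 window.

```python
# (dy, dx) offsets of the two neighbors compared with pixel C for each EO configuration.
# The diagonal assignments are an assumption of this sketch.
EO_NEIGHBORS = {
    0: ((0, -1), (0, 1)),      # 0 degrees: left and right
    90: ((-1, 0), (1, 0)),     # 90 degrees: above and below
    135: ((-1, -1), (1, 1)),   # 135 degrees: upper-left and lower-right
    45: ((-1, 1), (1, -1)),    # 45 degrees: upper-right and lower-left
}


def band_index(pixel, bit_depth=8):
    """Map a pixel intensity to one of 32 equal-width bands."""
    return pixel >> (bit_depth - 5)


def apply_band_offset(pixel, band_position, offsets, bit_depth=8):
    """Add an offset if the pixel falls in one of the four bands starting at band_position."""
    k = band_index(pixel, bit_depth) - band_position
    if 0 <= k < 4:
        pixel += offsets[k]
    return min(max(pixel, 0), (1 << bit_depth) - 1)  # clipping is assumed here


def eo_category(c, a, b):
    """Classify pixel c against its two neighbors a and b per Table 1."""
    if c < a and c < b:
        return 0
    if (c < a and c == b) or (c == a and c < b):
        return 1
    if (c > a and c == b) or (c == a and c > b):
        return 2
    if c > a and c > b:
        return 3
    return 4  # none of the above: zero offset


def eo_category_at(picture, y, x, angle):
    """Classify an interior pixel of a 2-D list using the given EO configuration."""
    (dy0, dx0), (dy1, dx1) = EO_NEIGHBORS[angle]
    return eo_category(picture[y][x], picture[y + dy0][x + dx0], picture[y + dy1][x + dx1])


window = [[1, 2, 3],
          [4, 0, 6],
          [7, 8, 9]]
print(band_index(200), apply_band_offset(200, 24, [1, 2, 3, 4]))
print([eo_category_at(window, 1, 1, angle) for angle in (0, 90, 135, 45)])
```

In this sketch the center pixel of the example window is a local minimum in every configuration, so each configuration classifies it into category 0.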
Adaptive Loop Filtering (ALF) 132 is another in-loop filtering in HM-5.0 to enhance picture quality, as shown in FIG. 1. Multiple types of luma filter footprints and chroma filter footprints are used. The ALF operation is applied in the horizontal direction first. After horizontal ALF is performed, ALF is applied in the vertical direction. In HM-5.0, up to sixteen luma ALF filters and at most one chroma ALF filter can be used for each picture. In order to allow localization of ALF, there are two modes for luma pixels to select filters. One is a Region-based Adaptation (RA) mode, and the other is a Block-based Adaptation (BA) mode. In addition to the RA and BA for adaptation mode selection at picture level, Coding Units (CUs) larger than a threshold can be further controlled by filter usage flags to enable or disable ALF operations locally. As for the chroma components, since they are relatively flat, no local adaptation is used in HM-5.0, and the two chroma components of a picture share the same filter. In HM-5.0, an ALF filter for a region may be selected from multiple ALF filters. In addition, multiple filter footprints are used in HM-5.0. For each ALF filter, there is a set of filter coefficients associated with the filter. Therefore, the ALF information comprises identification for the selected ALF filter, the filter footprint and filter coefficients.
As shown in FIG. 1, DF 130 is applied to reconstructed pixels from REC 128. SAO 131 is then applied to DF-processed pixels and ALF 132 is applied to SAO-processed pixels. While the processing sequence illustrated in FIG. 1 is DF, SAO and ALF, other processing sequences may also be used. For example, SAO may be applied to reconstructed pixels from REC 128, DF-processed reconstructed pixels (i.e., DF applied to the reconstructed pixels), ALF-processed reconstructed pixels (i.e., ALF applied to reconstructed pixels), both DF-processed and ALF-processed pixels (i.e., DF applied to the reconstructed pixels and ALF applied to the DF-processed reconstructed pixels) or both ALF-processed and DF-processed pixels (i.e., ALF applied to the reconstructed pixels and DF applied to the ALF-processed reconstructed pixels). For convenience, the “processed-reconstructed pixels” may refer to any type of the processed pixels mentioned above during SAO processing. The “processed-reconstructed pixels” also include the reconstructed pixels from REC 128. In this case, it can be considered that a null processing is applied to the reconstructed pixels from REC 128. Similarly, the “processed-reconstructed pixels” may also refer to various types of the pixels processed by DF, SAO, both DF and SAO or both SAO and DF during ALF processing. Again, for ALF processing, the “processed-reconstructed pixels” also include the reconstructed pixels from REC 128.
To reduce side information associated with SAO processing, SAO information of a current LCU can reuse the SAO information of a neighboring LCU above or to the left of the current LCU. The SAO information sharing is indicated by merge syntax. In HM-8.0, SAO syntax consists of sao_merge_left_flag, sao_merge_up_flag, sao_type_idx_luma, sao_type_idx_chroma, sao_eo_class_luma, sao_eo_class_chroma, sao_band_position, sao_offset_abs, and sao_offset_sign, as shown in Table 2. Syntax sao_merge_left_flag indicates whether the current LCU reuses the SAO parameters of the left LCU. Syntax sao_merge_up_flag indicates whether the current LCU reuses the SAO parameters of the upper LCU. Syntax sao_type_idx represents the selected SAO type (sao_type_idx_luma and sao_type_idx_chroma for the luma component and the chroma component respectively). Syntax sao_offset_abs represents the offset magnitude and syntax sao_offset_sign represents the offset sign. Syntax cIdx indicates one of three color components. A similar mechanism can also be used to allow neighboring blocks to share the same ALF information.
TABLE 2
sao( rx, ry ) {                                                              Descriptor
  if( rx > 0 ) {
    leftCtbInSliceSeg = CtbAddrInSliceSeg > 0
    leftCtbInTile = ( TileId[ CtbAddrInTS ] == TileId[ CtbAddrRStoTS[ CtbAddrInRS − 1 ] ] )
    if( leftCtbInSliceSeg && leftCtbInTile )
      sao_merge_left_flag                                                    ae(v)
  }
  if( ry > 0 && !sao_merge_left_flag ) {
    upCtbInSliceSeg = ( CtbAddrInRS − PicWidthInCtbsY ) >= slice_segment_address
    upCtbInTile = ( TileId[ CtbAddrInTS ] == TileId[ CtbAddrRStoTS[ CtbAddrInRS − PicWidthInCtbsY ] ] )
    if( upCtbInSliceSeg && upCtbInTile )
      sao_merge_up_flag                                                      ae(v)
  }
  if( !sao_merge_up_flag && !sao_merge_left_flag ) {
    for( cIdx = 0; cIdx < 3; cIdx++ ) {
      if( ( slice_sao_luma_flag && cIdx == 0 ) ||
          ( slice_sao_chroma_flag && cIdx > 0 ) ) {
        if( cIdx == 0 )
          sao_type_idx_luma                                                  ae(v)
        else if( cIdx == 1 )
          sao_type_idx_chroma                                                ae(v)
        if( SaoTypeIdx[ cIdx ][ rx ][ ry ] != 0 ) {
          for( i = 0; i < 4; i++ )
            sao_offset_abs[ cIdx ][ rx ][ ry ][ i ]                          ae(v)
          if( SaoTypeIdx[ cIdx ][ rx ][ ry ] == 1 ) {
            for( i = 0; i < 4; i++ )
              if( sao_offset_abs[ cIdx ][ rx ][ ry ][ i ] != 0 )
                sao_offset_sign[ cIdx ][ rx ][ ry ][ i ]                     ae(v)
            sao_band_position[ cIdx ][ rx ][ ry ]                            ae(v)
          } else {
            if( cIdx == 0 )
              sao_eo_class_luma                                              ae(v)
            if( cIdx == 1 )
              sao_eo_class_chroma                                            ae(v)
          }
        }
      }
    }
  }
}
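The availability conditions that gate the two merge flags in Table 2 can be sketched as follows. This is a simplified illustration, not the HM-8.0 parser: the function name and its parameters are hypothetical, and the tile identifiers are assumed to be indexed directly by raster-scan CTB address (tile_id_rs), which replaces the TileId[ CtbAddrRStoTS[ ... ] ] lookups of Table 2. The function only reports whether each flag could be present; it does not model the rule that sao_merge_up_flag is parsed only when sao_merge_left_flag is zero.

```python
def sao_merge_availability(ctb_addr_rs, pic_width_in_ctbs, slice_start_addr_rs,
                           ctb_addr_in_slice, tile_id_rs):
    """Return (left_available, up_available) for the SAO merge flags of one CTB.

    tile_id_rs maps a raster-scan CTB address to its tile id (an assumption of this sketch).
    """
    rx = ctb_addr_rs % pic_width_in_ctbs
    ry = ctb_addr_rs // pic_width_in_ctbs

    left_available = False
    if rx > 0:
        left_in_slice = ctb_addr_in_slice > 0
        left_in_tile = tile_id_rs[ctb_addr_rs] == tile_id_rs[ctb_addr_rs - 1]
        left_available = left_in_slice and left_in_tile

    up_available = False
    if ry > 0:
        up_addr = ctb_addr_rs - pic_width_in_ctbs
        up_in_slice = up_addr >= slice_start_addr_rs
        up_in_tile = tile_id_rs[ctb_addr_rs] == tile_id_rs[up_addr]
        up_available = up_in_slice and up_in_tile

    return left_available, up_available


# A picture that is 4 CTBs wide and 2 CTBs high, one tile, with a slice starting at CTB 2.
single_tile = [0] * 8
print(sao_merge_availability(2, 4, 2, 0, single_tile))  # first CTB of the slice
print(sao_merge_availability(6, 4, 2, 4, single_tile))  # CTB directly below it
```

For the first CTB of the slice neither neighbor is usable, whereas the CTB directly below it can merge with both its left and its upper neighbor because both belong to the same slice and tile.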
The LCUs in a picture can be partitioned into slices, where each slice consists of multiple horizontally consecutive LCUs. In HM-5.0, another image unit structure, named tile, is introduced, where a picture is partitioned into multiple tiles. For example, a picture may be divided into M tiles horizontally and N tiles vertically, where M and N are integers greater than 0. Each tile consists of multiple LCUs. Within each tile, the processing sequence of the LCUs is according to the raster scan order. Within each picture, the processing sequence of the tiles is also according to the raster scan order. Tile boundaries are often aligned with LCU boundaries.
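The resulting LCU processing order can be illustrated with a short sketch. The function name and its parameters (tile column widths and tile row heights, both in units of LCUs) are hypothetical and chosen for illustration only. Tiles are visited in raster scan order within the picture, and LCUs are visited in raster scan order within each tile; the returned list contains the picture-level raster-scan addresses of the LCUs in processing order.

```python
def lcu_processing_order(col_widths, row_heights):
    """List raster-scan LCU addresses in tile-by-tile processing order.

    col_widths:  width of each tile column, in LCUs (left to right)
    row_heights: height of each tile row, in LCUs (top to bottom)
    """
    pic_width = sum(col_widths)
    order = []
    y0 = 0
    for height in row_heights:          # tile rows, top to bottom
        x0 = 0
        for width in col_widths:        # tiles within the row, left to right
            for y in range(y0, y0 + height):      # raster scan inside the tile
                for x in range(x0, x0 + width):
                    order.append(y * pic_width + x)
            x0 += width
        y0 += height
    return order


# Two tiles side by side, each 2 LCUs wide and 2 LCUs high.
print(lcu_processing_order([2, 2], [2]))
```

With two side-by-side tiles, the whole left tile is processed before any LCU of the right tile, so the order differs from the plain picture raster scan.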
In some systems, it is desirable to process the slices or tiles independently. Independent slice/tile processing will allow parallel processing of multiple slices or tiles. For CTBs or LCUs located at a left boundary or a top boundary of the slice or tile, SAO or ALF parameter sharing with a neighboring LCU above or to the left of the current LCU implies data dependency on an LCU from another slice or tile. Therefore, it is desirable to develop SAO or ALF processing that enables slice/tile independent processing.