High Efficiency Video Coding (HEVC) is a new coding standard that has been developed in recent years. In the HEVC system, the fixed-size macroblock of H.264/AVC is replaced by a flexible block, named coding unit (CU). Pixels in the CU share the same coding parameters to improve coding efficiency. A CU may begin with a largest CU (LCU), which is also referred to as a coding tree unit (CTU) in HEVC. In addition to the concept of coding unit, the concept of prediction unit (PU) is also introduced in HEVC. Once the splitting of the CU hierarchical tree is done, each leaf CU is further split into one or more prediction units (PUs) according to prediction type and PU partition.
In the current development of screen content coding for the High Efficiency Video Coding (HEVC) standard, some tools have been adopted due to their improvements in coding efficiency for screen contents. For Intra blocks, Intra prediction according to the conventional approach is performed using prediction based on reconstructed pixels from neighboring blocks. Intra prediction may select an Intra Mode from a set of Intra Modes, which include a vertical mode, a horizontal mode and various angular prediction modes. For HEVC screen content coding, a new Intra coding mode, named Intra-block copy (IntraBC), has been used.
Intra block copy (IBC) uses reconstructed samples in the current picture before in-loop filter as a reference picture for prediction. This unfiltered picture needs to be stored in addition to the filtered picture after in-loop filter. To store the reconstructed samples before in-loop filter, additional memory is required, and additional memory bandwidth is required for reading and writing these samples. In the case that all reconstructed samples before in-loop filter may be used as the reference for IBC prediction, the whole reconstructed picture before in-loop filter needs to be stored. Hence, both the reconstructed current picture before in-loop filter and the reconstructed current picture after in-loop filter need to be stored, for IBC prediction and for temporal prediction respectively. Therefore, Intra block copy memory access causes increased memory bandwidth. In addition, it also requires additional storage in the decoded picture buffer (DPB).
In order to store the additional reconstructed samples before in-loop filter, Section 8: general decoding process of High Efficiency Video Coding (HEVC) Screen Content Coding (SCC): Draft 3 (Joshi, et al., HEVC Screen Content Coding Draft Text 3, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 20th Meeting: Geneva, CH, 10-18 Feb. 2015, Document: JCTVC-T1005) is modified so that a picture storage buffer in the decoded picture buffer (DPB) is allocated for the current picture. In HEVC SCC: Draft 3, the reconstructed current picture is marked as “used for long-term reference” in the case that curr_pic_as_ref_enabled_flag is equal to 1. When curr_pic_as_ref_enabled_flag is equal to 1, the decoded sample values of the reconstructed current picture before in-loop filtering are stored into the picture storage buffer allocated for the current picture. After completing the decoding of all slices, the entire current decoded picture after in-loop filter is stored in the picture storage buffer allocated for the current picture and is marked as “used for short-term reference”. In HEVC and HEVC SCC: Draft 3, the decoded pictures are managed by the operation of the decoded picture buffer in Annex C.3 of HEVC SCC: Draft 3, which consists of a set of ordered processes including removal of pictures from the decoded picture buffer (DPB) (i.e., subclause C.3.2), picture output (i.e., subclause C.3.3), and current decoded picture marking and storage (i.e., subclause C.3.4). However, the operation of the DPB is not modified appropriately for the scenario that the reconstructed current picture may be used as a reference picture, as discussed below.
The operation of decoded picture buffer specifies when each of the operations happens so that DPB fullness can be controlled appropriately and the DPB fullness does not exceed the DPB maximum size limitation. In the subclause of removal of pictures from the DPB, for each picture that is removed from the DPB, the DPB fullness is decremented by one. In the subclause of current decoded picture marking and storage, the current decoded picture is stored in the DPB in an empty picture storage buffer and the DPB fullness is incremented by one. However, storing of the current picture before in-loop filter and updating of the DPB fullness are not specified in the operation of the DPB, in Annex C.3. Accordingly, the operation of DPB cannot be appropriately managed.
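The fullness bookkeeping described above can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the normative Annex C.3 process: the class name DecodedPictureBuffer, its methods, and the picture labels are hypothetical and are introduced only for illustration. The sketch shows that when an unfiltered copy of the current picture also occupies a picture storage buffer, a fullness count that is updated only for filtered pictures no longer reflects the storage actually needed.

```python
class DecodedPictureBuffer:
    """Toy model of DPB fullness bookkeeping (hypothetical, not Annex C.3)."""

    def __init__(self, max_dpb_size):
        self.max_dpb_size = max_dpb_size
        self.pictures = []  # labels of the pictures currently stored

    @property
    def fullness(self):
        # DPB fullness is the number of occupied picture storage buffers.
        return len(self.pictures)

    def remove(self, pic):
        # Removal of a picture: fullness is decremented by one.
        self.pictures.remove(pic)

    def store(self, pic):
        # Storage of a decoded picture: fullness is incremented by one,
        # and must not exceed the maximum DPB size.
        if self.fullness >= self.max_dpb_size:
            raise OverflowError("DPB fullness would exceed the maximum DPB size")
        self.pictures.append(pic)


dpb = DecodedPictureBuffer(max_dpb_size=2)
dpb.store("pic0_filtered")
dpb.store("pic1_filtered")

# Assumption for illustration: the unfiltered current picture also needs a
# picture storage buffer. With the buffer already full, storing it fails.
try:
    dpb.store("pic2_unfiltered")
    overflowed = False
except OverflowError:
    overflowed = True
```

In this sketch, the third store is rejected because no earlier step made room for, or counted, the unfiltered picture.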
As mentioned above, reading and writing the reconstructed samples before in-loop filter introduces additional memory bandwidth. In the Core Experiment 2 (CE2): Intra block copy memory access in JCTVC-T1102 (Rapaka, et al., Description of Core Experiment 2 (CE2): Intra block copy memory access, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 20th Meeting: Geneva, CH, 10-18 Feb. 2015, Document: JCTVC-T1102), memory bandwidth reduction is evaluated in order to avoid requiring one additional picture memory buffer and additional bandwidth for reading and writing reconstructed samples before in-loop filter in addition to the samples after in-loop filter. The reconstructed samples before in-loop filter are also referred to as unfiltered reconstructed samples in this disclosure. The reconstructed samples after in-loop filter are also referred to as filtered reconstructed samples in this disclosure.
Two categories of work are in progress to address this issue. In the first category, JCTVC-S0145 (Rapaka, et al., Bandwidth reduction method for intra block copy, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 19th Meeting: Strasbourg, FR, 17-24 Oct. 2014, Document: JCTVC-S0145) discloses a method to reduce average bandwidth when Intra block copy (IBC) mode is used for prediction. JCTVC-S0145 is based on the observation that not all previously decoded unfiltered samples of the current picture are used for prediction in IBC mode. In JCTVC-S0145, the method indicates which of the previously decoded coding tree blocks (CTBs) are used for IBC prediction. According to JCTVC-S0145, when a picture parameter set (PPS) level flag indicates the presence of such flags in the slice header, a flag is sent in the slice header for each block to indicate which CTBs need to be stored, so that the average bandwidth is reduced. However, the method of JCTVC-S0145 has been questioned regarding how to know in advance how many CTBs are in a slice and how many flags need to be sent. Furthermore, a DPB management method is not included in JCTVC-S0145. It is not clear whether the reconstructed samples before in-loop filter are stored in a memory buffer in the DPB or outside the DPB.
In the second category, JCTVC-T0045 (Lainema, et al., AHG10: Memory bandwidth reduction for intra block copy, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 20th Meeting: Geneva, CH, 10-18 Feb. 2015, Document: JCTVC-T0045) and JCTVC-T0051 (Laroche, et al., AHG10: On IBC memory reduction, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 20th Meeting: Geneva, CH, 10-18 Feb. 2015, Document: JCTVC-T0051) disclose a method to store either reconstructed samples before in-loop filter or reconstructed samples after in-loop filter by indicating which CTBs are used for IBC prediction, in order to address the issue of increased bandwidth and storage related to IBC. The general interest is to keep the memory bandwidth of IBC under the limit specified for HEVC Inter mode. This means that only one reconstructed picture buffer can be used for each picture, even in the scenario of IBC. JCTVC-T0051 also discloses a method to store either reconstructed samples before in-loop filter or reconstructed samples after in-loop filter in a picture memory buffer, which is typically outside the DPB. The operation of the DPB in Annex C.3 of JCTVC-T1005 may then be managed similarly to that in HEVC version 1, as discussed below. In addition, in JCTVC-T0051, the current and the previous CTBs are always considered as available for IBC prediction. Therefore, additional memory for storing these two CTBs is required in JCTVC-T0051.
If the memory buffer outside the DPB stores the reconstructed samples before in-loop filter, there will be a bitstream consistency issue with the HEVC SCC draft, which specifies that all reference pictures shall be present in the DPB when needed for prediction. The reason is that the current picture before in-loop filter is used as a reference picture and inserted in a reference picture list, as in HEVC SCC: Draft 3.
In the high level syntax, the current picture is placed after all short-term reference pictures and all other long-term reference pictures during the initialization of reference picture list construction. The related descriptions are listed below for List 0. A similar process can be applied for List 1.
At the beginning of the decoding process for each slice, the reference picture list RefPicList0 for P slices, and both reference picture lists RefPicList0 and RefPicList1 for B slices, are derived as shown in Table 1.
TABLE 1
NumRpsCurrTempList0 is set to Max( num_ref_idx_l0_active_minus1 + 1, NumPicTotalCurr ) and the list RefPicListTemp0 is constructed as follows:
    rIdx = 0
    while( rIdx < NumRpsCurrTempList0 ) {
        for( i = 0; i < NumPocStCurrBefore && rIdx < NumRpsCurrTempList0; rIdx++, i++ )
            RefPicListTemp0[ rIdx ] = RefPicSetStCurrBefore[ i ]
        for( i = 0; i < NumPocStCurrAfter && rIdx < NumRpsCurrTempList0; rIdx++, i++ )
            RefPicListTemp0[ rIdx ] = RefPicSetStCurrAfter[ i ]
        for( i = 0; i < NumPocLtCurr && rIdx < NumRpsCurrTempList0; rIdx++, i++ )
            RefPicListTemp0[ rIdx ] = RefPicSetLtCurr[ i ]
        if( curr_pic_as_ref_enabled_flag )
            RefPicListTemp0[ rIdx++ ] = currPic
    }
In Table 1, curr_pic_as_ref_enabled_flag equal to 1 specifies that a picture referring to the SPS (sequence parameter set) may be included in a reference picture list of the picture itself. curr_pic_as_ref_enabled_flag equal to 0 specifies that a picture referring to the SPS is never included in any reference picture list of the picture itself. When not present, the value of curr_pic_as_ref_enabled_flag is inferred to be equal to 0.
After the initialization, the reference picture list RefPicList0 is constructed as follows:
    for( rIdx = 0; rIdx <= num_ref_idx_l0_active_minus1; rIdx++ )
        RefPicList0[ rIdx ] = ref_pic_list_modification_flag_l0 ?
            RefPicListTemp0[ list_entry_l0[ rIdx ] ] : RefPicListTemp0[ rIdx ]
However, when the number of active reference pictures (i.e., num_ref_idx_l0_active_minus1 + 1) is smaller than the number of reference pictures (NumRpsCurrTempList0) in the temporary list RefPicListTemp0, which stores the current picture after the other reference pictures, the current picture may not be included in the active reference picture list.
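The List 0 construction described above, and the way the current picture can fall outside the active list, can be sketched in Python as follows. This is an illustrative sketch only, not the normative process: the function name build_ref_pic_list0, its parameters, and the picture labels are hypothetical, and NumPicTotalCurr is assumed here to count the current picture when curr_pic_as_ref_enabled_flag is equal to 1. Passing list_entry models ref_pic_list_modification_flag_l0 equal to 1.

```python
def build_ref_pic_list0(st_before, st_after, lt_curr, curr_pic,
                        num_active, curr_pic_as_ref_enabled,
                        list_entry=None):
    """Return (RefPicListTemp0, RefPicList0) for the given picture sets."""
    # Assumption: NumPicTotalCurr includes the current picture when enabled.
    num_pic_total_curr = (len(st_before) + len(st_after) + len(lt_curr)
                          + (1 if curr_pic_as_ref_enabled else 0))
    if num_pic_total_curr == 0:
        raise ValueError("no reference picture candidates")
    num_rps_curr_temp_list0 = max(num_active, num_pic_total_curr)

    temp = []
    while len(temp) < num_rps_curr_temp_list0:
        # Short-term before, short-term after, then long-term pictures.
        for group in (st_before, st_after, lt_curr):
            for pic in group:
                if len(temp) < num_rps_curr_temp_list0:
                    temp.append(pic)
        # The current picture is appended last in each pass. As in the
        # listing above, this step has no bound check of its own.
        if curr_pic_as_ref_enabled:
            temp.append(curr_pic)

    # Final list: first num_active entries, or the explicitly signalled ones.
    if list_entry is None:
        final = [temp[r_idx] for r_idx in range(num_active)]
    else:
        final = [temp[list_entry[r_idx]] for r_idx in range(num_active)]
    return temp, final


# Two active entries, four candidates: the current picture is left out.
temp0, list0 = build_ref_pic_list0(["P1", "P2"], [], ["L0"], "CUR",
                                   num_active=2, curr_pic_as_ref_enabled=True)

# With list modification, the current picture can be selected explicitly.
_, modified0 = build_ref_pic_list0(["P1", "P2"], [], ["L0"], "CUR",
                                   num_active=2, curr_pic_as_ref_enabled=True,
                                   list_entry=[3, 0])
```

In the first call, the temporary list holds P1, P2, L0 and CUR in that order, so an active list of two entries contains only P1 and P2. In the second call, the signalled entries select CUR and P1.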
In a coding system based on the existing HEVC, there is an issue associated with Decoded Picture Buffer (DPB) management for IntraBC. When IntraBC is used, the reconstructed portion of the current picture may be used as a reference picture to predict the current picture. This reference picture for IntraBC is referred to as “the unfiltered version of current picture”. On the other hand, the version of the current picture that will eventually go through filtering operations such as deblocking and SAO is referred to as “the filtered version of current picture”.
A reference picture has to be in the Decoded Picture Buffer (DPB) in order to be used by a current picture. The size of the DPB is constrained to be MaxDpbSize, which is derived as shown in Table 2.
TABLE 2
    if( PicSizeInSamplesY <= ( MaxLumaPs >> 2 ) )
        MaxDpbSize = Min( 4 * maxDpbPicBuf, 16 )
    else if( PicSizeInSamplesY <= ( MaxLumaPs >> 1 ) )
        MaxDpbSize = Min( 2 * maxDpbPicBuf, 16 )
    else if( PicSizeInSamplesY <= ( ( 3 * MaxLumaPs ) >> 2 ) )
        MaxDpbSize = Min( ( 4 * maxDpbPicBuf ) / 3, 16 )
    else
        MaxDpbSize = maxDpbPicBuf
In Table 2, MaxLumaPs is the maximum luma picture size and maxDpbPicBuf is the maximum DPB size, such as 6. However, there are some issues in the current DPB management operations when IntraBC is used.
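The derivation in Table 2 can be checked with a short Python sketch. The function name max_dpb_size is hypothetical, the division is assumed to be an integer division, and the example value of 2,228,224 for MaxLumaPs is an assumed level limit chosen only for illustration.

```python
def max_dpb_size(pic_size_in_samples_y, max_luma_ps, max_dpb_pic_buf=6):
    """Derive MaxDpbSize from the luma picture size, following Table 2."""
    if pic_size_in_samples_y <= (max_luma_ps >> 2):
        return min(4 * max_dpb_pic_buf, 16)
    elif pic_size_in_samples_y <= (max_luma_ps >> 1):
        return min(2 * max_dpb_pic_buf, 16)
    elif pic_size_in_samples_y <= ((3 * max_luma_ps) >> 2):
        # Integer division is assumed for ( 4 * maxDpbPicBuf ) / 3.
        return min((4 * max_dpb_pic_buf) // 3, 16)
    else:
        return max_dpb_pic_buf


MAX_LUMA_PS = 2_228_224  # assumed example limit, for illustration only

sizes = {
    "960x540": max_dpb_size(960 * 540, MAX_LUMA_PS),      # first branch
    "1280x720": max_dpb_size(1280 * 720, MAX_LUMA_PS),    # second branch
    "1600x900": max_dpb_size(1600 * 900, MAX_LUMA_PS),    # third branch
    "1920x1080": max_dpb_size(1920 * 1080, MAX_LUMA_PS),  # final branch
}
```

Under these assumptions, a picture close to the maximum luma picture size is limited to maxDpbPicBuf picture buffers, so any buffer taken by an unfiltered version of the current picture reduces the number available for other reference pictures.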