In the ongoing standardization tracks that extend H.265/High Efficiency Video Coding (HEVC) while preserving backward compatibility, namely scalable video coding and Three-Dimensional Video (3DV) coding, the latter including MV-HEVC (the HEVC multi-view video coding extension framework) and 3D-HEVC (3D High Efficiency Video Coding), a unified high-level structural design is adopted. The unified structural design is based on multi-layer video coding, which introduces the concept of a “layer” to represent the texture components and depth components of MV-HEVC and 3D-HEVC and the different scalable layers of scalable video coding, and which indicates different views and scalable layers by means of layer identifiers (Layer Ids). The currently issued H.265/HEVC standard is referred to as the H.265/HEVC Version 1 standard.
In multi-layer video coding, the video pictures obtained at the same time instant and their corresponding coding bits constitute an Access Unit (AU). Within the same AU, the picture of each layer may use a different coding method. Thus, in the same AU, the picture of a certain layer may be an Intra Random Access Point (IRAP) picture that can serve as a random access point, while the pictures on one or more other layers are ordinary inter-frame and inter-layer predicted pictures. In practical applications, different layers may select their own IRAP picture insertion policies according to network transmission conditions, changes in video content and the like. For example, a shorter IRAP picture insertion period may be adopted for video pictures on a Base Layer (BL) compatible with H.265/HEVC, and a relatively longer IRAP picture insertion period may be adopted for video pictures on an Enhancement Layer (EL). In this way, by means of a layer-wise accessed multi-layer video coding structure, the random access performance of a multi-layer video coding bitstream can be ensured without a sharp increase in coding bit-rate.
The BL bitstream in a multi-layer video coding bitstream should be compatible with the H.265/HEVC Version 1 standard. That is, the multi-layer video coding bitstream should ensure that a decoder designed according to the H.265/HEVC Version 1 standard can correctly decode the BL bitstream extracted from the multi-layer video coding bitstream. Specifically, for MV-HEVC and 3D-HEVC, the BL corresponds to a base view or an independent view, and the EL corresponds to an enhancement view or a dependent view. In practical applications, a base view bitstream for playback only on a traditional two-dimensional television, a dual-view bitstream supporting three-dimensional display and a multi-view bitstream for three-dimensional display may each be obtained by extraction from the multi-layer video coding bitstream.
In the H.265/HEVC Version 1 standard, there are three types of IRAP pictures, namely an Instantaneous Decoding Refresh (IDR) picture, a Broken Link Access (BLA) picture and a Clean Random Access (CRA) picture. All three picture types are coded in an intra coding mode and decoded without depending on other pictures. They differ in their operations on the Picture Order Count (POC) and the Decoded Picture Buffer (DPB).
The POC is an order count used to identify the display order of pictures in H.265/HEVC Version 1. According to the H.265/HEVC Version 1 standard, the POC value of a picture is composed of two parts. If the POC value of the picture is represented by PicOrderCntVal, then PicOrderCntVal=PicOrderCntMsb+PicOrderCntLsb, where PicOrderCntMsb represents the Most Significant Bit (MSB) value of the POC value of the picture, and PicOrderCntLsb represents the Least Significant Bit (LSB) value of the POC value of the picture. Generally, the PicOrderCntMsb value is equal to the PicOrderCntMsb value of the previous picture with TemporalId=0 that precedes the current picture in decoding order, and the PicOrderCntLsb value is equal to the value of the slice_pic_order_cnt_lsb field in the slice header information. The number of bits used to represent the slice_pic_order_cnt_lsb field in the bitstream is signalled by log2_max_pic_order_cnt_lsb_minus4 in the Sequence Parameter Set (SPS) and is equal to log2_max_pic_order_cnt_lsb_minus4+4.
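The composition of the POC value described above can be sketched as follows. This is only an illustration and not the normative decoding process: the function and parameter names are hypothetical, and the wrap-around adjustment of the MSB follows the general derivation of H.265/HEVC Version 1, which the description above summarizes as the MSB being generally equal to that of the previous TemporalId=0 picture.

```python
def max_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4):
    """Number of distinct LSB values, i.e. 2 ** (number of LSB bits)."""
    num_bits = log2_max_pic_order_cnt_lsb_minus4 + 4
    return 1 << num_bits


def derive_poc(slice_pic_order_cnt_lsb, prev_poc_lsb, prev_poc_msb,
               log2_max_pic_order_cnt_lsb_minus4):
    """Return PicOrderCntVal = PicOrderCntMsb + PicOrderCntLsb.

    prev_poc_lsb / prev_poc_msb belong to the previous TemporalId=0
    picture in decoding order (hypothetical parameter names).
    """
    max_lsb = max_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4)
    lsb = slice_pic_order_cnt_lsb

    if lsb < prev_poc_lsb and (prev_poc_lsb - lsb) >= max_lsb // 2:
        # The LSB wrapped around forwards: move to the next MSB period.
        msb = prev_poc_msb + max_lsb
    elif lsb > prev_poc_lsb and (lsb - prev_poc_lsb) > max_lsb // 2:
        # The LSB wrapped around backwards: move to the previous MSB period.
        msb = prev_poc_msb - max_lsb
    else:
        # General case: the MSB is inherited unchanged.
        msb = prev_poc_msb

    return msb + lsb
```

For example, with log2_max_pic_order_cnt_lsb_minus4 equal to 0 the LSB occupies 4 bits, so a picture with slice_pic_order_cnt_lsb equal to 5 that follows a TemporalId=0 picture with MSB 16 and LSB 3 obtains a POC value of 21.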
In H.265/HEVC Version 1, if the current picture is an IDR picture, the PicOrderCntMsb value is set to 0, the slice header information does not contain a slice_pic_order_cnt_lsb field, and the PicOrderCntLsb value defaults to 0. If the current picture is a BLA picture, the PicOrderCntMsb value is set to 0, and the slice header information contains a slice_pic_order_cnt_lsb field, which is used to determine the PicOrderCntLsb value. If the current picture is a CRA picture and the value of the flag HandleCraAsBlaFlag is equal to 0, the POC is calculated by the general method; if the current picture is a CRA picture and the value of the flag HandleCraAsBlaFlag is equal to 1, the POC value of the CRA picture is calculated by the method used for a BLA picture.
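The per-picture-type rules above can be sketched as follows. The enumeration, the function name and the general_msb parameter (the MSB value that the general method would produce) are hypothetical names introduced only for this illustration.

```python
from enum import Enum


class PicType(Enum):
    IDR = "IDR"
    BLA = "BLA"
    CRA = "CRA"
    OTHER = "OTHER"  # any non-IRAP picture


def poc_for_picture(pic_type, slice_pic_order_cnt_lsb, general_msb,
                    handle_cra_as_bla_flag=False):
    """Return PicOrderCntVal according to the picture type.

    general_msb is the PicOrderCntMsb obtained by the general method
    (assumed to be computed elsewhere).
    """
    if pic_type is PicType.IDR:
        # MSB is set to 0; the LSB field is absent and defaults to 0.
        return 0
    if pic_type is PicType.BLA or (
            pic_type is PicType.CRA and handle_cra_as_bla_flag):
        # MSB is set to 0; the LSB comes from the slice header.
        return slice_pic_order_cnt_lsb
    # CRA with HandleCraAsBlaFlag equal to 0, and non-IRAP pictures.
    return general_msb + slice_pic_order_cnt_lsb
```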
It is important to note that, in the multi-layer video coding standard, the slice header information of the EL always contains a slice_pic_order_cnt_lsb field regardless of the corresponding picture type.
On this basis, for the multi-layer video coding bitstream, all pictures in an AU are required to have the same POC value, so that pictures at the same time instant can be detected in the DPB detection process and so that a decoder can conveniently determine the start and end positions of each AU in the bitstream using the POC value. With a layer-wise coding structure, an AU may contain IRAP pictures and non-IRAP pictures at the same time. In that case, if the IRAP pictures are IDR pictures or BLA pictures, the POC values of the pictures contained in the AU will differ. As a result, it is necessary to design a POC alignment function for the multi-layer video coding standard so that all pictures in an AU have the same POC when the layer-wise structure is used.
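The mismatch described above can be illustrated with a small sketch. The layer numbering (0 for the BL, 1 for the EL), the POC values and the function name are assumptions chosen only for the example.

```python
def au_poc_aligned(poc_by_layer):
    """Return True if every picture in the AU carries the same POC value.

    poc_by_layer maps a layer identifier to the POC of that layer's picture.
    """
    return len(set(poc_by_layer.values())) <= 1


# Both layers carry ordinary pictures: the POC values agree.
au_before = {0: 31, 1: 31}

# Next AU: the BL inserts an IDR picture (POC becomes 0), while the EL
# carries a non-IRAP picture whose POC continues from the previous one.
au_with_bl_idr = {0: 0, 1: 32}

aligned_before = au_poc_aligned(au_before)        # True
aligned_after = au_poc_aligned(au_with_bl_idr)    # False
```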
In order to solve this problem, a POC alignment method was proposed in the JCT-VC standard conference proposal JCTVC-N0244. The method adds a 1-bit poc_reset_flag field to the slice header information by using a reserved bit. When the value of the field is equal to 1, the POC value of the picture is first calculated in accordance with the general method; a so-called POC shifting operation is then performed, in which the calculated POC value is subtracted from the POC values of the pictures of the same layer (including the BL) in the DPB; and finally the POC value of the picture is set to 0.
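The POC shifting operation of JCTVC-N0244, as summarized above, can be sketched as follows. The function name and the representation of the DPB as a list of POC values of one layer are assumptions made for illustration; the proposal itself is not reproduced here.

```python
def poc_reset(dpb_pocs, calculated_poc):
    """Apply the POC shifting operation for a picture with poc_reset_flag = 1.

    dpb_pocs:       POC values of the same-layer pictures held in the DPB.
    calculated_poc: POC of the current picture from the general method.
    Returns (shifted DPB POC values, POC of the current picture).
    """
    # Subtract the calculated POC from every same-layer picture in the DPB.
    shifted = [poc - calculated_poc for poc in dpb_pocs]
    # The current picture itself ends up with a POC value of 0.
    return shifted, 0


shifted_dpb, current_poc = poc_reset([28, 30, 31], 32)
# shifted_dpb is [-4, -2, -1] and current_poc is 0
```

The POC differences between the current picture and the pictures in the DPB are preserved by the shift, so relative references between pictures remain valid after the reset.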
The main defect of this method is that the BL bitstream is not compatible with the H.265/HEVC Version 1 standard; that is, it cannot be ensured that a decoder conforming to the H.265/HEVC Version 1 standard can decode the BL bitstream extracted from the multi-layer video coding bitstream.
In order to solve this compatibility problem, the JCT-VC conference proposals JCTVC-O0140 and JCTVC-O0213 proposed, on the basis of JCTVC-N0244, that only the MSB of the POC be set to 0 when POC alignment needs to be performed. Furthermore, JCTVC-O0213 adds an option of delaying POC alignment, in order to deal with the application scenario in which a slice carrying the flag bit indicating a reset of the POC value is lost and the application scenario in which frame rates differ among layers. JCTVC-O0176 proposed that POC alignment be performed directly in the case of an IDR picture, without explicit signalling by a flag bit in the slice header, while adding a reserved bit to the slice header of an IDR picture in the BL bitstream, which is used to calculate the POC value that the picture would have if it were a CRA picture rather than an IDR picture. The calculated POC value is used to perform a POC shifting operation on the pictures stored in the DPB of the ELs. JCTVC-O0275 proposed the concept of a layer POC, which maintains two different sets of POCs for the pictures on the EL. The layer POC is the POC value obtained when POC alignment is not performed, and its value is used for the relevant operations of the decoding algorithm, such as those for the Reference Picture Set (RPS). The other set of POCs is the POC subjected to POC alignment. The aligned POC value is consistent with that of the picture on the BL in the same AU, and it is used to control the picture output and display processes. According to the method proposed in JCTVC-O0275, information of the BL is used in the POC alignment process, a variable flag maintained inside the encoder/decoder triggers the POC alignment process, and the value of the flag is associated with the picture type information of the BL.
The above methods have the following defects.
The original POC value of each picture in the DPB is changed by the POC shifting operation performed on it in the POC alignment process of the current picture. Consequently, when a slice containing POC alignment information is lost, the POC values of the pictures in the DPB cannot be correctly shifted, so that correct reference pictures cannot be derived for subsequent pictures. Because of the wrongly shifted POC values, pictures which have already been correctly decoded and stored in the DPB will be marked as “wrong decoded pictures”.
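The effect of a lost slice can be illustrated with the following simplified sketch. It assumes, for illustration only, that reference pictures are located by adding POC deltas to the POC of the current picture; the function name and all numeric values are hypothetical.

```python
def resolve_references(dpb_pocs, current_poc, delta_pocs):
    """Look up reference pictures by POC; None marks a missing reference."""
    available = set(dpb_pocs)
    refs = []
    for delta in delta_pocs:
        target = current_poc + delta
        refs.append(target if target in available else None)
    return refs


dpb_before = [28, 30, 31]   # same-layer pictures already decoded
reset_poc = 32              # POC of the picture that carries the reset

# Case 1: the reset slice is received, so the DPB is shifted and the
# reset picture itself enters the DPB with a POC value of 0.
dpb_ok = [poc - reset_poc for poc in dpb_before] + [0]

# Case 2: the reset slice is lost, so the DPB keeps its old POC values.
dpb_lost = list(dpb_before)

# The next picture has POC 1 and refers to the two preceding pictures.
next_poc, deltas = 1, [-1, -2]
refs_ok = resolve_references(dpb_ok, next_poc, deltas)      # [0, -1]
refs_lost = resolve_references(dpb_lost, next_poc, deltas)  # [None, None]
```

In the second case the picture that originally had a POC value of 31 is still present and correctly decoded, but it can no longer be found under the POC value expected by the subsequent picture.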
In the case that frame rates differ among layers, when an AU contains an IDR picture of the BL but does not contain an EL picture, the POC shifting operation will not be executed at this EL in the POC alignment process, so that the picture output process cannot be performed correctly. Because of this problem, the delayed POC alignment operation in JCTVC-O0213 cannot ensure correct decoding and output of a multi-layer video coding bitstream.
In the case that the layer POC is employed, it is necessary to maintain two different sets of POC systems. However, after the POC alignment operation, the difference between the POCs of any two pictures is the same in both sets of POC systems; that is, the two sets of POC systems contain redundant information.
Among the above methods, both JCTVC-O0140 and JCTVC-O0213 need to use BL reserved bits, and, depending on the value of the reserved bits, the multi-layer video codec needs to execute operations different from those specified in the H.265/HEVC Version 1 standard when processing the BL stream. This prevents the quick and convenient reuse of existing product solutions conforming to the H.265/HEVC Version 1 standard when developing multi-layer video codec implementations. Although the JCTVC-O0176 method adds the bit information at the slice layer as a slice header extension without changing the BL decoding process, the slice header extension information is byte aligned, which inevitably adds extra bit overhead to the slice header.
In order to execute the POC alignment operation correctly, it is necessary to impose quite a number of restrictions on the coding structure in use. For example, JCTVC-O0176 requires that an EL picture must exist in any AU containing an IDR picture in the BL. Such restrictions reduce the flexibility of applications employing multi-layer video coding, particularly uncoordinated simulcast.
No effective solution to the above problems in the related art has yet been proposed.