The ongoing standardization tracks that extend H.265/High Efficiency Video Coding (HEVC) while remaining backward compatible with it, namely the HEVC multi-view video coding extension framework (MV-HEVC) for Three-Dimensional Video (3DV) coding and Scalable High-efficiency Video Coding (SHVC), adopt a unified common high-level structural design. This unified design is based on the concept of “multi-layer video coding”, which introduces the concept of a “layer” to represent different views in MV-HEVC and different scalable layers in SHVC, and which indicates views and scalable layers by means of layer identifiers (LayerIds).
In a working draft of the multi-layer video coding standard, the resource requirements for decoding a bitstream are represented by Profile, Tier and Level, namely (P, T, L). Likewise, in a session setup process, a decoder may express its own maximum decoding capability by using (P, T, L). When the maximum decoding capability of the decoder meets the resource requirements for decoding the bitstream, the decoder can correctly decode the bitstream. The current working draft of the multi-layer video coding standard signals two kinds of requirement: a (P, T, L) requirement corresponding to the accumulated resources needed for decoding the entire multi-layer video coding bitstream, and (P, T, L) requirements corresponding to the resources needed for decoding the individual layers. Herein, the identifier index value of the (P, T, L) corresponding to the accumulated resource requirement may be identical to the (P, T, L) identifier index value of an H.265/HEVC base layer (i.e., the H.265/HEVC Version 1 standard excluding the standard extensions), but the corresponding specific parameter values are different. For example, in the H.265/HEVC Version 1 standard, when the value of Level for the Main profile is 3.1, the value of the parameter MaxLumaPs is 983040; in the SHVC standard extension, when the value of Level of the accumulated resource requirement for the Scalable Main profile is 3.1, the value of the parameter MaxLumaPs is 2*983040=1966080. In contrast, the parameter values corresponding to the values of (P, T, L) that represent the resource requirements for decoding the individual layers are identical to those in the H.265/HEVC Version 1 standard.
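The relationship between the accumulated and the per-layer parameter values described above can be illustrated by a minimal sketch. Only the Level 3.1 figures (983040 and 2*983040=1966080) are taken from the text. The other two entries of the level table are quoted from the H.265/HEVC Version 1 level limits and should be checked against the standard. The names `PTL`, `max_luma_ps` and `can_decode`, the uniform factor of 2 for the Scalable Main profile, and the simplified capability comparison are assumptions made for illustration and are not the conformance rules of any working draft.

```python
from dataclasses import dataclass

# MaxLumaPs per Level in H.265/HEVC Version 1 (illustrative subset).
# The Level 3.1 value is the one quoted in the text.
V1_MAX_LUMA_PS = {"3": 552960, "3.1": 983040, "4": 2228224}

# ASSUMPTION: the factor of 2 that the text gives for Scalable Main at
# Level 3.1 is applied here to every level of that profile.
ACCUMULATED_SCALE = {"Main": 1, "Scalable Main": 2}

# ASSUMPTION: a simple ordering of tiers, used only for comparison.
TIER_ORDER = {"Main": 0, "High": 1}


@dataclass(frozen=True)
class PTL:
    """A (Profile, Tier, Level) triple; hypothetical helper type."""
    profile: str
    tier: str
    level: str


def max_luma_ps(ptl: PTL, accumulated: bool = False) -> int:
    """Return MaxLumaPs for a (P, T, L) triple.

    With accumulated=False the Version 1 value is returned (per-layer
    requirement); with accumulated=True the value is scaled according
    to the profile (requirement for the whole bitstream).
    """
    base = V1_MAX_LUMA_PS[ptl.level]
    scale = ACCUMULATED_SCALE[ptl.profile] if accumulated else 1
    return base * scale


def can_decode(capability: PTL, requirement: PTL) -> bool:
    """Simplified check: same profile, tier and level not exceeded."""
    return (
        capability.profile == requirement.profile
        and TIER_ORDER[capability.tier] >= TIER_ORDER[requirement.tier]
        and float(capability.level) >= float(requirement.level)
    )
```

In this sketch the same index value, Level 3.1, maps to 983040 for a single layer and to 1966080 for the accumulated requirement of a Scalable Main bitstream, which is the ambiguity the paragraph above describes.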
The technical proposal JCTVC-R0043, submitted to the JCT-VC standardization body, points out the defects of signalling a decoding capability in a bitstream by the foregoing method. It also suggests that, in addition to the foregoing information, the decoding resource requirements be signalled separately for each combination of output layers (i.e., each Partition) that may be used or needed for a multi-layer video coding bitstream. Thus, two different kinds of decoder (a multi-layer video coding decoder built from conventional H.265/HEVC Version 1 decoders, and a multi-layer video coding decoder implemented directly) can each decide whether the bitstream can be correctly decoded.
Although the existing method can sufficiently indicate the decoder capability needed for decoding the multi-layer video coding bitstream, in the initial period of communication session setup a terminal decoder does not have any information about the multi-layer video coding bitstream to be received, and thus cannot provide the output layer information that a server needs in order to judge and select the bitstream to be sent. If the server first sends detailed information about the bitstream to the terminal and the terminal then performs the selection, multiple rounds of negotiation must be conducted in the session setup process, which leads to low efficiency and high delay. When the network conditions change, for example when the network transmission rate changes from low to high, the server needs to re-send the detailed information of the bitstream in order to fully utilize network resources and achieve an optimal user experience; the terminal then selects the optimal combination of layers under the current conditions and feeds the selection back to the server to request the bitstream. Such repeated back-and-forth session processes increase the network burden and also occupy terminal processing resources.
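The terminal-side selection step described above can be sketched as follows. This is a hypothetical illustration only: the type `LayerCombination`, the function `select_combination`, the "highest rate that fits" criterion and all numeric values are assumptions, not part of any working draft or proposal. It shows why the terminal needs the server's detailed list before it can choose, and why the choice has to be repeated whenever the available transmission rate changes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class LayerCombination:
    """One combination of output layers as described by the server (hypothetical)."""
    layer_ids: Tuple[int, ...]  # LayerIds included in the combination
    level: float                # Level needed to decode the combination
    kbps: int                   # transmission rate needed to receive it


def select_combination(
    candidates: Sequence[LayerCombination],
    decoder_level: float,
    available_kbps: int,
) -> Optional[LayerCombination]:
    """Pick the highest-rate combination the terminal can decode and receive.

    Returns None when no combination fits the decoder capability and
    the currently available transmission rate.
    """
    feasible = [
        c for c in candidates
        if c.level <= decoder_level and c.kbps <= available_kbps
    ]
    return max(feasible, key=lambda c: c.kbps, default=None)


# Made-up description of a bitstream, as the server might send it.
offered = [
    LayerCombination(layer_ids=(0,), level=3.0, kbps=1000),
    LayerCombination(layer_ids=(0, 1), level=3.1, kbps=2500),
    LayerCombination(layer_ids=(0, 1, 2), level=4.0, kbps=6000),
]

# The same terminal makes a different choice once the rate rises,
# which in the existing method costs another round of negotiation.
low_rate_choice = select_combination(offered, decoder_level=3.1, available_kbps=1500)
high_rate_choice = select_combination(offered, decoder_level=3.1, available_kbps=5000)
```

Because `offered` is known only to the server at session setup, each evaluation of `select_combination` in this sketch corresponds to one extra exchange between the server and the terminal.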