High Efficiency Video Coding (HEVC) is a next generation video coding standard which is currently under development and standardization. HEVC aims at substantially improving coding efficiency compared to the state-of-the-art (H.264/AVC, aka MPEG-4 AVC), especially for high-resolution video content. The initial focus of the HEVC development is on mono video, i.e., one camera view only. However, given the relevance of multi-resolution and multi-view 3D representations, extensions towards scalable coding and multi-view video or depth map coding are planned or ongoing. Those extensions require multi-layer support.
An HEVC bitstream without extensions can be considered as a single-layer bitstream, i.e., a bitstream representing the video in a single representation, e.g., as a single video view having a single resolution and single quality. In multi-layer extensions, an HEVC single-layer bitstream is typically included as a “base layer”. For instance, in multi-view 3D extensions, additional layers may represent additional video views captured from different camera positions, depth information, or other information. Further, in scalability extensions, additional layers may represent the video in higher video picture resolutions, higher pixel fidelity, other color spaces, or the like, providing improved video quality relative to the base layer.
HEVC uses a video packetization concept based on Network Abstraction Layer (NAL) units. A compressed video bitstream consists of a sequence of NAL units. Each NAL unit can carry coded video data, so-called Video Coding Layer (VCL) data, also referred to as a “coded slice”, parameter data needed for video decoding, so-called Parameter Sets (PSs), or supplementary data, so-called Supplementary Enhancement Information (SEI). Each NAL unit consists of a NAL unit header and a NAL unit payload. The NAL unit header consists of a set of identifiers which can be used by networks to manage the compressed bitstreams. For example, in order to reduce the transmission bitrate of a video in case of limited network bandwidth, some NAL units may be discarded, based on information carried in the NAL unit headers, so as to minimize the quality degradation caused by discarding. This process is denoted as “bitstream thinning”.
In multi-layer HEVC extensions, each NAL unit will have a NAL unit header that includes elements that indicate which layer of the multiple layers the NAL unit is associated with. Such identifiers identify, e.g., a temporal layer (temporal_id), a spatial layer (dependency_id), a fidelity layer (quality_id), or a more generic layer (layer_id, or layer_id_plus1).
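The bitstream thinning described above can be illustrated with a minimal Python sketch. The `NalUnit` class and the `thin_bitstream` function are hypothetical names introduced here for illustration only, and the header is reduced to two of the identifiers mentioned above (temporal_id and layer_id). A practical thinning process would also have to respect inter-layer dependencies and keep the parameter sets needed by the remaining NAL units, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class NalUnit:
    """Simplified NAL unit: a header with two identifiers plus a payload.

    This is an illustrative model, not the actual HEVC header layout.
    """
    temporal_id: int
    layer_id: int
    payload: bytes


def thin_bitstream(nal_units: Iterable[NalUnit],
                   max_temporal_id: int,
                   max_layer_id: int) -> List[NalUnit]:
    """Keep only NAL units whose header identifiers are within the limits.

    The decision uses header information only; payloads are never parsed.
    """
    return [
        nal for nal in nal_units
        if nal.temporal_id <= max_temporal_id and nal.layer_id <= max_layer_id
    ]


# Example: drop the enhancement layer and the highest temporal layer.
units = [
    NalUnit(temporal_id=0, layer_id=0, payload=b"a"),
    NalUnit(temporal_id=1, layer_id=0, payload=b"b"),
    NalUnit(temporal_id=0, layer_id=1, payload=b"c"),
    NalUnit(temporal_id=2, layer_id=1, payload=b"d"),
]
kept = thin_bitstream(units, max_temporal_id=1, max_layer_id=0)
```

In this example only the two base-layer NAL units with temporal_id 0 and 1 remain after thinning.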
HEVC parameter sets (PSs) contain parameters needed in the decoding process. Examples of such parameters include the decoder profile, i.e., the mode of operation specifying the supported decoding algorithms, the decoder level, specifying implementation limits such as maximum supported picture size, frame rate, and bit rate, the video picture dimensions (video picture width and height), and parameters related to configuration of algorithms and settings necessary for decoding the compressed bitstream. Several different types of parameter sets exist, in particular Sequence Parameter Sets (SPSs), Picture Parameter Sets (PPSs), and Adaptation Parameter Sets (APSs). Introduction of further parameter set types, such as the Video Parameter Set (VPS) and the Group Parameter Set (GPS), is under discussion.
The SPS contains parameters that change very infrequently and are therefore valid for a complete video sequence. The PPS contains parameters that may change more frequently than SPS parameters, but typically do not change very frequently. The APS contains information that typically changes frequently, e.g., with every coded picture. In the envisioned scalable/3D extensions to HEVC, it is likely that these PS concepts will be re-used, and PSs will be present in different layers. In that context, the proposed VPS is envisioned to contain information that applies identically to several or all layers of a multi-layer bitstream, and which changes infrequently. Parameter sets typically have an identifier, “PS ID”, by which they can be referred to.
In the HEVC decoding process, PSs are “activated” when they are referred to by NAL units that contain coded slices, i.e., coded video data. When a PS is active, the values of the syntax elements, i.e., parameters, in the PS can be accessed by the decoder and used in the decoding process.
In the current draft HEVC specification, each parameter set is identified by a parameter set identifier, also referred to as parameter set reference. For instance, each SPS is associated with an identifier seq_parameter_set_id, each PPS is associated with an identifier pic_parameter_set_id, and each APS is identified by an identifier aps_id. Likewise, each VPS may be identified by an identifier vps_id, and each GPS may be identified by an identifier gps_id. The identifiers are typically coded using Variable Length Codes (VLCs), such as “Exp-Golomb” codes, which represent integer values 0, 1, 2, 3, . . . , where coding of lower values requires fewer bits.
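To make the relation between identifier value and code length concrete, the following Python sketch implements the unsigned Exp-Golomb code as it is commonly defined: the value plus one is written in binary and preceded by as many zero bits as there are bits after the leading one. The function names `ue_encode` and `ue_decode` are chosen here for illustration, and code words are represented as strings of '0' and '1' rather than packed bits.

```python
def ue_encode(value: int) -> str:
    """Return the unsigned Exp-Golomb code word of a non-negative integer."""
    if value < 0:
        raise ValueError("unsigned Exp-Golomb codes represent values >= 0")
    binary = bin(value + 1)[2:]          # binary digits of value + 1
    prefix = "0" * (len(binary) - 1)     # one leading zero per bit after the first
    return prefix + binary


def ue_decode(bits: str) -> int:
    """Decode a single unsigned Exp-Golomb code word from the start of `bits`."""
    leading_zeros = len(bits) - len(bits.lstrip("0"))
    code_word = bits[leading_zeros:2 * leading_zeros + 1]
    return int(code_word, 2) - 1


# Code words for the first few identifier values.
code_words = [ue_encode(v) for v in range(4)]
```

Here `code_words` is `['1', '010', '011', '00100']`: the identifier 0 costs one bit, identifiers 1 and 2 cost three bits each, and identifier 3 already costs five bits. In general the code length is 2·floor(log2(value + 1)) + 1 bits, so large parameter set identifiers become comparatively expensive.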
In the current draft HEVC specification, the following mechanisms for activating parameter sets exist:
A PPS is referenced by its pic_parameter_set_id in the slice header, i.e., by a field in a coded slice, and the referenced PPS is activated when the coded slice is decoded. Zero or one PPS can be active at a time.
SPSs are referenced by PPSs through their respective seq_parameter_set_id. When a PPS is activated, the referenced SPS is activated, too. Zero or one SPS can be active at a time.
APSs are referenced by their respective aps_id in the slice header, similarly to PPSs, and are activated when the slice is decoded.
A VPS (not in the current HEVC draft, but under discussion) may be referenced by SPSs through its vps_id and is activated when a referencing SPS is activated.
Alternatively, a GPS is under discussion which, if introduced, would replace the activation processes for APS, PPS, and SPS. A GPS would be activated by reference to its gps_id in the slice header when the slice is decoded. The GPS would include references to a PPS, an SPS, zero, one, or several APSs, and potentially a VPS. When the GPS is activated, the other PSs referenced in the GPS would be activated, too.
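The PPS and SPS activation mechanisms listed above can be sketched as follows. This is a simplified model under stated assumptions, not the normative decoding process: the classes `Sps`, `Pps`, and `ParameterSetState` and their methods are hypothetical, APS, VPS, and GPS handling is omitted, and the parameter set content is reduced to the identifiers that form the reference chain.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Sps:
    seq_parameter_set_id: int


@dataclass(frozen=True)
class Pps:
    pic_parameter_set_id: int
    seq_parameter_set_id: int  # reference to the SPS this PPS relies on


class ParameterSetState:
    """Stores received parameter sets and tracks which ones are active."""

    def __init__(self) -> None:
        self.sps_by_id: Dict[int, Sps] = {}
        self.pps_by_id: Dict[int, Pps] = {}
        self.active_pps: Optional[Pps] = None  # zero or one active PPS
        self.active_sps: Optional[Sps] = None  # zero or one active SPS

    def receive_sps(self, sps: Sps) -> None:
        # Receiving a parameter set only stores it; it does not activate it.
        self.sps_by_id[sps.seq_parameter_set_id] = sps

    def receive_pps(self, pps: Pps) -> None:
        self.pps_by_id[pps.pic_parameter_set_id] = pps

    def decode_slice(self, pic_parameter_set_id: int) -> None:
        """Activate the PPS named in the slice header and the SPS it references."""
        pps = self.pps_by_id[pic_parameter_set_id]
        sps = self.sps_by_id[pps.seq_parameter_set_id]
        self.active_pps = pps
        self.active_sps = sps


state = ParameterSetState()
state.receive_sps(Sps(seq_parameter_set_id=0))
state.receive_pps(Pps(pic_parameter_set_id=3, seq_parameter_set_id=0))
inactive_before_slice = state.active_pps is None and state.active_sps is None
state.decode_slice(pic_parameter_set_id=3)
```

Note that the slice header carries only the PPS identifier; the SPS becomes active indirectly through the reference stored in the PPS.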
In the draft HEVC single-layer specification, parameter sets are identified by their respective parameter set identifier (PS ID). In some cases, this may not be efficient in a multi-layer HEVC extension since coding of parameter set identifiers may require too many bits.
Another problem associated with the prior art is related to the activation chain for activation of SPSs. An SPS is activated when referenced by a PPS which is activated by being referenced by a slice header (SH) which is being decoded. This particular activation chain may be illustrated as SH→PPS→SPS. In a multi-layer HEVC video representation, different layers typically require specific SPSs, i.e., some of the SPSs cannot be re-used across several layers. Due to the activation chain SH→PPS→SPS, the presence of a dedicated SPS for a certain layer requires the presence of a separate PPS, since the SPS is referenced by the PPS. Adding PPSs for that purpose causes bitrate overhead, i.e., increases the number of bits to be transmitted.
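The overhead caused by the SH→PPS→SPS chain can be made tangible with a small counting sketch in Python. The function `count_required_pps` and its input format are assumptions made for illustration: each layer is described by the identifier of the SPS it needs and by a hashable summary of its picture-level settings. The second returned number is only a hypothetical point of comparison (the count if a PPS did not have to name an SPS); it does not describe any mechanism in the draft specification.

```python
from typing import Hashable, List, Tuple


def count_required_pps(layers: List[Tuple[int, Hashable]]) -> Tuple[int, int]:
    """Count PPSs needed for a set of layers.

    Each layer is a pair (sps_id, pps_settings).
    Returns (with_chain, settings_only):
      with_chain    - PPSs needed when every PPS references exactly one SPS
      settings_only - distinct picture-level settings, for comparison only
    """
    with_chain = len({(sps_id, settings) for sps_id, settings in layers})
    settings_only = len({settings for _, settings in layers})
    return with_chain, settings_only


# Three layers with identical picture-level settings but layer-specific SPSs.
layers = [
    (0, "default_pps_settings"),
    (1, "default_pps_settings"),
    (2, "default_pps_settings"),
]
with_chain, settings_only = count_required_pps(layers)
```

For this input, three PPSs are needed under the activation chain although their picture-level content is identical; the two additional PPSs differ only in the SPS identifier they carry.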