High Efficiency Video Coding (HEVC) is a new video coding standard currently being developed by the Joint Collaborative Team on Video Coding (JCT-VC). JCT-VC is a collaborative project between the Moving Picture Experts Group (MPEG) and the International Telecommunication Union-Telecommunication Standardization Sector (ITU-T). An HEVC Model (HM) has been defined that includes a number of new tools and is considerably more efficient than H.264/Advanced Video Coding (AVC).
A picture in HEVC is partitioned into one or more slices, where each slice is an independently decodable segment of the picture. This means that if a slice is missing, for instance because it was lost during transmission, the other slices of that picture can still be decoded correctly. To make slices independent, no bitstream element of another slice of the same picture is required for decoding any element of a particular slice.
Each slice contains a slice header that provides all the data required for the slice to be independently decodable. One example of a data element in the slice header is the slice address, which tells the decoder the spatial location of the slice. Another example is the slice quantization delta, which tells the decoder what quantization parameter to use at the start of the slice. The slice header contains many more data elements.
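The two slice header elements mentioned above can be illustrated with a minimal Python sketch. It is not the HEVC syntax: the field names, the raster-order interpretation of the slice address, and the parameters `blocks_per_row` and `picture_base_qp` are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceHeader:
    # Hypothetical subset of a slice header; a real header has many more fields.
    slice_address: int   # index of the first block of the slice (assumed raster order)
    slice_qp_delta: int  # offset relative to an assumed picture-level base QP


def slice_start_position(header: SliceHeader, blocks_per_row: int) -> tuple:
    """Map the slice address to a (column, row) block position in the picture."""
    row, col = divmod(header.slice_address, blocks_per_row)
    return col, row


def slice_start_qp(header: SliceHeader, picture_base_qp: int) -> int:
    """Quantization parameter to use at the start of the slice."""
    return picture_base_qp + header.slice_qp_delta


header = SliceHeader(slice_address=23, slice_qp_delta=-2)
position = slice_start_position(header, blocks_per_row=10)  # (3, 2)
start_qp = slice_start_qp(header, picture_base_qp=26)       # 24
```

Because both values are derived from the slice's own header plus picture-level parameters, no other slice of the same picture has to be parsed first.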
HEVC also has mechanisms for handling reference pictures, which are previously decoded pictures to be used for decoding of a current picture. A reference picture in HEVC is a picture in the decoded picture buffer (DPB) that is available for reference by being marked “used for reference.” The pictures to be used as reference pictures are included in reference picture lists, which in HEVC are similar to the reference picture lists in H.264. The reference picture lists are then used in the decoding process of the current slice in the current picture.
HEVC also defines a temporal_id for each picture, corresponding to the temporal layer that the picture belongs to. Temporal layers are ordered and are used for temporal scalability, where higher temporal layers can be removed without affecting the decoding of lower temporal layers. That means that if temporal layer A is higher than temporal layer B, a picture belonging to temporal layer A can use a picture from temporal layer B for prediction, but a picture belonging to temporal layer B cannot use a picture from temporal layer A for prediction. In HEVC, it is proposed to use absolute signaling of reference pictures instead of signaling reference picture modifications in a relative way as in previous standards, e.g. H.264. The absolute signaling is realized by signaling to the decoder which reference pictures to keep, either explicitly in a Buffer Description for each picture or through a reference to a Sequence Parameter Set (SPS). The Buffer Description is also referred to as a Reference Picture Set (RPS).
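The temporal layer rule described above can be sketched in Python as follows. The `Picture` type and both function names are hypothetical, and the sketch assumes that a picture may also predict from a picture in its own temporal layer, which the text above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    poc: int          # Picture Order Count
    temporal_id: int  # temporal layer; a larger value means a higher layer


def may_predict_from(current: Picture, candidate: Picture) -> bool:
    """True if `current` may use `candidate` for prediction.

    Assumption: the same temporal layer is allowed; only higher layers are excluded.
    """
    return candidate.temporal_id <= current.temporal_id


def remove_layers_above(pictures, max_temporal_id):
    """Temporal scalability: keep only pictures at or below the given layer."""
    return [p for p in pictures if p.temporal_id <= max_temporal_id]


layer_a = Picture(poc=1, temporal_id=2)  # higher temporal layer
layer_b = Picture(poc=0, temporal_id=0)  # lower temporal layer

a_uses_b = may_predict_from(layer_a, layer_b)  # True
b_uses_a = may_predict_from(layer_b, layer_a)  # False
```

Since no retained picture can depend on a removed one, `remove_layers_above` leaves a sequence that is still decodable.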
Picture Order Count (POC) is used in HEVC to define the display order of pictures and also to identify reference pictures. In the first drafts of HEVC, not only POC but also temporal_id was signaled for each reference picture in a Buffer Description. The values of POC and temporal_id in the Buffer Description must be identical to the values of POC and temporal_id signaled in the slice header of the reference picture to which the entry refers. temporal_id is used during the Buffer Description decoding process for reference pictures that are included in the Buffer Description but not available in the DPB, in order to deduce whether such a missing picture has been unintentionally lost or correctly removed. If the reference picture in a Buffer Description has a higher temporal_id than the current picture, it is deemed correctly removed and the decoding process can continue; otherwise it is deemed unintentionally lost and the current picture may not be correctly decodable.
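The lost-versus-removed decision described above can be sketched in Python as follows. This is an illustrative sketch only: `BufferEntry`, `classify_missing`, and the representation of the DPB as a set of POC values are assumptions, not part of any HEVC draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferEntry:
    # One reference picture as signaled in a Buffer Description (early-draft style).
    poc: int
    temporal_id: int


def classify_missing(buffer_description, dpb_pocs, current_temporal_id):
    """Split the entries that are absent from the DPB into (removed, lost) POC lists."""
    removed, lost = [], []
    for entry in buffer_description:
        if entry.poc in dpb_pocs:
            continue  # picture is available, nothing to deduce
        if entry.temporal_id > current_temporal_id:
            removed.append(entry.poc)  # higher layer: deemed correctly removed
        else:
            lost.append(entry.poc)     # same or lower layer: deemed unintentionally lost
    return removed, lost


description = [BufferEntry(8, 0), BufferEntry(10, 2), BufferEntry(12, 1)]
removed, lost = classify_missing(description, dpb_pocs={8}, current_temporal_id=1)
# removed == [10], lost == [12]
```

A non-empty `lost` list indicates that the current picture may not be correctly decodable, whereas entries in `removed` do not prevent decoding from continuing.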
It can be noted that the process of deducing whether a missing picture has been unintentionally lost or correctly removed is independent of the actual Buffer Description decoding process and could be performed before or after the Buffer Description decoding process.
temporal_id is also used in the reference picture list construction process. Reference pictures that belong to higher temporal layers than the temporal layer of the current picture are not included in the reference picture lists of the current picture.
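The temporal_id filtering step of reference picture list construction can be sketched in Python as follows. The sketch covers only the exclusion of higher temporal layers and the "used for reference" marking; the ordering of the list and the split into separate lists are not modeled, and `DecodedPicture` and `reference_candidates` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodedPicture:
    poc: int
    temporal_id: int
    used_for_reference: bool  # marking state of the picture in the DPB


def reference_candidates(dpb, current_temporal_id):
    """Pictures eligible for the reference picture lists of the current picture.

    Keeps DPB order; actual list ordering rules are outside this sketch.
    """
    return [
        pic for pic in dpb
        if pic.used_for_reference and pic.temporal_id <= current_temporal_id
    ]


dpb = [
    DecodedPicture(poc=0, temporal_id=0, used_for_reference=True),
    DecodedPicture(poc=2, temporal_id=2, used_for_reference=True),   # higher layer
    DecodedPicture(poc=4, temporal_id=1, used_for_reference=False),  # not marked
    DecodedPicture(poc=6, temporal_id=1, used_for_reference=True),
]
candidate_pocs = [p.poc for p in reference_candidates(dpb, current_temporal_id=1)]
# candidate_pocs == [0, 6]
```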
A schematic illustration of the decoding process at a high level as proposed in HEVC can be seen in FIG. 1.