Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC) standard, and extensions of such standards. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video compression techniques.
Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (i.e., a video frame or a portion of a video frame) may be partitioned into video blocks, which may also be referred to as treeblocks, coding units (CUs) and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
Spatial or temporal prediction produces a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and the residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which may then be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned in order to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.
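The residual, transform, quantization, and scan steps described above can be illustrated with a minimal Python sketch. It is not the procedure of any particular standard: the floating-point DCT-II, the single uniform quantization step, and the zigzag scan are stand-ins chosen for clarity, and all function names (`residual_block`, `dct_2d`, `quantize`, `zigzag_scan`) are hypothetical. Entropy coding is omitted.

```python
import math

def residual_block(original, predicted):
    """Per-pixel difference between the original block and the predictive block."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]

def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of a square block (illustrative, O(n^4))."""
    n = len(block)
    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Uniform quantization with one step size (an assumption for this sketch)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

def zigzag_scan(block):
    """Flatten a square 2-D array into a 1-D list along alternating anti-diagonals."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]

# A flat 4x4 block predicted slightly too dark: every residual sample is 2.
original = [[12] * 4 for _ in range(4)]
predicted = [[10] * 4 for _ in range(4)]
residual = residual_block(original, predicted)
vector = zigzag_scan(quantize(dct_2d(residual), step=2))
print(vector)  # a single nonzero DC coefficient followed by zeros
```

In this example all of the residual energy collapses into one coefficient, which is the property that makes the scanned vector cheap to entropy code.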
A given coded video sequence encoded to a bitstream includes an ordered sequence of coded pictures. In the H.264/AVC and HEVC standards, the decoding order of the coded pictures for a bitstream is equivalent to the ordered sequence. However, the standards also support an output order of decoded pictures that differs from the decoding order, and in such cases each of the coded pictures is associated with a picture order count (POC) value that specifies the output order for the picture in the video sequence.
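The distinction between decoding order and output order can be shown with a short Python sketch. The `CodedPicture` type, the picture names, and the POC values are hypothetical, and the sketch sorts a complete list at once, whereas an actual decoder releases pictures incrementally from a decoded picture buffer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CodedPicture:
    name: str  # hypothetical label for illustration
    poc: int   # picture order count: position in output order

def output_order(decoded):
    """Return the pictures sorted by POC, i.e. in output (display) order."""
    return sorted(decoded, key=lambda pic: pic.poc)

# Pictures listed in decoding order. P1 is decoded before the B pictures
# that are output ahead of it, so the two orders differ.
decoding_order = [
    CodedPicture("I0", 0),
    CodedPicture("P1", 4),
    CodedPicture("B2", 2),
    CodedPicture("B3", 1),
    CodedPicture("B4", 3),
]

names = [pic.name for pic in output_order(decoding_order)]
print(names)  # ['I0', 'B3', 'B2', 'B4', 'P1']
```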
Video timing information for a video sequence may be signaled in syntax elements of one or more syntax structures (alternatively referred to as “parameter set structures” or simply “parameter sets”). The syntax structures may include a sequence parameter set (SPS) that includes coding information that applies to all slices of a coded video sequence. The SPS may itself include parameters referred to as video usability information (VUI), which include hypothetical reference decoder (HRD) information as well as information for enhancing the use of the corresponding video sequence for various purposes. The HRD information may itself be signaled using an HRD syntax structure includable within other syntax structures such as the VUI syntax structure. The syntax structures may also include a video parameter set (VPS) that describes characteristics of a corresponding video sequence, such as common syntax elements shared by multiple layers or operation points as well as other operation point information that may be common to multiple sequence parameter sets, such as HRD information for various layers or sub-layers.
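The containment relationships among these syntax structures can be sketched as plain Python data classes. This is only an illustration of the nesting described above: the class names, the field names (`bit_rate`, `cpb_size`, `hrd_per_layer`, and so on), and the `hrd_locations` helper are hypothetical and do not reproduce the syntax element names of any standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class HrdParameters:
    bit_rate: int  # hypothetical field
    cpb_size: int  # hypothetical field

@dataclass
class Vui:
    # An HRD syntax structure may be carried inside the VUI.
    hrd: Optional[HrdParameters] = None

@dataclass
class Sps:
    sps_id: int
    vps_id: int                # the VPS this SPS refers to
    vui: Optional[Vui] = None  # VUI is optional within the SPS

@dataclass
class Vps:
    vps_id: int
    # HRD information for various layers or sub-layers, shared by
    # every SPS that refers to this VPS.
    hrd_per_layer: List[HrdParameters] = field(default_factory=list)

def hrd_locations(sps: Sps, vps: Vps) -> List[str]:
    """List which syntax structures carry HRD information for a sequence."""
    found = []
    if sps.vui is not None and sps.vui.hrd is not None:
        found.append("SPS/VUI")
    if sps.vps_id == vps.vps_id and vps.hrd_per_layer:
        found.append("VPS")
    return found

vps = Vps(vps_id=0, hrd_per_layer=[HrdParameters(bit_rate=1_000_000, cpb_size=500_000)])
sps_with_vui = Sps(sps_id=0, vps_id=0,
                   vui=Vui(hrd=HrdParameters(bit_rate=800_000, cpb_size=400_000)))
sps_plain = Sps(sps_id=1, vps_id=0)

print(hrd_locations(sps_with_vui, vps))  # ['SPS/VUI', 'VPS']
print(hrd_locations(sps_plain, vps))     # ['VPS']
```

The second SPS carries no VUI of its own, so the only HRD information available for its sequence is the information shared through the VPS.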