Technological Field
This disclosure relates to multi-layer video coding. More particularly, this disclosure relates to methods for conformance and interoperability in multi-layer video coding, including signaling of profile, tier, and level information, signaling of output layer sets, the use of hypothetical reference decoder (HRD) parameters, and bitstream conformance tests.
Related Art
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like. Digital video devices implement one or more video coding techniques. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
Video coding techniques include, without limitation, those described in the standards defined by ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 or ISO/IEC MPEG-4 Advanced Video Coding (AVC) (including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions), and the relatively new High Efficiency Video Coding (HEVC) standard. The HEVC standard was recently finalized by the Joint Collaborative Team on Video Coding (JCT-VC) of the Video Coding Experts Group (VCEG) of the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T) and the Moving Picture Experts Group (MPEG), formed by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). The latest working draft (WD) of the HEVC specification, referred to as HEVC WD10, is available from phenix.int-evry.fr/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1003-v34.zip. The multiview extension to HEVC, namely MV-HEVC, is also being developed by the JCT-3V. A recent working draft of MV-HEVC, referred to as MV-HEVC WD3 hereinafter, is available from phenix.it-sudparis.eu/jct2/doc_end_user/documents/3_Geneva/wg11/JCT3V-C1004-v4.zip. The scalable extension to HEVC, named SHVC, is also being developed by the JCT-VC. A recent working draft of SHVC, referred to as SHVC WD2 hereinafter, is available from phenix.int-evry.fr/jct/doc_end_user/documents/13_Incheon/wg11/JCTVC-M1008-v1.zip.
Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video frame or a portion of a video frame) may be partitioned into video blocks, which may also be referred to as treeblocks, coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and the residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which then may be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned in order to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.
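The inter-coding steps described above (motion-compensated prediction, residual computation, transform, quantization, and scanning) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any of the cited standards: the function names, the 8x8 reference picture, the floating-point orthonormal DCT (standardized codecs typically use integer transforms), the uniform quantization step, and the zigzag scan order (the scan order a particular standard uses may differ). Entropy coding is omitted.

```python
import math

def predict_block(reference, top, left, mv, size):
    """Fetch the predictive block that the motion vector (dy, dx) points to."""
    dy, dx = mv
    return [[reference[top + dy + r][left + dx + c] for c in range(size)]
            for r in range(size)]

def residual(original, predicted):
    """Sample-wise difference between the original and predictive blocks."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]

def dct_2d(block):
    """Orthonormal 2-D DCT-II in floating point (illustrative only)."""
    n = len(block)
    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for r in range(n):
                for c in range(n):
                    s += (block[r][c]
                          * math.cos((2 * r + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * c + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * s
    return out

def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round."""
    return [[round(c / step) for c in row] for row in coeffs]

def zigzag_scan(block):
    """Flatten a square 2-D array into a 1-D list along anti-diagonals."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Odd diagonals run top-to-bottom, even diagonals bottom-to-top.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )
    return [block[r][c] for r, c in order]

# Hypothetical 8x8 reference picture and a 4x4 block at (2, 2) in the
# current picture that equals the motion-compensated block plus 2.
reference = [[10 * r + c for c in range(8)] for r in range(8)]
mv = (1, -1)
predicted = predict_block(reference, top=2, left=2, mv=mv, size=4)
original = [[p + 2 for p in row] for row in predicted]

res = residual(original, predicted)          # every entry is 2
levels = quantize(dct_2d(res), step=2)       # only the DC level is nonzero
vector = zigzag_scan(levels)                 # 1-D vector for entropy coding
```

Because the residual in this example is constant, the transform concentrates it into a single DC coefficient, so the scanned vector is one nonzero level followed by zeros, which is the kind of sequence entropy coding compresses well.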
A multiview coding bitstream may be generated by encoding views, e.g., from multiple perspectives. Some three-dimensional (3D) video standards have been developed that make use of multiview coding aspects. For example, different views may carry left and right eye views to support 3D video. Alternatively, some 3D video coding processes may apply so-called multiview plus depth coding. In multiview plus depth coding, a 3D video bitstream may contain not only texture view components, but also depth view components. For example, each view may comprise one texture view component and one depth view component.