Video coding standards include Recommendation H.261 of the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T H.261), MPEG-1 Video of the Moving Picture Experts Group of the International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC), ITU-T H.262 or ISO/IEC MPEG-2 Video, ITU-T H.263, ISO/IEC MPEG-4 Visual, ITU-T H.264, which is also known as ISO/IEC MPEG-4 AVC (Advanced Video Coding), the scalable video coding (SVC) extension of H.264/AVC, and the multi-view video coding (MVC) extension of H.264/AVC. The scalable video coding extension and the multi-view video coding extension were included in the November 2007 and March 2009 releases of ITU-T Recommendation H.264, respectively.
An encoded bitstream according to H.264/AVC or its extensions, e.g., SVC and MVC, is either a NAL unit stream or a byte stream formed by prefixing a start code to each NAL unit in a NAL unit stream. A NAL unit stream is a concatenation of a number of NAL units. A NAL unit comprises a NAL unit header and a NAL unit payload. The NAL unit header contains, among other items, the NAL unit type, which indicates whether the NAL unit contains a coded slice, a coded slice data partition, a sequence or picture parameter set, and so on.
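As an illustration of the byte stream and NAL unit header described above, the following minimal Python sketch splits a start-code-prefixed byte stream into NAL units and reads the one-byte H.264/AVC NAL unit header. The function and class names (split_byte_stream, parse_nal_header, NalHeader) are hypothetical names chosen for this example. The sketch assumes the three-byte start code prefix 0x000001, optionally preceded by zero bytes, does not remove emulation prevention bytes from the payload, and does not parse the additional header bytes carried by SVC and MVC NAL units.

```python
from typing import Iterator, NamedTuple

START_CODE = b"\x00\x00\x01"  # three-byte start code prefix

# A subset of the nal_unit_type values defined for H.264/AVC and its extensions.
NAL_TYPE_NAMES = {
    1: "coded slice of a non-IDR picture",
    2: "coded slice data partition A",
    3: "coded slice data partition B",
    4: "coded slice data partition C",
    5: "coded slice of an IDR picture",
    6: "supplemental enhancement information",
    7: "sequence parameter set",
    8: "picture parameter set",
    13: "sequence parameter set extension",
    15: "subset sequence parameter set",
    20: "coded slice extension (SVC/MVC)",
}


class NalHeader(NamedTuple):
    forbidden_zero_bit: int  # 1 bit
    nal_ref_idc: int         # 2 bits
    nal_unit_type: int       # 5 bits


def parse_nal_header(first_byte: int) -> NalHeader:
    """Unpack the first byte of a NAL unit into its three header fields."""
    return NalHeader(first_byte >> 7, (first_byte >> 5) & 0x03, first_byte & 0x1F)


def split_byte_stream(data: bytes) -> Iterator[bytes]:
    """Yield each NAL unit (header plus payload) found in a byte stream."""
    start = data.find(START_CODE)
    while start != -1:
        begin = start + len(START_CODE)
        nxt = data.find(START_CODE, begin)
        end = len(data) if nxt == -1 else nxt
        # Zero bytes before the next start code are padding, not NAL unit data,
        # because the last byte of a NAL unit is never 0x00.
        nal = data[begin:end].rstrip(b"\x00")
        if nal:
            yield nal
        start = nxt


if __name__ == "__main__":
    stream = b"\x00\x00\x00\x01\x67\x42" + b"\x00\x00\x01\x68\xce"
    for nal in split_byte_stream(stream):
        header = parse_nal_header(nal[0])
        print(header, NAL_TYPE_NAMES.get(header.nal_unit_type, "other"))
```

A receiver could use the nal_unit_type field obtained in this way to route parameter sets, SEI messages, and coded slices to different handlers.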
In H.264/AVC and its extensions, parameters that remain unchanged through a coded video sequence are included in a sequence parameter set. In addition to parameters that are essential to the decoding process, the sequence parameter set may optionally contain video usability information (VUI), which includes parameters that are important for buffering, picture output timing, rendering, and resource reservation. Three NAL units are specified to carry sequence parameter sets: the sequence parameter set NAL unit, containing all the data for H.264/AVC VCL NAL units in the sequence; the sequence parameter set extension NAL unit, containing the data for auxiliary coded pictures; and the subset sequence parameter set for MVC and SVC VCL NAL units. A picture parameter set contains parameters that are likely to remain unchanged across several coded pictures. No picture header is present in H.264/AVC bitstreams; instead, the frequently changing picture-level data is repeated in each slice header, and picture parameter sets carry the remaining picture-level parameters. H.264/AVC syntax allows for many instances of sequence and picture parameter sets, and each instance is identified with a unique identifier. Each slice header includes the identifier of the picture parameter set that is active for the decoding of the picture that contains the slice, and each picture parameter set contains the identifier of the active sequence parameter set. Consequently, the transmission of picture and sequence parameter sets does not have to be accurately synchronized with the transmission of slices. Instead, it is sufficient that the active sequence and picture parameter sets are received at any moment before they are referenced, which allows parameter sets to be transmitted using a more reliable transmission mechanism than the protocols used for the slice data.
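The identifier-based referencing described above can be sketched as a small lookup structure. The following Python example is an illustrative sketch only: the class names (Sps, Pps, ParameterSetStore), the method names, and the width and height fields are hypothetical and stand in for the actual parameter set contents. It shows that parameter sets may arrive in any order and over any channel, provided that both are present before a slice references them.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Sps:
    sps_id: int   # identifier of this sequence parameter set
    width: int    # hypothetical stand-in for sequence-level parameters
    height: int


@dataclass
class Pps:
    pps_id: int   # identifier of this picture parameter set
    sps_id: int   # identifier of the sequence parameter set it refers to


class ParameterSetStore:
    """Holds received parameter sets, keyed by their identifiers."""

    def __init__(self) -> None:
        self._sps: Dict[int, Sps] = {}
        self._pps: Dict[int, Pps] = {}

    def put_sps(self, sps: Sps) -> None:
        self._sps[sps.sps_id] = sps

    def put_pps(self, pps: Pps) -> None:
        self._pps[pps.pps_id] = pps

    def resolve(self, slice_pps_id: int) -> Optional[Tuple[Pps, Sps]]:
        """Follow slice header -> picture parameter set -> sequence parameter set.

        Returns None if either parameter set has not been received yet.
        """
        pps = self._pps.get(slice_pps_id)
        if pps is None:
            return None
        sps = self._sps.get(pps.sps_id)
        if sps is None:
            return None
        return pps, sps


if __name__ == "__main__":
    store = ParameterSetStore()
    # Arrival order does not matter, only arrival before the referencing slice.
    store.put_pps(Pps(pps_id=0, sps_id=0))
    store.put_sps(Sps(sps_id=0, width=1280, height=720))
    print(store.resolve(0))
```

Because resolution happens only when a slice is decoded, the parameter sets in this sketch could be delivered out of band, for example during session setup, well before any slice data arrives.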
Coded video bitstreams may include extra information to enhance the use of the video for a wide variety of purposes. For example, supplemental enhancement information (SEI) and video usability information (VUI), as defined in H.264/AVC, provide such functionality. The H.264/AVC standard and its extensions support SEI signaling through SEI messages. SEI messages are not required by the decoding process to generate correct sample values in output pictures. Rather, they are helpful for other purposes, e.g., error resilience and display. H.264/AVC contains the syntax and semantics for the specified SEI messages, but no process for handling the messages in the recipient is defined. Consequently, encoders are required to follow the H.264/AVC standard when they create SEI messages, and decoders conforming to the H.264/AVC standard are not required to process SEI messages for output order conformance. One of the reasons to include the syntax and semantics of SEI messages in H.264/AVC is to allow system specifications, such as 3GPP multimedia specifications and DVB specifications, to interpret the supplemental information identically and hence interoperate. It is intended that system specifications can require the use of particular SEI messages at both the encoding end and the decoding end, and the process for handling SEI messages in the recipient may be specified for the application in a system specification.
In multi-view video coding, video sequences output from different cameras, each corresponding to a different view, are encoded into one bitstream. After decoding, to display a certain view, the decoded pictures that belong to that view are reconstructed and displayed. It is also possible for more than one view to be reconstructed and displayed.
In multi-view video coding, pictures of a video signal can be categorized, e.g., as anchor pictures or non-anchor pictures. An anchor picture is a coded picture in which all slices reference only slices with the same temporal index, i.e., only slices in other views and not slices in earlier pictures of the current view. An anchor picture can be signaled by setting a parameter anchor_pic_flag to a first value, such as logical 1. After the anchor picture is decoded, all following coded pictures in display order can be decoded without inter-prediction from any picture decoded prior to the anchor picture. If one view component of a coded picture is an anchor view component, then all other view components of the same coded picture are also anchor view components. Consequently, decoding of any view can be started from a temporal index that corresponds to anchor pictures. If a picture is a non-anchor picture, the parameter anchor_pic_flag is set to a second value, such as logical 0.
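The use of anchor pictures as starting points for decoding can be illustrated with a short Python sketch. The names ViewComponent and first_random_access_index are hypothetical and are used here only for illustration. The sketch assumes that each view component has already been labeled with its temporal index and its anchor_pic_flag value, and it relies on the property stated above that all view components of a coded picture share the same anchor status.

```python
from typing import Iterable, NamedTuple, Optional


class ViewComponent(NamedTuple):
    temporal_index: int   # position of the coded picture in time
    view_id: int          # which camera view this component belongs to
    anchor_pic_flag: int  # 1 for an anchor picture, 0 for a non-anchor picture


def first_random_access_index(
    components: Iterable[ViewComponent], requested_index: int
) -> Optional[int]:
    """Return the earliest temporal index at or after requested_index
    from which decoding of any view can be started, or None if there is none.

    Since all view components of a coded picture share the same anchor
    status, finding one anchor view component at a temporal index suffices.
    """
    candidates = [
        c.temporal_index
        for c in components
        if c.anchor_pic_flag == 1 and c.temporal_index >= requested_index
    ]
    return min(candidates, default=None)


if __name__ == "__main__":
    # Two views, four temporal indices; anchors at temporal indices 0 and 3.
    stream = [
        ViewComponent(t, v, 1 if t in (0, 3) else 0)
        for t in range(4)
        for v in range(2)
    ]
    print(first_random_access_index(stream, 1))
```

A player seeking to a given temporal index could use such a search to find where decoding can begin without reference to earlier pictures.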
One of the problems in stereoscopic video coding is the expansion of the bit rate compared to conventional single-view video. Even with the inter-view prediction techniques provided by multi-view video coding, the bit rate of a stereo video bitstream is often close to double that of the respective single-view bitstream. Such a bit rate increase is often not acceptable when transmission throughput or storage space requirements are traded off against the expected volume of devices capable of viewing stereoscopic content. Hence, achieving additional compression remains a significant challenge in stereoscopic and multi-view video coding.