With the emergence of 3D displays in the market, including stereoscopic and autostereoscopic displays, there is a strong demand for more 3D content. Coding 3D content is typically a challenging task because the content usually involves multiple views and possibly corresponding depth maps as well. Each frame of 3D content may require the system to handle a huge amount of data. In typical 3D video applications, multiview video signals need to be transmitted or stored efficiently because of limitations in transmission bandwidth, storage capacity, and processing capability, for example. Multiview Video Coding (MVC) extends H.264/Advanced Video Coding (AVC) using high level syntax to facilitate the coding of multiple views. This syntax aids in the subsequent handling of the 3D images by image processors.
H.264/AVC, though designed ostensibly for 2D video, can also be used to transmit stereo content by exploiting a frame-packing technique. The technique of frame packing is presented simply as follows: on the encoder side, two views or pictures are generally downsampled and packed into a single video frame, which is then supplied to an H.264/AVC encoder for output as a bitstream; on the decoder side, the bitstream is decoded and the recovered frame is then unpacked. Unpacking permits the extraction of the two original views from the recovered frame and generally involves an upsampling operation to restore each view to its original size so that the views can be rendered for display. This approach can be used for two or more views, such as multi-view images, or for views accompanied by depth information and the like.
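The pack and unpack steps described above can be illustrated with a minimal sketch. It assumes a side-by-side arrangement, plain column decimation for downsampling, and column repetition for upsampling; none of these choices is mandated by the text, and the H.264/AVC encode and decode stages between packing and unpacking are omitted. The function names are hypothetical, and views are represented as lists of rows of sample values.

```python
def downsample_columns(view):
    """Keep every second column (plain decimation, no anti-alias filter)."""
    return [row[::2] for row in view]


def upsample_columns(half_view):
    """Restore the original width by repeating each column once."""
    return [[sample for sample in row for _ in range(2)] for row in half_view]


def pack_side_by_side(left, right):
    """Pack two equally sized views into one frame of the same size."""
    if len(left) != len(right):
        raise ValueError("views must have the same number of rows")
    width = len(left[0])
    if width % 2 != 0:
        raise ValueError("this sketch assumes an even view width")
    if any(len(row) != width for row in left + right):
        raise ValueError("views must have identical dimensions")
    half_left = downsample_columns(left)
    half_right = downsample_columns(right)
    # Left half of the packed frame holds view 0, right half holds view 1.
    return [l_row + r_row for l_row, r_row in zip(half_left, half_right)]


def unpack_side_by_side(frame):
    """Split a packed frame and upsample each half to the full width."""
    mid = len(frame[0]) // 2
    left = upsample_columns([row[:mid] for row in frame])
    right = upsample_columns([row[mid:] for row in frame])
    return left, right


left_view = [[1, 2, 3, 4], [5, 6, 7, 8]]
right_view = [[10, 20, 30, 40], [50, 60, 70, 80]]

packed = pack_side_by_side(left_view, right_view)
restored_left, restored_right = unpack_side_by_side(packed)
```

The packed frame has the same dimensions as one original view, which is why it can be fed to an ordinary 2D encoder. The restored views have the original size, but half of the horizontal detail is lost in the round trip.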
Frame packing may rely on the existence of ancillary information associated with the frame and its views. Supplemental enhancement information (SEI) messages may be used to convey some frame-packing information. As an example, in a draft amendment of AVC, it has been proposed that an SEI message be used to inform a decoder of various spatial interleaving characteristics of a packed picture, including that the constituent pictures are formed by checkerboard spatial interleaving. By employing the SEI message, it is possible to encode the checkerboard interleaved picture of stereo video images using AVC directly. FIG. 26 shows a known example of checkerboard interleaving. To date, however, the contents of the SEI message and of other high level syntax have been limited in the information they convey about pictures or views that have been subjected to frame packing.
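Checkerboard spatial interleaving of two views can be sketched as follows. The sketch assumes that the sample at row r and column c is taken from the first view when r + c is even and from the second view otherwise; which view occupies the top-left position is an assumption made here for illustration, and in practice it is the kind of detail a decoder would need to be told. The function names are hypothetical, and no SEI syntax is modeled.

```python
def checkerboard_interleave(view0, view1):
    """Build one frame whose samples alternate between two views."""
    rows, cols = len(view0), len(view0[0])
    if len(view1) != rows or any(len(row) != cols for row in view0 + view1):
        raise ValueError("views must have identical dimensions")
    return [
        [view0[r][c] if (r + c) % 2 == 0 else view1[r][c] for c in range(cols)]
        for r in range(rows)
    ]


def checkerboard_deinterleave(frame):
    """Separate the two views; positions with no sample are set to None."""
    view0, view1 = [], []
    for r, row in enumerate(frame):
        row0, row1 = [], []
        for c, sample in enumerate(row):
            if (r + c) % 2 == 0:
                row0.append(sample)
                row1.append(None)
            else:
                row0.append(None)
                row1.append(sample)
        view0.append(row0)
        view1.append(row1)
    return view0, view1


first_view = [[1, 2], [3, 4]]
second_view = [[5, 6], [7, 8]]

interleaved = checkerboard_interleave(first_view, second_view)
sparse_first, sparse_second = checkerboard_deinterleave(interleaved)
```

Each recovered view retains only half of its samples, so a decoder would still need to interpolate the positions marked None before display. Without signaling of the interleaving pattern, the decoder cannot tell which samples belong to which view.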