Multilayer pictures include scalable video pictures, which support scalability; multi-view video pictures, which comprise pictures from a plurality of viewpoints; stereoscopic three-dimensional (3D) video pictures; and the like. Encoding/decoding techniques that use such multilayer pictures include scalable video coding, 3D video coding, and the like. The Joint Collaborative Team on Video Coding (JCT-VC) is conducting studies on scalable video coding standards (for example, SHVC), while the Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V) is conducting studies on 3D video coding standards (for example, 3D-HEVC). The JCT-VC and JCT-3V are groups of video coding experts from the ITU-T Study Group 16 Visual Coding Experts Group (VCEG) and the ISO/IEC JTC 1/SC 29/WG 11 Moving Picture Experts Group (MPEG).
Scalable video standards cover advanced data formats and related technologies that use scalability to let users watch either base-layer video or enhancement-layer video, the latter offering higher picture quality, larger image size, or higher frame rate than the base-layer video, so that the video suits various transmission and reproduction environments.
FIG. 1 illustrates a basic scalable video coding system considered in scalable video standards.
A transmitter side acquires picture contents having scalability information by downsampling an input picture. The acquired picture contents may include temporal, spatial, and quality (signal-to-noise ratio (SNR)) scalability information on the picture. The picture contents are compressed by a scalable encoder using a scalable video encoding method, and the compressed bit stream is transmitted to a terminal through a network.
A receiver side decodes the transmitted bit stream by a scalable decoder using a scalable video decoding method to reconstruct a picture suitable for a user environment.
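As a rough illustration of how downsampling at the transmitter side can yield layers that a receiver side selects from, the following minimal sketch derives a base layer with reduced image size and frame rate from an input sequence. All function names and the 2x2 averaging filter are assumptions made for illustration; they are not taken from the scalable video standards, and the compression and transmission steps are omitted entirely.

```python
def downsample_spatial(frame, factor=2):
    """Average non-overlapping factor x factor blocks of a 2D list of samples."""
    height = len(frame) // factor
    width = len(frame[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            block = [
                frame[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def downsample_temporal(frames, factor=2):
    """Keep every factor-th frame, which lowers the frame rate."""
    return frames[::factor]


def build_layers(frames):
    """Derive a base layer (smaller, lower frame rate) from the full input."""
    base = [downsample_spatial(f) for f in downsample_temporal(frames)]
    return {"base": base, "enhancement": frames}


def select_layer(layers, display_width):
    """Pick the layer suited to the display width of the user environment."""
    full_width = len(layers["enhancement"][0][0])
    if display_width >= full_width:
        return layers["enhancement"]
    return layers["base"]
```

In this sketch a terminal with a narrow display would reconstruct only the base layer, while a terminal able to show the full width would use the enhancement layer.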
3D video standards cover advanced data formats and related technologies that support representation of not only stereoscopic images but also multi-view images captured by a plurality of cameras, using real images and their depth maps.
FIG. 2 illustrates a basic 3D video system considered in 3D video standards.
A transmitter side acquires N-view (N ≥ 2) picture contents using a stereo camera, a depth camera, a multi-view camera, and a converter that converts a 2D picture into a 3D picture. The acquired picture contents may include N-view video information, depth map information thereof, and camera-related side information. The N-view picture contents are compressed by a 3DV encoder using a multi-view video encoding method, and the compressed bit stream is transmitted to a terminal through a network.
A receiver side decodes the transmitted bit stream by a 3DV decoder using a multi-view video decoding method to reconstruct N-view pictures. Virtual-view pictures of N or more views can be generated from the reconstructed N-view pictures by depth-image-based rendering (DIBR). The virtual-view pictures of N or more views are reproduced in a manner suitable for various 3D display apparatuses to provide users with pictures having a 3D effect.
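The DIBR step can be illustrated with a heavily simplified sketch. It assumes rectified, parallel cameras, so that each pixel moves only horizontally by a disparity of focal length times baseline divided by depth, and it assumes the virtual camera lies to the right of the reference camera. The function name and parameters are hypothetical, and hole filling for disoccluded regions, which a practical renderer would need, is left out.

```python
def render_virtual_view(color, depth, focal_length, baseline):
    """Warp one reference view into a horizontally shifted virtual view.

    color and depth are 2D lists of equal size; depth holds positive distances.
    Pixels with no source sample stay None (holes are not filled here).
    """
    height, width = len(color), len(color[0])
    virtual = [[None] * width for _ in range(height)]
    nearest = [[float("inf")] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            z = depth[y][x]
            # Disparity in pixels for rectified parallel cameras (assumption).
            disparity = round(focal_length * baseline / z)
            xv = x - disparity
            # When several pixels land on one target, keep the closest one.
            if 0 <= xv < width and z < nearest[y][xv]:
                nearest[y][xv] = z
                virtual[y][xv] = color[y][x]
    return virtual
```

Calling the function with several different baseline values would produce several virtual views from a single reconstructed view and its depth map, which is how more views than were transmitted can be offered to a display.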
FIG. 3 illustrates an encoder/decoder supporting multilayer pictures used for scalable video coding of FIG. 1 or multi-view video coding of FIG. 2 according to an embodiment.
Referring to FIG. 3, the base layer may be encoded/decoded independently, and the enhancement layer may be encoded/decoded using encoded information on the base layer. Further, the layers may be encoded/decoded in dependence on one another using correlation information between the layers.
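One way to picture an enhancement layer that uses base-layer information is inter-layer prediction, in which the reconstructed base-layer picture is upsampled and only the difference from the original picture is carried in the enhancement layer. The sketch below is an assumption-laden illustration, not the method of FIG. 3: the helper names are hypothetical, nearest-neighbor upsampling stands in for whatever resampling filter a real codec would use, and transform, quantization, and entropy coding are omitted.

```python
def upsample_nearest(frame, factor=2):
    """Repeat each sample factor times horizontally and vertically."""
    out = []
    for row in frame:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def encode_enhancement(original, base_recon, factor=2):
    """Return the residual between the original and the base-layer prediction."""
    prediction = upsample_nearest(base_recon, factor)
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, prediction)
    ]


def decode_enhancement(residual, base_recon, factor=2):
    """Rebuild the enhancement-layer picture from residual plus prediction."""
    prediction = upsample_nearest(base_recon, factor)
    return [
        [r + p for r, p in zip(res_row, pred_row)]
        for res_row, pred_row in zip(residual, prediction)
    ]
```

Because the base layer is decodable on its own, a decoder that ignores the residual still obtains a usable picture, while a decoder that applies the residual recovers the higher-resolution one.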