In general, digital data is transmitted from a certain type of transmitting device to a certain type of receiving device. A transmitting device typically comprises an encoder that encodes the data for transmission, and a receiving device typically comprises a decoder that decodes the received data. A variety of digital data, such as video data, audio data, and audio/video data, can be transmitted from a transmitting device to a receiving device and output through the receiving device.
The dominant video compression and transmission formats come from a family called hybrid block-based motion-compensated transform video coders. Examples of such coders are the video coding standards of the ITU-T VCEG (Video Coding Experts Group) and ISO/IEC MPEG (Moving Picture Experts Group), which comprise H.261, MPEG-1, H.262/MPEG-2 Video, H.263, and MPEG-4 Visual, as well as the in-process draft standard H.264/AVC. Moreover, coding and compression standards are in place to synchronize and multiplex the signals for various other types of media, including still pictures, audio, documents, and webpages.
Video streams are generally made up of three types of frames or pictures: the intra frame (I frame), the predictive frame (P frame), and the bi-directionally predictive frame (B frame).
The I frame is simply encoded by discrete cosine transform, without using motion estimation/compensation. The P frame undergoes motion estimation/compensation with reference to an I frame or other P frames, and the remaining data is then encoded by discrete cosine transform. The B frame is motion-compensated like the P frame, but its motion estimation/compensation is carried out from two frames on the time axis.
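The reference relationships described above can be illustrated with a minimal sketch. It assumes, for illustration only, that frames are listed in display order, that a P frame refers to the nearest preceding I or P frame, and that a B frame refers to the nearest I or P frame on each side of it on the time axis. The function name `reference_frames` is hypothetical and not part of any standard.

```python
def reference_frames(frame_types):
    """Map each frame index to the indices of the frames it is predicted from.

    Assumption (not stated in the text): frames are given in display order,
    and only the nearest I/P frames are used as references.
    """
    # I and P frames are the only frames that other frames may refer to.
    anchors = [i for i, t in enumerate(frame_types) if t in ("I", "P")]
    refs = {}
    for i, t in enumerate(frame_types):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if t == "I":
            # Intra frame: coded on its own, no motion estimation/compensation.
            refs[i] = []
        elif t == "P":
            # Predictive frame: refers to an earlier I or P frame.
            refs[i] = [] if prev_anchor is None else [prev_anchor]
        elif t == "B":
            # Bi-directionally predictive frame: refers to two frames,
            # one before and one after it on the time axis.
            refs[i] = [a for a in (prev_anchor, next_anchor) if a is not None]
        else:
            raise ValueError(f"unknown frame type: {t!r}")
    return refs


sequence = list("IBBPBBP")
for index, sources in reference_frames(sequence).items():
    print(index, sequence[index], sources)
```

In this sketch the two B frames between an I frame and the following P frame both refer to that same pair of frames, which is why a B frame cannot be reconstructed until both of its reference frames are available.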
A sequence in a video stream, such as an MPEG-2 video stream, is defined by a segment called the group of pictures (GOP). In the structure of I, B, B, P, B, B, P, . . . , the GOP refers to the frames from one I frame up to the next I frame. Generally, the GOP is structured as a set of pictures having a predetermined duration (e.g., 0.5 seconds) when displayed at the intended rate.
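As a rough illustration of the GOP structure, the sketch below splits a sequence of frame types into GOPs at each I frame and computes the display duration of one GOP. The helper names `split_into_gops` and `gop_duration_seconds`, the 15-frame pattern, and the display rate of 30 frames per second are assumptions chosen for the example, not values taken from a standard.

```python
def split_into_gops(frame_types):
    """Group a frame-type sequence so that each GOP starts at an I frame."""
    gops = []
    for t in frame_types:
        # Start a new GOP at every I frame (or at the very first frame).
        if t == "I" or not gops:
            gops.append([])
        gops[-1].append(t)
    return gops


def gop_duration_seconds(gop, frame_rate):
    """Display duration of one GOP at the given rate in frames per second."""
    return len(gop) / frame_rate


# Two GOPs of 15 frames each: I, B, B, P, B, B, P, ...
stream = list("IBBPBBPBBPBBPBB" * 2)
gops = split_into_gops(stream)

print(len(gops))                           # number of GOPs in the stream
print(gop_duration_seconds(gops[0], 30))   # duration at an assumed 30 frames/s
```

Under these assumptions each GOP holds 15 pictures, which corresponds to 0.5 seconds at 30 frames per second.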
As described above, the medium for delivering picture information such as a video stream has been developed from 2-dimensional terminal technology, such as television. In other words, as development moves from black-and-white pictures to color pictures, as in SD (standard definition) television and high-resolution television (e.g., HDTV), the amount of data in picture information is increasing.
Consequently, picture information is moving from 2-dimensional to 3-dimensional, and thus development of technologies related to 3-dimensional picture information is needed in order to deliver and reproduce realistic, natural multimedia information.
However, since technology standards such as MPEG-2 are for coding and decoding video from a single view, a design of the data structure and process for expressing multi-view information is needed in order to encode multi-view video data. Although the technology standards propose the MVP (multi-view profile) for expanding the video used in MPEG-2 to stereo video, this still does not offer a proper solution for coding multi-view video.