A two-dimensional image is composed of monocular images on a single temporal axis, while a three-dimensional image is composed of multi-view images, having two or more views, on a single temporal axis. Among the multi-view video encoding methods is a binocular video encoding method that encodes video images of two views, corresponding to both eyes, to display a stereoscopic image. MPEG-2 MVP, which performs non-object-based encoding and decoding, is a representative method for non-object-based binocular video encoding. Its base layer has the same architecture as the base layer of the MPEG-2 main profile (MP), where encoding is performed by using only one of the right-eye image and the left-eye image. Therefore, an image encoded by the MPEG-2 MVP method can be decoded with a conventional two-dimensional video decoder, and it can also be applied to a conventional two-dimensional video display mode. In short, it is compatible with a conventional two-dimensional video system.
An image of the enhancement layer is encoded using correlation information between the right and left images. That is, the MPEG-2 MVP method is based on an encoder that uses temporal scalability. Also, the base layer and the enhancement layer output frame-based two-channel bit streams, each corresponding to the right-eye or left-eye image. Current technologies related to binocular three-dimensional video encoding are based on the two-layer MPEG-2 MVP encoder. Also, the frame-based two-channel technology corresponding to the right and left-eye images in the base layer and the enhancement layer is based on the two-channel MPEG-2 MVP encoder.
U.S. Pat. No. 5,612,735, ‘Digital 3D/Stereoscopic Video Compression Technique Utilizing Two Disparity Estimates,’ granted on Mar. 18, 1997, discloses the related technology. This patent relates to a non-object-based encoding method that utilizes temporal scalability. It encodes a left-eye image in the base layer by using motion compensation and a DCT-based algorithm, and encodes a right-eye image in the enhancement layer by using disparity information between the base layer and the enhancement layer, without using motion compensation between right-eye images, as shown in FIG. 1.
FIG. 1 is a diagram showing a conventional method for estimating disparity compensation, which is performed twice. In the drawing, I, P and B denote the three screen types defined in the MPEG standard. The screen I (intra-coded) exists only in the base layer, and it is simply encoded without using motion compensation. In the screen P (predicted), motion compensation is performed using the screen I or another screen P. In the screen B (bi-directionally predicted), motion compensation is performed using the two screens that exist before and after the screen B on the temporal axis. The encoding order in the base layer is the same as that of MPEG-2 MP.
In the enhancement layer, only the screen B exists. The screen B is encoded by using disparity compensation from the frame existing on the same temporal axis and the screen existing after that frame.
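The reference structure described for the enhancement layer of FIG. 1 can be sketched as follows. This is only a minimal illustration, not the encoder of the cited patent: the function name `enhancement_refs_fig1`, the tuple representation of a picture, the 0-based time index, and the handling of the last picture (where no following base-layer picture exists) are all assumptions introduced here.

```python
def enhancement_refs_fig1(t, num_frames):
    """Return the base-layer pictures referenced by the enhancement-layer
    screen B at time index t (hypothetical helper, 0-based indices)."""
    if not 0 <= t < num_frames:
        raise ValueError("t is outside the sequence")
    # First disparity estimate: base-layer picture on the same temporal axis.
    refs = [("base", t)]
    if t + 1 < num_frames:
        # Second disparity estimate: the base-layer picture that follows.
        refs.append(("base", t + 1))
    return refs
```

Both references lie in the base layer, which reflects the statement that no motion compensation between right-eye images is used in this scheme.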
Related prior art is disclosed in U.S. Pat. No. 5,619,256, ‘Digital 3D/Stereoscopic Video Compression Technique Utilizing Disparity and Motion Compensated Predictions,’ granted on Apr. 8, 1997. The method of U.S. Pat. No. 5,619,256 is also non-object-based. It utilizes temporal scalability and encodes a left-eye image in the base layer by using motion compensation and a DCT-based algorithm. In the enhancement layer, it uses motion compensation between right-eye images together with the disparity information between the base layer and the enhancement layer.
As shown above, there are various estimation methods for motion compensation and disparity compensation to perform encoding. The method of FIG. 2, which shows a conventional method for estimating motion and disparity compensation, is one known representative estimation method. In the base layer of FIG. 2, screen estimation is performed in the same manner as in FIG. 1. The screen P of the enhancement layer is estimated from the screen I of the base layer to perform disparity compensation. Also, the screen B of the enhancement layer is estimated from the preceding screen in the same enhancement layer and from the screen of the base layer on the same temporal axis, to perform motion compensation and disparity compensation, respectively.
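The enhancement-layer references of FIG. 2 can be sketched in the same spirit. The function name `enhancement_refs_fig2` and the tuple layout are hypothetical, and the sketch assumes that the single screen P of the enhancement layer sits at time index 0, aligned with the screen I of the base layer; the text above does not state where further P screens, if any, occur.

```python
def enhancement_refs_fig2(t):
    """Return (layer, time index, compensation type) references for the
    enhancement-layer picture at time index t (hypothetical helper)."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if t == 0:
        # Assumed position of screen P: predicted from the base-layer
        # screen I by disparity compensation only.
        return [("base", 0, "disparity")]
    # Screen B: motion compensation from the preceding enhancement-layer
    # picture, disparity compensation from the base-layer picture at t.
    return [("enhancement", t - 1, "motion"), ("base", t, "disparity")]
```

Compared with the FIG. 1 scheme, one of the two disparity references is replaced by a motion reference inside the enhancement layer.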
Both of the above prior-art methods transmit only the bit stream output from the base layer when the receiving end uses a two-dimensional monocular display mode, and transmit all the bit streams output from the base layer and the enhancement layer to restore an image when the receiving end adopts a three-dimensional frame-based time lag display mode. However, when the display mode of the receiving end is a three-dimensional field-based time lag display mode, which is adopted in most PCs, the methods of the two patents have the following problems: the amount of image restoration and the decoding time delay in the decoder are increased, and the transmission efficiency is decreased, because inessential data, namely the even-field image of the left-eye image and the odd-field image of the right-eye image, must be discarded after being transmitted and decoded.
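The behavior described above can be summarized in a short sketch. The names `DisplayMode`, `transmitted_layers` and `discarded_fields` are hypothetical and merely restate the paragraph; they are not taken from either patent.

```python
from enum import Enum


class DisplayMode(Enum):
    MONO_2D = "two-dimensional monocular"
    FRAME_3D = "three-dimensional frame-based time lag"
    FIELD_3D = "three-dimensional field-based time lag"


def transmitted_layers(mode):
    """Layers that the two prior-art methods transmit for a display mode."""
    if mode is DisplayMode.MONO_2D:
        return ["base"]
    # Both three-dimensional modes receive every bit stream.
    return ["base", "enhancement"]


def discarded_fields(mode):
    """Decoded (eye, field parity) pairs that the receiver cannot display."""
    if mode is DisplayMode.FIELD_3D:
        # Only one field per eye is shown, so the others are wasted work.
        return [("left", "even"), ("right", "odd")]
    return []
```

In the field-based mode the transmitted layers are the same as in the frame-based mode, yet two of the four decoded fields are dropped, which is the source of the inefficiency noted above.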
There is a video encoding method that reduces the right and left-eye images by half and transforms the right and left two-channel images into a one-channel image. Five methods for this are disclosed in ‘3D Video Standards Conversion’, Andrew Woods, Tom Docherty and Rolf Koch, Stereoscopic Displays and Applications VII, California, Feb. 1996, Proceedings of the SPIE Vol. 2653a.
In connection with the above technique, a method is suggested in U.S. Pat. No. 5,633,682, ‘Stereoscopic Coding System,’ granted on May 27, 1997. The non-object-based MPEG encoding of a conventional two-dimensional video image is performed by selecting the odd fields of a left-eye image and the even fields of a right-eye image and converting the two-channel image into a one-channel image. This method has the advantage that the conventional MPEG encoding of a two-dimensional video image can be used and that, when field estimation is performed in the encoding process, the motion and disparity information can be used naturally. However, in the case where frame estimation is performed, only motion information is used and disparity information is not considered. Also, when field estimation is performed, the screen B is estimated from the screen I and the screen P that exist before and after it to perform disparity compensation, even though the most correlated image is not the screen I or the screen P but another screen, in the other part, on the same temporal axis.
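The two-channel to one-channel conversion described above can be illustrated with a minimal sketch. It assumes that a frame is a list of scan lines, that lines are numbered from 1 so that the odd field consists of lines 1, 3, 5 and so on, and that both views have the same height; the function name `merge_stereo_fields` is hypothetical and not taken from the cited patent.

```python
def merge_stereo_fields(left_frame, right_frame):
    """Build a one-channel frame from the odd field of the left-eye frame
    and the even field of the right-eye frame (lines numbered from 1)."""
    if len(left_frame) != len(right_frame):
        raise ValueError("both views must have the same number of lines")
    merged = []
    for index, (left_line, right_line) in enumerate(zip(left_frame, right_frame)):
        line_number = index + 1  # assumed 1-based line numbering
        # Odd lines come from the left eye, even lines from the right eye.
        merged.append(left_line if line_number % 2 == 1 else right_line)
    return merged
```

For example, merging `["L1", "L2", "L3", "L4"]` with `["R1", "R2", "R3", "R4"]` yields `["L1", "R2", "L3", "R4"]`, so each view contributes half of its lines to the single frame that is then passed to a conventional two-dimensional encoder.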
In addition, this method assumes a field-based time lag display, in which the right and left images are displayed one after another on a field basis to form a three-dimensional video image. Accordingly, this method is not suitable for a frame-based time lag display mode, in which the right and left-eye images are displayed simultaneously. Therefore, what is required in this technical field is a method that employs an object-based encoder and decoder and restores an image by transmitting only the essential bit streams according to the display mode of the receiving part, i.e., a two-dimensional monocular display mode or a three-dimensional field-based or frame-based time lag display mode.