1. Field of the Invention
The present invention relates to a method and apparatus to divide image blocks, and more particularly, to a method and apparatus to improve the quality of intermediate images. To achieve such improvements, the invention proposes a new criterion for dividing image blocks that can prevent flickering in synthesized intermediate images when image blocks are split using quadtree disparity estimation. The image block splitting is followed by the disparity estimation required to synthesize the intermediate views used to represent a three-dimensional (3D) image.
2. Description of the Related Art
To realize an imaging and communication system that provides a high degree of realism and naturalness, it is necessary to develop 3D image processing technology that can naturally represent images according to human visual characteristics. 3D image processing employs binocular parallax, which is the difference between the positions at which the left and right eyes see an object and from which depth is perceived. Processing and transmission of binocular images is of great concern in the field of next-generation visual communication.
However, one significant problem with such image processing and transmission is the large amount of information contained in stereoscopic images: most images are color or moving images, while the transmission rate of the transmission line and the processing rate of the transmission (or image processing) system are limited. To overcome this problem, a technique is needed that compresses this large amount of information efficiently and easily while maintaining the quality of the stereoscopic image.
Research is being conducted on a method that exploits the high correlation between the left and right views instead of encoding the two views independently. The method estimates the variation of objects between the views, transmits this variation information together with either the left or the right view, and compensates the transmitted view at the receiving terminal to restore the binocular images.
Further, when the viewpoint of an observer moves or there are several observers, multi-view images are needed to create a natural stereoscopic image. However, since independent transmission of all multi-view images excessively increases the amount of information, reconstruction of multi-view images from binocular images restored at the receiving terminal of a binocular image transmission system, often called intermediate view reconstruction (IVR) or intermediate view synthesis, is used. In this case, reconstruction can be performed using intermediate view interpolation or extrapolation by obtaining variation information related to intermediate views from information on variation between the binocular images.
3D images are compressed and decompressed using an MPEG technique applied to two dimensional (2D) images. In particular, compression, transmission, and decompression of 3D images for digital broadcasting are performed using MPEG-2, which is a standard for digital broadcasting.
As is widely known in the art, MPEG-2 uses block-based compression schemes to compress 2D images. Currently, these schemes are also applied to compression of 3D images and are known as the most efficient method of 3D image compression.
Block-based compression is performed in blocks of a fixed size (for example, 16×16 pixels), each of which is called a macro block. Compression is achieved by performing motion estimation in units of macro blocks and calculating a motion vector, which is the result of the estimation, and a prediction error. When macro blocks of a fixed size are used, as with 2D images, to synthesize intermediate views from binocular images so that a viewer of 3D images perceives stereoscopic depth as mentioned above, the quality of the intermediate views may degrade. In particular, quality degradation due to blurring of the edges of an image becomes a significant problem.
To overcome this problem, various quadtree disparity estimation approaches that can prevent quality degradation by splitting a macro block of a fixed size into smaller sub blocks near the edge of an image have been proposed (1998 SPIE Paper: Anthony Mancini and Janusz Konrad, “Robust Quadtree-based Disparity Estimation for the Reconstruction of Intermediate Stereoscopic Images”, and IEEE 0-7803-6685-9/01 Paper: D. R. Clewer, “Efficient Multiview Image Compression Using Quadtree Disparity Estimation”).
According to a quadtree disparity estimation approach, a block matching technique is used to calculate a mean absolute difference (MAD) of each macro block in left-eye and right-eye views and MADs of four sub blocks into which the macro block is divided. Then, if a ratio Rmadsub1 of the maximum sub block MAD to the minimum sub block MAD within a macro block is less than a predetermined threshold, disparity between binocular images is estimated in units of macro blocks. Conversely, if Rmadsub1 is greater than the threshold, disparity is re-estimated for each sub block. Since a method of calculating MADs is described in detail in literature including the above-cited references, a detailed description will be omitted.
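The MAD ratio described above can be illustrated with a minimal Python sketch. It is not taken from the cited references: the function names `mad` and `sub_block_mad_ratio`, the representation of each view as a list of pixel rows, the purely horizontal `disparity` offset, and the `eps` guard against a zero minimum MAD are all assumptions made for illustration.

```python
def mad(left, right, top, col, size, disparity):
    """Mean absolute difference between a size x size block of the left view
    and the block of the right view shifted horizontally by `disparity`.
    Assumes the shifted block lies entirely inside the right view."""
    total = 0
    for y in range(top, top + size):
        for x in range(col, col + size):
            total += abs(left[y][x] - right[y][x + disparity])
    return total / (size * size)


def sub_block_mad_ratio(left, right, top, col, size, disparity, eps=1e-9):
    """Ratio of the maximum to the minimum MAD over the four sub blocks
    of the block at (top, col). `eps` is an assumed guard that avoids
    division by zero when a sub block matches perfectly."""
    half = size // 2
    mads = [
        mad(left, right, top + dy, col + dx, half, disparity)
        for dy in (0, half)
        for dx in (0, half)
    ]
    return max(mads) / max(min(mads), eps)
```

A ratio close to 1 indicates that the four sub blocks match about equally well at the chosen disparity, while a large ratio indicates that at least one sub block matches much worse than the others.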
FIG. 1A shows a conventional block splitting algorithm using quadtree disparity estimation. As mentioned above, a quadtree disparity estimation approach involves splitting an N×N macro block into four (N/2)×(N/2) sub blocks and estimating disparity in units of sub blocks when the ratio of the maximum sub block MAD to the minimum sub block MAD within the macro block is greater than a predetermined threshold. The conventional block splitting process using the quadtree disparity estimation approach will now be described in detail with reference to FIG. 1A.
If an N×N macro block that is the highest level block in disparity estimation (hereinafter called “Large Macro Block (LMB)”) is input (operation S10), block matching between the binocular images is performed to estimate the disparity between them, and the resulting estimate is then verified. The adequacy and correctness of the estimate are determined by Rmadsub1, which denotes the ratio of the maximum MAD of any sub block (lower level block) within a macro block (higher level block) to the minimum MAD of any sub block within the same macro block. As Rmadsub1 becomes greater, the disparity difference between regions in the block becomes greater, and thus the estimate becomes less accurate. In the ideal case, where there is little disparity difference between regions within the block, Rmadsub1 approaches 1, which is its minimum value.
Next, in operation S11, it is determined whether to split the LMB. This is done by determining whether Rmadsub1 is greater than a first threshold value Rth1, which is the reference value used to judge whether to split the LMB. Rth1 is decided by experiment.
If Rmadsub1 is less than Rth1, which means that the disparity present between the binocular images is low, the LMB is not split (operation S111) and disparity is estimated in units of LMBs (operation S16). Then, in operation S17, it is decided whether to split the next LMB.
If Rmadsub1 is greater than Rth1, the LMB is split into four (N/2)×(N/2) sub blocks (hereinafter called “Middle Sub Blocks (MSBs)”) (operation S12), and it is determined whether to split each MSB (operation S13). For this determination, the MAD of each (N/4)×(N/4) sub block (hereinafter called “Small Sub Block (SSB)”) within an MSB is calculated. Then, Rmadsub2, the ratio of the maximum SSB MAD to the minimum SSB MAD within an MSB, is calculated and compared to a second threshold value Rth2, which is a reference value to determine whether to split the MSB. Like Rth1, Rth2 is decided by experiment.
If Rmadsub2 is less than Rth2, the MSB is not split (operation S14) and disparity is estimated in units of MSBs (operation S16). Then, in operation S17, it is determined whether to split the next LMB.
If Rmadsub2 is greater than Rth2, the MSB is split into four (N/4)×(N/4) SSBs (operation S15). Then, in operation S17 it is determined whether to split the next LMB.
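The two-level decision of operations S11 through S15 can be summarized in a short Python sketch. The function name `split_lmb`, the `(top, col, size)` block representation, and the handling of a ratio exactly equal to its threshold (treated here as "do not split", a case the description above leaves open) are assumptions for illustration only. The ratios are taken as inputs rather than computed, and the thresholds are left to the caller because Rth1 and Rth2 are decided by experiment.

```python
def split_lmb(top, col, n, rmadsub1, rmadsub2, rth1, rth2):
    """Return the blocks, as (top, col, size) tuples, in which disparity
    would be estimated for one N x N LMB located at (top, col).

    rmadsub1: max/min MAD ratio of the four MSBs inside the LMB.
    rmadsub2: four max/min MAD ratios, one per MSB, in raster order.
    """
    # Operation S11: keep the LMB whole unless its ratio exceeds Rth1.
    if rmadsub1 <= rth1:
        return [(top, col, n)]

    half, quarter = n // 2, n // 4
    msb_offsets = [(0, 0), (0, half), (half, 0), (half, half)]
    blocks = []
    for (dy, dx), ratio in zip(msb_offsets, rmadsub2):
        msb_top, msb_col = top + dy, col + dx
        # Operation S13: keep the MSB whole unless its ratio exceeds Rth2.
        if ratio <= rth2:
            blocks.append((msb_top, msb_col, half))
        else:
            # Operation S15: split the MSB into four SSBs.
            for sy in (0, quarter):
                for sx in (0, quarter):
                    blocks.append((msb_top + sy, msb_col + sx, quarter))
    return blocks
```

For example, with hypothetical thresholds of 2.0, a 16×16 LMB whose Rmadsub1 is 3.0 and whose second MSB alone has a ratio above Rth2 would yield three 8×8 MSBs and four 4×4 SSBs.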
FIG. 1B shows image blocks produced in a block splitting procedure using a quadtree disparity estimation approach. Each block is contained in a 720×288 image frame. A block splitting technique proposed by the present invention is applied to the image blocks shown in FIG. 1B. For an image created by interlaced scanning, in particular, this splitting may be performed in frames or fields. As mentioned above, the conventional block splitting algorithm splits a higher level block into four lower level blocks in order to re-estimate disparity when the maximum to minimum MAD ratio of the lower level blocks exceeds a single threshold, thus obtaining more detailed intermediate views.
However, slight changes in disparity can occur due to camera shake, changes in lighting conditions, and noise components such as dust, even in a still image or a low-complexity image (with little motion). Since blocks in this kind of image may have MAD ratios near a threshold, whether these blocks are split is highly dependent on such changes in disparity. Thus, the disparity estimates change significantly when intermediate views are synthesized for such an image block. In this case, since the conventional block splitting algorithm applies a single threshold at each level, both when splitting an LMB into four MSBs and when splitting an MSB into four SSBs, flickering can occur in the synthesized intermediate views.
Furthermore, because the conventional block splitting algorithm in the quadtree disparity estimation approach adopts a single threshold, it cannot prevent degradation in the quality of intermediate views, in particular flickering near edges.
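The sensitivity of a single threshold can be made concrete with a small Python sketch. The helper names `split_decisions` and `count_flips` and the per-frame ratio values in the test are hypothetical and chosen only to illustrate the effect; they are not measurements.

```python
def split_decisions(ratios, rth):
    """Single-threshold rule: a block is split in a frame when its
    MAD ratio in that frame exceeds rth."""
    return [r > rth for r in ratios]


def count_flips(decisions):
    """Number of frame-to-frame changes in the split decision."""
    return sum(1 for prev, cur in zip(decisions, decisions[1:]) if prev != cur)
```

When the ratio of the same block hovers around the threshold from frame to frame, for instance because of slight noise, the split decision alternates between consecutive frames, and each alternation changes the block size used for disparity estimation. This is the behavior that appears as flickering in the synthesized intermediate views.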