Recently there has been much interest in providing 3-D images on 3-D image displays. It is believed that 3-D imaging will be, after color imaging, the next great innovation in imaging. Auto-stereoscopic displays are now on the verge of being introduced to the consumer market.
A 3-D display device usually has a display screen on which the images are displayed.
Basically, a three dimensional impression can be created by using stereo pairs, i.e. two slightly different images directed at the two eyes of the viewer.
There are several ways to produce stereo images. The images may be time multiplexed on a 2D display, but this requires that the viewers wear glasses with e.g. LCD shutters. When the stereo images are displayed at the same time, the images can be directed to the appropriate eye by using a head mounted display, or by using polarized glasses (the images are then produced with orthogonally polarized light). The glasses worn by the observer effectively route the views to each eye. Shutters or polarizers in the glasses are synchronized to the frame rate to control the routing. To prevent flicker, the frame rate must be doubled or the resolution halved with respect to the two dimensional equivalent image. A disadvantage of such a system is that glasses have to be worn to produce any effect. This is unpleasant for those observers who are not accustomed to wearing glasses and a potential problem for those already wearing glasses, since the additional pair of glasses does not always fit.
Instead of near the viewer's eyes, the two stereo images can also be split at the display screen by means of a splitting screen such as a parallax barrier, as e.g. shown in U.S. Pat. No. 5,969,850. Such a device is called an auto-stereoscopic display since it provides a stereoscopic effect by itself (auto-), without the use of glasses. Several different types of auto-stereoscopic devices are known.
Whatever type of display is used, the 3-D image information has to be provided to the display device. This is usually done in the form of a video signal comprising digital data.
Because of the massive amounts of data inherent in digital imaging, the processing and/or the transmission of digital image signals pose significant problems. In many circumstances the available processing power and/or transmission capacity is insufficient to process and/or transmit high quality video signals. More particularly, each digital image frame is a still image formed from an array of pixels.
The amounts of raw digital information are usually massive, requiring large processing power and/or large transmission rates which are not always available. Various compression methods have been proposed to reduce the amount of data to be transmitted, including for instance MPEG-2, MPEG-4 and H.263.
These compression methods were originally set up for standard 2D images.
The generation of 3-D images is conventionally done by converting an incoming encoded 2D video signal into a 3-D video signal at the display side. The incoming 2D data sequences are converted into 3D sequences just before displaying the video sequence. Often a depth map is added at the display side to the pixels in the 2-D image, said depth map providing information on the depth of each pixel within the image and thus providing 3D information. Using the depth map for an image, a left and a right image can be constructed, providing a 3D image. Relative depth of objects within a 2D image may, for instance, be deduced from the focus (in-focus, out-of-focus) or from how objects obscure each other.
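By way of illustration only, the construction of a left and a right image from an image and its depth map can be sketched as follows. This is a minimal sketch and not the method of the invention: it assumes that depth values lie between 0.0 (far) and 1.0 (near), that each pixel is shifted horizontally by a whole number of pixels proportional to its depth, and that nearer pixels overwrite farther ones. The names render_view, render_stereo_pair and max_disparity are hypothetical and chosen for this example.

```python
def render_view(image, depth, max_disparity, direction):
    """Build one view by shifting pixels horizontally according to depth.

    image and depth are lists of rows of equal size; depth is assumed to
    lie in [0.0, 1.0] with 1.0 nearest to the viewer. direction is +1 or -1.
    Positions that receive no pixel are left as None (holes).
    """
    view = []
    for image_row, depth_row in zip(image, depth):
        width = len(image_row)
        out = [None] * width
        out_depth = [-1.0] * width  # depth of the pixel currently written
        for x in range(width):
            shift = int(round(depth_row[x] * max_disparity)) * direction
            tx = x + shift
            # Keep the nearest pixel when several land on the same position.
            if 0 <= tx < width and depth_row[x] > out_depth[tx]:
                out[tx] = image_row[x]
                out_depth[tx] = depth_row[x]
        view.append(out)
    return view


def render_stereo_pair(image, depth, max_disparity):
    """Return (left, right) views; the sign convention is an assumption."""
    left = render_view(image, depth, max_disparity, +1)
    right = render_view(image, depth, max_disparity, -1)
    return left, right


# One image row with a near object (depth 1.0) on a far background.
image = [[10, 20, 30, 40, 50, 60]]
depth = [[0.0, 0.0, 1.0, 1.0, 0.0, 0.0]]
left, right = render_stereo_pair(image, depth, max_disparity=1)
print(left)   # the None entry marks a hole next to the shifted object
print(right)
```

In this sketch the positions left empty next to the shifted object have no data in the original image and would have to be filled by some other means.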
Since 3-D information generated at the display side will have imperfections, there is a need for generating improved 3D information. 3D information that is generated at the acquisition side can provide improved 3D image rendering due to the possibility of more powerful computing at the acquisition side, the possibility of off-line processing, and the possibility of manual intervention.
If 3D information is generated at the acquisition side, this information needs to be transmitted. In order to keep the extra overhead in terms of bit rate low, compression of the 3D information is required. Preferably the compression (or encoding) of the 3D information is performed in such a manner that it can be implemented using existing compression standards with only relatively small adjustments.
Often 3D information is given in the form of an image with a depth map (z-map).
Better depth maps will enable 3D displays to produce more depth. Increase in depth reproduction will, however, result in visible imperfections around depth discontinuities. These visible imperfections greatly undermine the positive effects of improved depth maps. Furthermore, transmission capacity is limited and coding efficiency is very important.
It is thus an object of the invention to provide a method for encoding 3D image data at the transmission side wherein visible imperfections around depth discontinuities for a displayed image are reduced while keeping the amount of data within the encoded data in bounds. Preferably the coding efficiency is high. Also, preferably, the method is compatible with existing encoding standards.
It is a further object to provide an improved encoder for encoding a 3D video signal, a decoder for decoding a 3D video signal and a 3D video signal.