The present invention relates to video processing, and, in particular, to an apparatus and a method for real-time capable disparity estimation for virtual view rendering suitable for multi-threaded execution.
Video applications and the processing of video data are becoming more and more important. In this context, the role of 3D video is increasing. The next generation of 3D displays will be autostereoscopic. In contrast to conventional glasses-based displays, which have two views, autostereoscopic displays have an arbitrary number of views. With the advent of multi-view autostereoscopic displays, an apparatus and a method for generating an arbitrary number of views from stereoscopic content would be of great importance. Since most current and future content will only be available with two views, the missing views have to be calculated from the available ones. A well-known way to create virtual views is Depth Image Based Rendering (DIBR). Depth Image Based Rendering can generate these views from stereoscopic images and corresponding disparity maps.
Disparity maps will now be explained with reference to FIG. 13. On the left side of FIG. 13, two cameras 2, 3 are illustrated. The cameras are directed towards two objects 11, 12. The first object 11 is arranged close to the cameras 2, 3. The second object 12 is further away from the cameras 2, 3. Both cameras now create an image of the scene. The first (left) camera 2 creates image 20. The second (right) camera 3 creates image 30, as illustrated on the right side of FIG. 13. It should be noted that in video applications, a plurality of subsequent images are recorded by the cameras employed.
On the right side of FIG. 13, image 20 and image 30 illustrate the same scene from different perspectives, namely, image 20 illustrates the scene from the perspective of the left camera 2, while image 30 illustrates the scene from the perspective of the right camera 3. As a result, the objects 11, 12 appear at different positions in the images 20, 30. In particular, the first object 11 is illustrated as object 21 in image 20 and as object 31 in image 30, while the second object 12 appears as object 22 in image 20 and as object 32 in image 30. As can be seen, the position of object 21 in image 20 differs more from the position of object 31 in image 30 than the position of object 22 in image 20 differs from the position of object 32 in image 30. This results from the fact that object 11 is located closer to the cameras 2, 3 than object 12, which is further away from the cameras 2, 3.
A corresponding disparity map 40 is illustrated in FIG. 13 below images 20 and 30. Portion 41 of the disparity map 40 indicates by how many positions the object 21 in FIG. 13 has to be shifted to be at a position corresponding to the position of object 31 in image 30. The disparity map 40 indicates that all four pixels of the object 21 have to be shifted by three pixels, such that object 21 is shifted to the position of object 31. This is indicated by the four pixel disparity values in portion 41 of the disparity map 40. It should be noted that, in reality, a real image normally comprises many more pixels than the images 20 and 30, and, as a consequence, pixel disparity values may be much greater.
Portion 42 of the disparity map 40 indicates by how many positions the object 22 in FIG. 13 has to be shifted, to be at a position corresponding to the position of object 32 in image 30. The two pixel disparity values in the portion 42 of the disparity map 40 indicate that the two pixels of the object 22 have to be shifted by one pixel, such that object 22 is shifted to the position of object 32.
The disparity map 40 in FIG. 13 illustrates by how many pixels the corresponding pixel has to be shifted in a horizontal direction. Other disparity maps may also indicate that a pixel of a first image has to be shifted by a certain number of pixels in a vertical direction to be at a position of a corresponding pixel in a second image.
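For illustration only, the role of a horizontal disparity map can be sketched in a few lines of Python. The sketch below is not taken from FIG. 13; the image size, the pixel positions, the function name shift_by_disparity and the sign convention (a positive disparity moves a pixel towards smaller column indices) are assumptions chosen for this example. It reproduces the situation described above, in which a near object of four pixels is shifted by three pixels and a far object of two pixels is shifted by one pixel.

```python
def shift_by_disparity(image, disparity, background=0):
    """Shift every non-background pixel horizontally by its disparity value.

    Assumption: a positive disparity moves the pixel to the left
    (towards smaller column indices); the text does not fix a sign.
    """
    height, width = len(image), len(image[0])
    shifted = [[background] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            value = image[row][col]
            if value == background:
                continue  # nothing to move for background pixels
            target = col - disparity[row][col]
            if 0 <= target < width:  # drop pixels shifted out of the image
                shifted[row][target] = value
    return shifted


# Hypothetical toy "left" image: 1 marks the near object (four pixels),
# 2 marks the far object (two pixels), 0 is background.
left = [
    [0, 0, 0, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 0, 2, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
]

# Matching disparity map: three for the near object, one for the far object.
disparity = [
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 3, 3, 0, 0, 0],
    [0, 0, 0, 3, 3, 0, 0, 0],
]

right = shift_by_disparity(left, disparity)
for line in right:
    print(line)
```

The near object moves further than the far object, which is the effect the disparity map encodes. A vertical disparity map could be handled in the same way by offsetting the row index instead of the column index.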
Depth Image Based Rendering needs disparity maps to render virtual views. To estimate these disparity maps in real-time at the consumer site, a fast and reliable disparity estimator has to be available which is also suitable to be implemented in hardware like Field-Programmable Gate Arrays (FPGAs) or Application-Specific Integrated Circuits (ASICs).
The state of the art provides a hybrid recursive matcher (HRM) described in:    Atzpadin, N., Kauff, P., Schreer, O. (2004): Stereo Analysis by Hybrid Recursive Matching for Real-Time Immersive Video Conferencing. IEEE Trans. on Circuits and Systems for Video Technology, Special Issue on Immersive Telecommunications, Vol. 14, No. 4, 321-334.
The hybrid recursive matcher (HRM) described in Atzpadin, N., Kauff, P., Schreer, O. (2004) utilizes the original concept of recursive block matching for motion compensation explained in:    De Haan, G., Biezen, P. W. A. C., Huijgen, H., Ojo, O. A. (1993): True-Motion Estimation with 3-D Recursive Search Block Matching. IEEE Trans. on Circuits and Systems for Video Technology, Vol. 3, No. 5, 368-379.
Furthermore, the hybrid recursive matcher (HRM) of Atzpadin, N., Kauff, P., Schreer, O. (2004) utilizes the extension of De Haan, G., Biezen, P. W. A. C., Huijgen, H., Ojo, O. A. (1993) towards hybrid recursive matching proposed by:    Kauff, P., Schreer, O., Ohm, J.-R. (2001): An Universal Algorithm for Real-Time Estimation of Dense Displacement Vector Fields. Proc. of Int. Conf. on Media Futures, Florence, May 2001, which has been protected for disparity estimation since 2002, see:    Atzpadin, N., Karl, M., Kauff, P., Schreer, O. (2002), European Patent Application, Publication Number: EP 1 229 741 A2.
According to the state of the art presented in Atzpadin, N., Kauff, P., Schreer, O. (2004), it is proposed to determine disparity values of pixels by conducting a block recursion. This state of the art proposes to determine the disparity values of the pixels of an image by employing a special meander-like recursive structure, wherein a meander scan is conducted on images of arbitrarily shaped video objects. According to this teaching, for each pixel on the meander scan path, the disparity value of the preceding pixel on the path has to be calculated first, as it is needed to determine the disparity value of the current pixel. As a consequence, the disparity values of the pixels have to be determined one after the other, in sequential order along the meander scan path.
The HRM presented in Atzpadin, N., Kauff, P., Schreer, O. (2004) has already been used in real-time for image sizes up to SD (Standard Definition) resolution. However, higher resolutions, e.g., HD (High Definition) resolution, necessitate faster processing, and it would therefore be highly desirable to have an apparatus and a method capable of creating disparity maps in a short time.
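The sequential dependency described above can be made concrete with a short Python sketch. It is not the HRM algorithm itself; it only shows a meander scan order over a rectangular image and a loop in which every result consumes the result of the preceding pixel on the path. The function names meander_path and sequential_estimate, the start candidate of zero and the refine callback are hypothetical and introduced purely for illustration.

```python
def meander_path(height, width):
    """Yield (row, col) positions in meander order.

    Even rows run left to right, odd rows run right to left, so each
    position is a direct neighbour of the one visited before it.
    """
    for row in range(height):
        if row % 2 == 0:
            cols = range(width)
        else:
            cols = range(width - 1, -1, -1)
        for col in cols:
            yield row, col


def sequential_estimate(height, width, refine):
    """Walk the meander path; each value depends on its predecessor.

    `refine(previous, row, col)` is a hypothetical stand-in for the
    per-pixel update step, which is not specified here.
    """
    result = [[None] * width for _ in range(height)]
    previous = 0  # assumed start candidate
    for row, col in meander_path(height, width):
        previous = refine(previous, row, col)
        result[row][col] = previous
    return result


path = list(meander_path(3, 4))
print(path)

# Toy update that only counts steps, which makes the visiting order visible.
order = sequential_estimate(2, 3, lambda prev, row, col: prev + 1)
print(order)
```

Because each iteration needs the value produced by the previous one, the loop in sequential_estimate cannot simply be divided among several threads, which is the limitation of this scan structure with respect to parallel execution.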