As part of reconstructing three dimensional (3D) information from a plurality of two dimensional (2D) image views of a scene, an image processing system typically first finds and matches corresponding pixels across the plurality of 2D image views. To recover 3D information from a given number of 2D images, an essential step is to find, for each pixel in a view, the corresponding pixels in the other views. These corresponding pixels typically are the projections of a single voxel back into the 2D image views.
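The relationship between a voxel and its corresponding pixels can be illustrated with a minimal sketch. It assumes an idealized pinhole camera model with cameras displaced only along the X-axis and all looking down the Z-axis; the function name `project`, the focal length, and the principal point values are hypothetical and are not taken from the disclosure.

```python
def project(point, cam_x, focal=500.0, cx=320.0, cy=240.0):
    """Project a 3D point into a pinhole camera located at (cam_x, 0, 0).

    The camera is assumed to look down the +Z axis with no rotation.
    `focal`, `cx`, and `cy` are illustrative intrinsics in pixel units.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    u = focal * (x - cam_x) / z + cx
    v = focal * y / z + cy
    return (u, v)


# One voxel seen by three hypothetical cameras spaced 0.1 units apart.
voxel = (0.2, 0.1, 2.0)
camera_positions = [0.0, 0.1, 0.2]

# The projections of the same voxel are the "corresponding pixels".
corresponding_pixels = [project(voxel, cam_x) for cam_x in camera_positions]
```

In this arrangement the corresponding pixels share the same row and differ only in their column, and the size of the column shift depends on the depth of the voxel. Recovering depth amounts to inverting this relationship once the correspondence is known.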
A common way of finding these corresponding pixels for each pixel in a view is to optimize a matching objective function. Matching is a similarity measurement of the corresponding pixels, expressed as a matching error. The lower the matching error is, the more similar the corresponding pixels are. Nevertheless, matching algorithms often find many different sets of corresponding pixels to be potential matches.
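As an illustration only, one simple matching error is the variance of the intensities of the candidate corresponding pixels. This particular measure is an assumption made for the sketch, since no specific objective function is named here, and the function name `matching_error` is hypothetical.

```python
def matching_error(intensities):
    """Return the variance of the intensities of candidate corresponding pixels.

    A value of zero means all views agree exactly; larger values mean the
    candidate pixels are less similar. This measure is illustrative only.
    """
    if not intensities:
        raise ValueError("at least one intensity is required")
    mean = sum(intensities) / len(intensities)
    return sum((value - mean) ** 2 for value in intensities) / len(intensities)


# Candidate pixel intensities gathered from three views.
good_match = [100, 102, 98]   # views nearly agree
poor_match = [100, 140, 60]   # views disagree strongly

good_error = matching_error(good_match)
poor_error = matching_error(poor_match)
```

Here `good_error` is smaller than `poor_error`, which is consistent with the rule that a lower matching error indicates more similar corresponding pixels.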
As illustrated in FIG. 1, these potential matches, when shown on a matching curve 106, are indicated by local minima 108, which are below a predefined threshold value, along a search path on the curve from zero outwardly following the X-axis 104. The X-axis 104 represents the 3D depth values along the search path for a pixel's matching curve. The Y-axis 102 represents the matching error values for the pixel's matching curve. A matching curve is associated with each pixel/feature in a view to be corresponded across the plurality of views.
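The idea of potential matches as local minima below a threshold can be sketched as follows. The sample curve, the threshold value, and the function name `local_minima_below` are hypothetical; the list index stands in for a sampled depth value along the search path.

```python
def local_minima_below(curve, threshold):
    """Return indices of strict local minima whose error is below `threshold`.

    `curve` is a list of matching errors sampled along the search path,
    so each index corresponds to one candidate depth.
    """
    hits = []
    for i in range(1, len(curve) - 1):
        is_local_min = curve[i] < curve[i - 1] and curve[i] < curve[i + 1]
        if is_local_min and curve[i] < threshold:
            hits.append(i)
    return hits


# A made-up matching curve for a single pixel.
errors = [9.0, 4.0, 6.0, 2.0, 7.0, 8.0, 3.0, 9.0, 6.5, 9.5]

# Three depths qualify as potential matches; the dip at index 8 is
# a local minimum but lies above the threshold.
candidates = local_minima_below(errors, threshold=5.0)

# The global minimum is simply the lowest of all samples.
global_minimum = min(range(len(errors)), key=errors.__getitem__)
```

With this sample curve, several depths remain as candidates after thresholding, which is the ambiguity that the matching curve 106 of FIG. 1 is described as showing.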
A definite depth value for a pixel is difficult to determine in the presence of multiple local minima 108. The goal of optimization algorithms has been to reduce these local minima 108 into a global minimum 110. Still, a global minimum 110 does not guarantee the correct depth for a pixel. This is especially true when the scene has a large region without texture.
In reality, both types of matching curves, i.e., with local minima and with a global minimum, always co-exist within the pixels of a view. More often, a matching curve yields local minima rather than a global minimum.
Recent work by the present inventor was disclosed in a publication entitled “3-D Visual Modeling and Virtual View Synthesis: A Synergistic, Range-Space Stereo Approach Using Omni-Directional Images”, in November 2000. The published approach discusses making use of characteristics of matching curves for robust matching. The disclosed method searched, matched, and rendered directly in 3D space along virtual space, starting from a given virtual viewpoint. A virtual viewpoint is a viewpoint which does not coincide with the real camera's projection center. The searching interval was non-uniform, depending upon the image resolution and the camera arrangement. The incremental voxels on the virtual rays were backward-projected to the cameras, searching for the best correspondence. The matching attributes were saved. A 3D region growing method was applied based on the characteristics of a matching curve with a Continuity Constraint. This disclosed method lacks many features and benefits that are recognized for the present invention, as will be discussed in the Description of the Preferred Embodiments section below.
One important general comment about most matching algorithms is that they perform matching and then try to eliminate false matches. After that, they either choose the lowest error at each pixel as the best match or pick a sparse number of global minima and interpolate the rest of the pixels with algorithms such as spline or Lagrange interpolation. These schemes can produce gross errors, even when many cameras are employed. In the real world, it is difficult to obtain a reliable global minimum. Multiple local minima always exist in the searching process. The apparent global minimum can be deceiving, especially when a scene has rather homogeneous color. Even when a global minimum exists along a virtual ray, an estimated range can be far away from its true 3D location because the lowest error can fall into various homogeneous locations. The estimated range can vary widely from that of its neighboring pixels.
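The weakness of choosing the lowest error at each pixel can be illustrated with a small sketch. Both matching curves below are made-up values, and the function names and the tolerance are hypothetical; the sketch only shows why a nearly flat curve from a homogeneous region gives an unreliable global minimum.

```python
def lowest_error_depth(curve):
    """Pick the depth index with the lowest matching error."""
    return min(range(len(curve)), key=curve.__getitem__)


def ambiguous_depths(curve, tolerance):
    """Return all depth indices whose error is within `tolerance` of the lowest."""
    lowest = min(curve)
    return [i for i, err in enumerate(curve) if err - lowest <= tolerance]


# A textured pixel: one sharp, well-separated minimum.
textured = [8.0, 7.5, 6.0, 3.0, 0.5, 3.2, 6.1, 7.4]

# A neighboring pixel in a homogeneous region: errors are low everywhere,
# so small noise decides where the lowest value falls.
homogeneous = [0.31, 0.30, 0.32, 0.31, 0.33, 0.31, 0.29, 0.32]

textured_depth = lowest_error_depth(textured)
homogeneous_depth = lowest_error_depth(homogeneous)

textured_candidates = ambiguous_depths(textured, tolerance=0.05)
homogeneous_candidates = ambiguous_depths(homogeneous, tolerance=0.05)
```

For the textured pixel only one depth lies near the lowest error, while for the homogeneous pixel every sampled depth does. The depth selected for the homogeneous pixel therefore differs from that of its textured neighbor for reasons unrelated to the scene geometry.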
Therefore, a need exists to overcome the problems with the prior art as discussed above, and particularly for a robust and reliable method for an image processing system to reconstruct a 3D image representation from multiple 2D image views.