As part of reconstructing three-dimensional (3D) information from a plurality of two-dimensional (2D) image views of a scene, an image processing system typically first finds and matches corresponding pixels across the plurality of 2D image views. To recover 3D information from a given number of 2D images, an essential step is to find, for each pixel in a view, the corresponding pixels in the other views. These corresponding pixels typically are the projections of a single voxel back into the 2D image views.
A common way of finding these corresponding pixels for each pixel in a view is to optimize a matching objective function. The matching error is a measure of the similarity of the corresponding pixels: the lower the matching error, the more similar the corresponding pixels are. Nevertheless, matching algorithms often find many different sets of corresponding pixels to be potential matches.
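By way of illustration only, a matching error of the kind described above can be sketched as follows. The particular measure used here (the mean squared deviation of the candidate pixels' colors from their average color) and the name `matching_error` are assumptions made for this sketch; the text above does not specify a particular objective function.

```python
def matching_error(pixels):
    """Return a matching error for one set of candidate corresponding pixels.

    `pixels` holds one color tuple per view, e.g. [(r, g, b), (r, g, b), ...].
    The error is the mean squared deviation of the colors from their average,
    so identical colors give 0 and dissimilar colors give a larger value.
    This specific measure is an illustrative assumption.
    """
    n = len(pixels)
    channels = len(pixels[0])
    # Average color of the candidate pixels, one value per channel.
    mean = [sum(p[c] for p in pixels) / n for c in range(channels)]
    # Sum of squared per-channel deviations, averaged over the views.
    total = sum((p[c] - mean[c]) ** 2 for p in pixels for c in range(channels))
    return total / n


similar = [(100, 50, 20), (102, 49, 21), (99, 51, 20)]
dissimilar = [(100, 50, 20), (10, 200, 90), (240, 5, 130)]
print(matching_error(similar) < matching_error(dissimilar))
```

In this sketch the nearly identical colors yield a much lower error than the unrelated colors, which is the sense in which a lower matching error indicates more similar corresponding pixels.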
As illustrated in FIG. 1, these potential matches, when shown on a matching curve 106, are indicated by local minima 108 that fall below a predefined threshold value along a search path on the curve, starting from zero and proceeding outwardly along the X-axis 104. The X-axis 104 represents the 3D depth values along the search path for a pixel's matching curve. The Y-axis 102 represents the matching error values for the pixel's matching curve. A matching curve is associated with each pixel/feature in a view to be corresponded across the plurality of views.
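The relationship between a matching curve, its threshold, and its local minima can be sketched as below. The sample values, the threshold, and the function name `local_minima_below` are hypothetical and are not taken from FIG. 1; the sketch also detects only strict minima, so a flat-bottomed minimum would not be reported.

```python
def local_minima_below(errors, threshold):
    """Return indices of strict local minima whose error is below `threshold`.

    `errors` lists matching error values sampled along the search path,
    ordered by increasing depth. Endpoints are not considered.
    """
    minima = []
    for i in range(1, len(errors) - 1):
        is_minimum = errors[i] < errors[i - 1] and errors[i] < errors[i + 1]
        if is_minimum and errors[i] < threshold:
            minima.append(i)
    return minima


# Hypothetical matching curve for one pixel, sampled at nine depth steps.
errors = [9, 4, 6, 8, 3, 7, 9, 5, 9]
candidates = local_minima_below(errors, threshold=5)
# The lowest of the candidate minima plays the role of the global minimum.
best = min(candidates, key=lambda i: errors[i])
print(candidates, best)  # prints: [1, 4] 4
```

Here two depth steps qualify as potential matches. The dip at index 7 is excluded because its error is not below the threshold.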
A definite depth value for a pixel is difficult to determine in the presence of multiple local minima 108. The goal of optimization algorithms has been to reduce these local minima 108 to a single global minimum 110. Still, a global minimum 110 does not guarantee the correct depth for a pixel. This is especially true when the scene has a large region without texture.
In reality, both types of matching curves, i.e., those with multiple local minima and those with a single global minimum, always co-exist among the pixels of a view. More often, a matching curve yields multiple local minima rather than a single global minimum.
Recent work by the present inventor was disclosed in a publication entitled “3-D Visual Modeling and Virtual View Synthesis: A Synergistic, Range-Space Stereo Approach Using Omni-Directional Images”, in November 2000. The published approach discusses making use of characteristics of matching curves for robust matching. The disclosed method searched, matched, and rendered directly in 3D space along virtual rays, starting from a given virtual viewpoint. A virtual viewpoint is a viewpoint that does not coincide with a real camera's projection center. The searching interval was non-uniform, depending upon the image resolution and the camera arrangement. The incremental voxels on the virtual rays were backward-projected to the cameras to search for the best correspondence. The matching attributes were saved. A 3D region growing method was applied based on the characteristics of a matching curve with a Continuity Constraint. This disclosed method lacks many features and benefits that are recognized for the present invention, as will be discussed in the Description of the Preferred Embodiments section below.
In particular, color homogeneity in a scene produces a broader envelope around a local/global minimum. The envelope is the distance along the search path between the point at which the matching error first falls below the set threshold and the point at which it first rises above the set threshold again. When the envelope of a local/global minimum is broad, the determined depth can fall at various 3D locations within it. This causes irregularity in the recovered 3D structure even when the 3D information is correctly identified to be within one local minimum out of the many local minima. In other words, the 3D reconstruction process produces structural noise: the recovered 3D information for a pixel conforms to that of its neighboring pixels, so that together they form a patch, but the patch is not consistent with other similar neighboring patches as a whole. Unfortunately, the irregularity distorts the actual 3D structure of the scene. This is especially noticeable in regions with homogeneous color. Common ways of smoothing out these irregularities are to perform filtering or interpolation. The filtering approach can be, for example, median filtering or low-pass filtering, while the interpolation can be, for example, spline or Lagrange interpolation. However, these methods rely only on the recovered 3D data and disregard the available information of the matching curves from which the 3D data were originally derived. The drawback of these filtering and interpolation approaches is that, although they might deal well with statistical random noise, they are incapable of removing the irregularity when the recovered 3D information has significant structural errors.
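The envelope defined above can be illustrated with a short sketch. The function name `envelope_width`, the sampled curves, and the threshold are hypothetical, and the sketch assumes the matching curve is available as discrete samples at known depths.

```python
def envelope_width(depths, errors, threshold):
    """Return the width of the first envelope on a sampled matching curve.

    The envelope starts at the first sample whose error is below `threshold`
    and ends at the first later sample whose error is above `threshold`.
    Returns None if the curve never enters, or never leaves, the envelope.
    """
    start = None
    for depth, error in zip(depths, errors):
        if start is None:
            if error < threshold:
                start = depth  # first time the error drops below the threshold
        elif error > threshold:
            return depth - start  # first time it rises above the threshold
    return None


depths = [0, 1, 2, 3, 4, 5]
textured = [9, 9, 3, 9, 9, 9]      # sharp minimum: narrow envelope
homogeneous = [9, 4, 3, 4, 9, 9]   # flat minimum: broad envelope
print(envelope_width(depths, textured, threshold=5))     # prints: 1
print(envelope_width(depths, homogeneous, threshold=5))  # prints: 3
```

With the broader envelope, any depth between 1 and 3 has a matching error below the threshold, which is why the depth chosen for a pixel in a homogeneous region can land at different locations within the envelope.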
Therefore, a need exists to overcome the problems with the prior art as discussed above, and particularly for an improved method for an image processing system to remove and smooth out irregularities in a 3D image representation derived from multiple 2D image views.