The present invention is directed to a method and apparatus for providing depth estimation. More particularly, the present invention provides an adaptive stripe-based patch matching process for performing depth estimation.
It is known that multiple views of a given image field can be used to provide depth information regarding the object or objects within the field. This technique is particularly useful in the art of robotics, where it is necessary to estimate the depth of one or more objects in connection with controlling the operation of a robot. One technique for providing depth estimation is to use two side-by-side cameras whose imaging planes are co-planar. The depth of objects within the field of view of the two cameras is determined by comparing the contents of the images provided by each camera and detecting, for example, displacements of the objects in one view relative to the other. It is also known that the imaging planes need not be co-planar so long as the image data is processed so as to create the effect that the imaging planes are co-planar.
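By way of illustration only, the relationship between the displacement of an object between two co-planar views and its depth can be sketched as follows. The sketch assumes an idealized pinhole camera model, under which depth equals focal length times camera separation divided by displacement. The function name and the parameters focal_px, baseline and disparity_px are hypothetical and do not appear in the description above.

```python
def depth_from_disparity(focal_px, baseline, disparity_px):
    """Depth of a point seen by two co-planar, side-by-side cameras.

    Assumes an ideal pinhole model (an assumption of this sketch):
      focal_px     -- focal length expressed in pixels
      baseline     -- distance between the two camera centers (any length unit)
      disparity_px -- horizontal displacement of the point between views, in pixels
    The result is in the same length unit as `baseline`.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_px * baseline / disparity_px
```

Under these assumptions, depth is inversely proportional to displacement: doubling the displacement halves the estimated depth, so nearby objects shift more between the two views than distant ones.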
FIGS. 1 and 2 provide examples of a two-image depth estimation system arrangement. In such a system, two images, a left image 120 and a right image 110, can be captured with cameras to provide co-planar images. The left image, provided as input 220, and the right image, provided as input 210, are supplied to a processor 230, which operates in conjunction with the depth estimation module 240 to calculate the depth based on the information in the images provided by the cameras.
In these systems, locating the same point in multi-viewed images is one of the most difficult aspects of developing computational algorithms for stereopsis. The majority of stereo matching approaches can be classified as either block-based or feature-based methods. In block-based methods, matching is done based on the intensity pattern similarity of a block around a given point in a field of view. The aim is to obtain correspondences for every image point. By contrast, feature-based methods assign disparity, which is the lateral difference between matched points, only to feature points such as corner points, edges or zero crossings. Most stereo algorithms, however, tend to produce errors due to, for example, noise, low feature content, depth discontinuities, occlusion, and photometric differences.
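By way of illustration only, a conventional block-based match along one image row can be sketched as follows. The sketch uses a sum of absolute differences as the similarity measure, which is one common choice and is an assumption here, since the description above does not name a particular measure. The function names, the parameters half and max_disp, and the convention that a point at column col in the left image appears at column col - d in the right image are likewise hypothetical.

```python
def sad(left, right, row, col_left, col_right, half):
    """Sum of absolute differences between two (2*half+1)-square blocks."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            a = left[row + dy][col_left + dx]
            b = right[row + dy][col_right + dx]
            total += abs(a - b)
    return total


def match_block(left, right, row, col, half=1, max_disp=6):
    """Return the disparity d in [0, max_disp] whose right-image block,
    centered at (row, col - d), best matches the left-image block at (row, col).

    Images are lists of rows of numeric intensities. The caller is assumed
    to keep the left block inside the image.
    """
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        col_right = col - d
        if col_right - half < 0:
            break  # candidate block would fall off the left edge
        cost = sad(left, right, row, col, col_right, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Because the block has a fixed size and shape, a sketch of this kind is subject to the error sources listed above, for example where the block contains little intensity variation or straddles a depth discontinuity.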
It is well known that, for certain types of matching primitives, finding correspondence is an ill-posed problem. To regularize this problem, a smoothness constraint on the disparity field is generally incorporated, although smoothing over disparity boundaries is physically incorrect. Previous research on block-based techniques showed that the matching window size must be large enough to include sufficient intensity variation for matching, but small enough to avoid the effect of projective distortion. If the window is too small, or does not cover enough intensity variation, the estimation is poor because of a low intensity-to-noise ratio. Conversely, if the window is too large, the estimated depth may not represent a correct match because of over-averaging of the spatial information.
It would be beneficial if a technique could be provided which would improve the analysis of co-planar images for the purposes of performing depth estimation calculations.
The present invention provides a method and an apparatus for facilitating the analysis of multiple images where comparison of the images can be used to provide depth estimation of objects within the images. In particular, in accordance with an embodiment of the present invention, one of the images is divided into a mesh using an adaptive stripe-based patch determination technique. More specifically, a mesh representative of an image is generated by dividing the image into a plurality of horizontal stripe regions and then, within each horizontal stripe region, selecting at least one line segment extending from a top boundary of the region to a bottom boundary of the region from a plurality of line segments, wherein the selection is made with reference to a relative power of the plurality of line segments. In accordance with the present invention, a mesh or patch pattern of an image is thereby created in which the image is divided into a plurality of trapezoidal patch regions.
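By way of illustration only, the stripe-based selection of line segments can be sketched as follows. This summary does not define how the power of a line segment is computed, so the sketch assumes, purely for illustration, that power is the mean squared horizontal intensity gradient sampled along the segment. The function names and the parameters stripe_height and max_slant are hypothetical, and the sketch selects only a single segment per stripe.

```python
def segment_power(image, y_top, y_bot, x_top, x_bot):
    """ASSUMED power measure: mean squared horizontal gradient along the
    segment from (x_top, y_top) to (x_bot, y_bot). Not taken from the source."""
    rows = y_bot - y_top
    width = len(image[0])
    total = 0.0
    for y in range(y_top, y_bot + 1):
        t = (y - y_top) / rows if rows else 0.0
        x = int(round(x_top + t * (x_bot - x_top)))
        x = min(max(x, 1), width - 2)          # keep x-1 and x+1 in range
        grad = image[y][x + 1] - image[y][x - 1]
        total += grad * grad
    return total / (rows + 1)


def best_segment(image, y_top, y_bot, max_slant=2):
    """Among candidate top-to-bottom segments, pick the one of highest power."""
    width = len(image[0])
    best = None
    for x_top in range(1, width - 1):
        lo = max(1, x_top - max_slant)
        hi = min(width - 2, x_top + max_slant)
        for x_bot in range(lo, hi + 1):
            p = segment_power(image, y_top, y_bot, x_top, x_bot)
            if best is None or p > best[0]:
                best = (p, x_top, x_bot)
    return best[1], best[2]


def stripe_mesh(image, stripe_height):
    """Divide the image into horizontal stripes and select one segment in each.
    Returns a list of (y_top, y_bot, (x_top, x_bot)) entries."""
    height = len(image)
    mesh = []
    for y_top in range(0, height - 1, stripe_height):
        y_bot = min(y_top + stripe_height, height - 1)
        mesh.append((y_top, y_bot, best_segment(image, y_top, y_bot)))
    return mesh
```

In this sketch each selected segment, together with the top and bottom boundaries of its stripe and the left and right image borders, bounds two trapezoidal regions. Selecting several segments per stripe would yield a finer division of the stripe into trapezoids.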
In accordance with yet another embodiment of the present invention, once a mesh representation of a first image is created, a second image can be analyzed to find regions in the second image that correspond to trapezoidal regions in the mesh generated for the first image. A disparity between corresponding regions of the two images is then detected. This disparity information can then be utilized in connection with determining a depth of an object appearing in the two images.
The two images of interest can be co-planar, but need not be. It is advantageous if the normal lines of the imaging planes of the two images are parallel.
Other advantages will be clear from the description of the details of the invention which follows.