1. Field of the Invention
The present invention relates in general to computer vision and more particularly to a method and a system for progressive stereo matching of digital images that represent a scene.
2. Related Art
Stereo matching of digital images is widely used in computer vision applications (for example, fast object modeling and prototyping for computer-aided drafting (CAD), object segmentation and detection for human-computer interaction (HCI), video compression, and visual surveillance) to provide three-dimensional (3-D) depth information. Stereo matching obtains images of a scene from two or more cameras positioned at different locations and orientations relative to the scene. These digital images are obtained from the cameras at approximately the same time, and points in each of the images are matched to the corresponding 3-D point in space. In general, points from different images are matched by searching a portion of the images and using constraints (such as an epipolar constraint) to correlate a point in one image to a point in another image.
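By way of illustration only, the epipolar constraint mentioned above may be sketched as follows. The sketch assumes a known 3x3 fundamental matrix F relating the two views; the function names and the rectified-pair matrix F_RECTIFIED are hypothetical and are not taken from any particular system. A point in the first image maps to a line in the second image, and a candidate match is plausible only if it lies near that line.

```python
def epipolar_line(F, point):
    """Return (a, b, c) of the line a*x + b*y + c = 0 in image 2
    on which the match for `point` (from image 1) must lie."""
    x, y = point
    h = (x, y, 1.0)  # homogeneous coordinates of the image-1 point
    return tuple(sum(F[r][c] * h[c] for c in range(3)) for r in range(3))


def epipolar_distance(F, p1, p2):
    """Perpendicular distance (in pixels) from p2 to the epipolar line of p1."""
    a, b, c = epipolar_line(F, p1)
    norm = (a * a + b * b) ** 0.5
    return abs(a * p2[0] + b * p2[1] + c) / norm


# Assumed example: fundamental matrix of an ideal rectified pair
# (cameras differ only by a horizontal translation), for which
# corresponding points share the same image row.
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

on_line = epipolar_distance(F_RECTIFIED, (10.0, 20.0), (4.0, 20.0))
off_line = epipolar_distance(F_RECTIFIED, (10.0, 20.0), (4.0, 23.0))
```

In this assumed rectified case the constraint reduces the search for a match from the whole image to a single row, so a candidate on row 20 has zero distance while a candidate on row 23 lies three pixels off the line.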
Several types of matching techniques exist and generally may be classified into two broad categories: feature matching and template matching. Feature matching extracts salient features from the images (such as corners and edge segments) and matches these features across two or more views. One disadvantage of feature matching, however, is that only a small subset of image pixels is used to match features. Consequently, if the image pixels used in the matching process are unreliable, only a coarse and inaccurate 3-D representation of the actual scene is produced. Template matching assumes that corresponding portions of the images are similar and attempts to correlate these similarities across views. Although this assumption may be valid for relatively textured portions of an image and for image pairs having only small differences, it may lead to unreliable matching at occlusion boundaries and within featureless regions of an image. In addition, template matching yields many false matches due in part to the unreliability of matched image pixels.
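As a minimal sketch of template matching as characterized above, and not of any particular prior system, the following compares a small window around a pixel in a left image with windows along the same row of a right image and keeps the candidate with the lowest sum of squared differences (SSD). The choice of SSD, the rectified-pair assumption, and all names and parameters (ssd, match_pixel, make_shifted_pair, half, max_disparity) are assumptions made for illustration. Images are plain lists of rows of intensities, and the caller is assumed to keep the left window inside the image.

```python
def ssd(left, right, row, col_left, col_right, half):
    """Sum of squared differences between two (2*half+1)-square windows."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = left[row + dy][col_left + dx] - right[row + dy][col_right + dx]
            total += diff * diff
    return total


def match_pixel(left, right, row, col, half=1, max_disparity=4):
    """Search a fixed disparity range along one row; return (disparity, cost)."""
    width = len(right[0])
    best_d, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        c = col - d  # candidate column in the right image
        if c - half < 0 or c + half >= width:
            continue  # candidate window would leave the image
        cost = ssd(left, right, row, col, c, half)
        if cost < best_cost:  # ties keep the first (smallest) disparity
            best_d, best_cost = d, cost
    return best_d, best_cost


def make_shifted_pair(height=5, width=12, shift=2):
    """Synthetic textured pair: the left image is the right image moved by `shift`."""
    right = [[(c * c + 3 * r) % 17 for c in range(width)] for r in range(height)]
    left = [[right[r][c - shift] if c >= shift else 0 for c in range(width)]
            for r in range(height)]
    return left, right


left, right = make_shifted_pair()
textured_result = match_pixel(left, right, 2, 7)

# A featureless (constant) region: every candidate has the same cost.
flat = [[5] * 12 for _ in range(5)]
flat_result = match_pixel(flat, flat, 2, 7)
```

On the textured synthetic pair the search recovers the shift of two pixels with zero cost. On the constant image every candidate ties at zero cost, so the reported disparity is merely the first one tried, which illustrates why matching within featureless regions is unreliable.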
These and other existing stereo matching techniques share two disadvantages: an appropriate search range is difficult to specify, and the search range cannot be adapted to the observed scene structure. Existing dense matching techniques use the same search range (such as a small search window) for the entire image and thus may yield many false matches. In addition, some matching techniques consider only one match at a time and propagate the match in a small area, which also has the disadvantage of yielding many false matches.
Accordingly, what is needed is a method and system for stereo matching that is capable of reducing false matches by adapting a search range depending on the scene structure. Further, this method and system would start with reliable pixel matches and progressively add reliable and unambiguous matches. What is also needed is a method and system for stereo matching that uses an adaptable search range that is dynamically determined by unambiguous pixel matches. Whatever the merits of the above-mentioned systems and methods, they do not achieve the benefits of the present invention.