Stereo matching techniques have been proposed in which the same object is captured by imaging devices placed at different positions and, on the basis of the plurality of captured images, corresponding points on the object are searched for in the respective images. If the image taken by one imaging device and used as the basis is called a base image and the images taken by the other imaging devices are called reference images, stereo matching means searching the reference images for regions corresponding to a region of interest (a region including at least one pixel) in the base image.
In typical stereo matching, the region in the reference image corresponding to the region of interest is searched for by optimizing the degree of similarity of image features and the continuity of parallax over the whole image. Specifically, regions in the reference image whose image features resemble those of the region of interest are retrieved as candidate regions (optimization of the degree of similarity) and, out of the candidate regions, the one whose parallax is continuous with that of a pixel adjacent to the region of interest in the base image (an adjacent pixel) is selected as the region corresponding to the region of interest (optimization of continuity). Generally, a pixel value, an SSD (Sum of Squared Differences), or an SAD (Sum of Absolute Differences) is used as the image feature.
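The two optimizations described above can be illustrated with a minimal sketch for a single scanline. It is not taken from any particular implementation. The function names, the window size, the number of retained candidates, and the weighted-sum form of the continuity term are all assumptions made for illustration, and the sketch assumes rectified images in which a point at column x in the base image appears at column x - d in the reference image.

```python
def sad(a, b):
    """Sum of Absolute Differences between two equal-length pixel windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of Squared Differences between two equal-length pixel windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def candidate_disparities(base_row, ref_row, x, half_window, max_disparity,
                          cost=sad, keep=3):
    """Similarity step: return up to `keep` (cost, disparity) pairs, best first.

    Assumes the window around x lies fully inside both rows on the right side;
    disparities that would run off the left edge are skipped.
    """
    lo, hi = x - half_window, x + half_window + 1
    window = base_row[lo:hi]
    scored = []
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # window would leave the reference row
        scored.append((cost(window, ref_row[lo - d:hi - d]), d))
    return sorted(scored)[:keep]


def pick_continuous(candidates, neighbor_disparity, smoothness_weight=1.0):
    """Continuity step: pick the candidate disparity that balances matching
    cost against its jump from the adjacent pixel's disparity.

    The weighted sum used here is an assumed, simplified form of the
    continuity term, not a formula stated in the text.
    """
    best = min(
        candidates,
        key=lambda c: c[0] + smoothness_weight * abs(c[1] - neighbor_disparity),
    )
    return best[1]


# Hypothetical scanlines: the reference row is the base row shifted by 2 pixels.
base_row = [0, 0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0]
ref_row = [0, 0, 10, 50, 90, 50, 10, 0, 0, 0, 0, 0]

candidates = candidate_disparities(base_row, ref_row, x=6, half_window=1,
                                   max_disparity=4)
disparity = pick_continuous(candidates, neighbor_disparity=2)
```

In this sketch the similarity step alone can leave several candidates with equal cost. The continuity term then decides between them by preferring the disparity closest to that of the adjacent pixel.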
As a conventional technique, a method is known in which, when a plurality of points corresponding to a pixel (a pixel of interest) in the base image are retrieved, distance values (interest distance values) between the pixel of interest and the respective corresponding points are calculated, and the corresponding point that minimizes the difference between the interest distance value and the distance values (adjacent distance values) of a plurality of pixels present around the pixel of interest is selected (a first conventional technique). A technique is also known in which the continuity of parallax is not evaluated in a region having a high edge intensity, so that the continuity of parallax is not evaluated across the boundary of the object (a second conventional technique). A technique is also conceivable in which a region of the object is detected by performing segmentation of an image and the corresponding points are retrieved from within the detected region (a third conventional technique).
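The selection rules of the first and the second conventional techniques can be sketched as follows. This is only an illustrative reading of the description above. The function names, the use of a summed absolute difference as the measure of "difference", the fallback to the lowest matching cost at edges, and the edge threshold are all assumptions and are not specified in the text.

```python
def select_by_adjacent_distance(candidate_distances, adjacent_distances):
    """First technique (sketch): return the index of the candidate whose
    interest distance value differs least, in total, from the adjacent
    distance values. The summed absolute difference is an assumed measure."""
    def total_difference(distance):
        return sum(abs(distance - a) for a in adjacent_distances)

    return min(
        range(len(candidate_distances)),
        key=lambda i: total_difference(candidate_distances[i]),
    )


def select_with_edge_gate(candidate_costs, candidate_distances,
                          adjacent_distances, edge_intensity, edge_threshold):
    """Second technique (sketch): skip the continuity evaluation where the
    edge intensity is high, so continuity is not enforced across an object
    boundary. Falling back to the lowest matching cost is an assumption."""
    if edge_intensity >= edge_threshold:
        return min(range(len(candidate_costs)),
                   key=lambda i: candidate_costs[i])
    return select_by_adjacent_distance(candidate_distances, adjacent_distances)


# Hypothetical values for one pixel of interest with three candidates.
candidate_costs = [0.9, 0.5, 0.1]
candidate_distances = [1.0, 2.1, 5.0]
adjacent_distances = [2.0, 2.2, 1.9]

smooth_choice = select_by_adjacent_distance(candidate_distances,
                                            adjacent_distances)
edge_choice = select_with_edge_gate(candidate_costs, candidate_distances,
                                    adjacent_distances,
                                    edge_intensity=80, edge_threshold=50)
flat_choice = select_with_edge_gate(candidate_costs, candidate_distances,
                                    adjacent_distances,
                                    edge_intensity=10, edge_threshold=50)
```

With these hypothetical values, the continuity-based rule selects the second candidate, whose distance value is close to those of the surrounding pixels, whereas at a strong edge the gate ignores the surrounding pixels and selects the candidate with the lowest matching cost.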
The first and the second conventional techniques, however, employ a method in which the corresponding point positions are optimized over the whole of an image. Consequently, when the region of the object (an object region) in the image is small, the first and the second conventional techniques are easily affected by errors (errors due to noise, for example) occurring in a background region that is larger than the object region. In some cases, this prevents the corresponding point positions in the object region from being obtained correctly.
The third conventional technique detects the object region by performing segmentation, but it is hard to extract the object region with high accuracy in both the base image and the reference image. In some cases, this likewise prevents the corresponding point positions in the object region from being obtained correctly.