1. Field of the Invention
The present invention relates to a matching point extracting method for extracting mutually matching points in plural images of an object taken from mutually different positions, for use in determining distance information on the object based on such images.
2. Related Background Art
An apparatus for extracting the matching points of images has conventionally been utilized, for example, in producing a control program for a moving robot or in cutting out the image data of a predetermined object from the image data of a photographed image.
In such a matching point extracting apparatus, for determining the distance to the object in the image, there are known, for example, a method shown in FIG. 1A of employing two or more cameras 133a, 133b for taking the images of the object 131 from different viewing points and utilizing the parallax between the images obtained by such cameras 133a, 133b, and a method shown in FIG. 1B of employing one camera 134, and moving such a camera 134 for taking the images of the object 132 from different viewing points and utilizing the parallax between the time-sequentially obtained images. In these methods shown in FIGS. 1A and 1B, the extraction of the matching points is executed for obtaining the parallax required in the determination of the object distance.
More specifically, in the method shown in FIG. 1A, the matching points are determined in all the points (pixels) in the image data of the image obtained by the left-side camera 133a and in those of the image obtained in synchronization by the right-side camera 133b. Also in the method shown in FIG. 1B, the matching points are determined in all the points (pixels) in the image data of an image obtained at a certain position and in those of another image obtained at a position after movement. The determination is made in all the pixels (points) of the image data because the area of the object within the image is not known.
Among the representative methods for such matching point extraction, there is known a template matching method. In this method, a template (template image) 143 is prepared around a point P for which a matching point is to be determined, as shown in FIG. 2A, and is moved over the entire area of a searched image 142 as shown in FIG. 2B. The similarity is calculated at every position, and the point showing the highest similarity is determined as the matching point.
As the evaluation function for determining the similarity, there can be employed the following function (1) utilizing the difference in the luminance (pixel values) or the following function (2) utilizing the correlation of the luminance (pixel values):

$$E(x, y) = \sum_i \sum_j \left[ F(i, j) - A(i - x, j - y) \right]^2 \tag{1}$$

$$\sigma(x, y) = \frac{\sum_i \sum_j \left\{ F(i, j) \cdot A(i - x, j - y) \right\}}{\sqrt{\sum_i \sum_j F^2(i, j) \cdot \sum_i \sum_j A^2(i, j)}} \tag{2}$$
In the foregoing equations (1) and (2), F(i, j) indicates the luminance value of the pixel at a coordinate (i, j) on the template image, and A(i-x, j-y) indicates the luminance value of the pixel at a coordinate (i-x, j-y) on the image to be searched (hereinafter called a search image). Thus, the function (1) or (2) represents the similarity when the position of the template image is moved by (x, y) on the search image.
With the function (1), the matching point is indicated at a point where E(x, y) reaches a minimum, which can theoretically be as low as 0. With the function (2), the matching point is indicated at a point where σ(x, y) reaches a maximum, which can theoretically be as high as 1.
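The functions (1) and (2) and the exhaustive search over the search image can be illustrated by the following minimal sketch. It is not taken from the specification: the function names are hypothetical, images are assumed to be lists of rows of luminance values, the search image is indexed at (i + x, j + y) as an array-friendly sign convention, and the sum of A squared in function (2) is assumed to run over the window currently under the template.

```python
import math

def ssd(template, search, x, y):
    """Function (1): sum of squared luminance differences at offset (x, y)."""
    return sum(
        (template[j][i] - search[j + y][i + x]) ** 2
        for j in range(len(template))
        for i in range(len(template[0]))
    )

def ncc(template, search, x, y):
    """Function (2): normalized correlation at offset (x, y).

    Assumption: the A^2 sum is taken over the window under the template.
    """
    num = f2 = a2 = 0.0
    for j in range(len(template)):
        for i in range(len(template[0])):
            f = template[j][i]
            a = search[j + y][i + x]
            num += f * a
            f2 += f * f
            a2 += a * a
    denom = math.sqrt(f2 * a2)
    return num / denom if denom else 0.0

def offsets(template, search):
    """All offsets at which the template lies fully inside the search image."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    return [(x, y) for y in range(sh - th + 1) for x in range(sw - tw + 1)]

def best_match_ssd(template, search):
    # The matching point minimizes E(x, y).
    return min(offsets(template, search), key=lambda o: ssd(template, search, *o))

def best_match_ncc(template, search):
    # The matching point maximizes sigma(x, y).
    return max(offsets(template, search), key=lambda o: ncc(template, search, *o))
```

Both search functions return the offset (x, y) of the upper-left corner of the best window, so an exact copy of the template in the search image yields E = 0 and σ = 1 at that offset.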
For determining the similarity, the following evaluation function (3) is also known. It counts the pixels, within the entire template, for which the difference between the luminance value (pixel value) of the pixel and the luminance value of the corresponding pixel in the search image is within a predetermined threshold value ε. The point on the search image showing the largest number of such pixels is defined as the matching point for the center point of the template:

$$k_{ij} = \begin{cases} 1 & \left( \left| F(i, j) - A(i - x, j - y) \right| < \epsilon \right) \\ 0 & \left( \left| F(i, j) - A(i - x, j - y) \right| \geq \epsilon \right) \end{cases} \qquad C(x, y) = \sum_i \sum_j k_{ij} \tag{3}$$
With this function (3), the matching point is indicated at a point where C(x, y) reaches a maximum, which can theoretically reach the number of all the pixels in the template at a maximum.
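The counting function (3) can be sketched in the same manner. This is an illustrative sketch only: the names `match_count` and `best_match_count` are hypothetical, images are assumed to be lists of rows of luminance values, and the search image is indexed at (i + x, j + y) as an assumed sign convention.

```python
def match_count(template, search, x, y, eps):
    """Function (3): number of template pixels whose luminance differs from
    the search pixel beneath them by less than eps at offset (x, y)."""
    return sum(
        1
        for j in range(len(template))
        for i in range(len(template[0]))
        if abs(template[j][i] - search[j + y][i + x]) < eps
    )

def best_match_count(template, search, eps):
    """Return ((x, y), C(x, y)) for the offset that maximizes C(x, y)."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    candidates = [(x, y) for y in range(sh - th + 1) for x in range(sw - tw + 1)]
    best = max(candidates, key=lambda o: match_count(template, search, o[0], o[1], eps))
    return best, match_count(template, search, best[0], best[1], eps)
```

Because each pixel contributes either 0 or 1, a few strongly differing pixels cannot dominate the score as they can in function (1), which is the property relied upon near object boundaries.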
Conventionally, the object distance is determined from the images by extracting the matching points from the images, determining the amount of parallax at each point of the image, and performing triangulation based on such amount of parallax, the focal length and the positional information of the cameras.
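The distance calculation can be illustrated for the simplest geometry. The sketch below is an assumption rather than part of the specification: it supposes two identical cameras with parallel optical axes, a focal length expressed in pixels, and a known baseline between the cameras, in which case the distance is Z = f × B / d for a parallax d. The function and parameter names are hypothetical.

```python
def depth_from_parallax(parallax_px, focal_length_px, baseline_m):
    """Distance to a point, assuming parallel-axis stereo: Z = f * B / d.

    parallax_px     -- parallax (disparity) between the matching points, in pixels
    focal_length_px -- focal length in pixels (assumed equal for both cameras)
    baseline_m      -- distance between the two camera positions, in meters
    """
    if parallax_px <= 0:
        # Zero parallax corresponds to a point at infinite distance.
        return float("inf")
    return focal_length_px * baseline_m / parallax_px
```

Since the distance is inversely proportional to the parallax, an error of a pixel or two in the matching point matters most for distant objects, where the parallax itself is small.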
Such a template matching method, however, in extracting the matching points between the images that have recorded an object at a finite distance from the cameras, has been associated with a drawback that the precision of extraction becomes significantly deteriorated at a certain portion of the object.
Such a drawback will be explained with reference to FIGS. 3A to 3C, which show images having recorded a car as the main object, by the two cameras 133a, 133b shown in FIG. 1A, wherein FIG. 3A shows an image obtained by the left-side camera 133a while FIGS. 3B and 3C show images obtained by the right-side camera 133b. For the convenience of explanation, the car is considered as the main object while the house and the tree are considered as the background and are assumed to be distant from the car and positioned in the back of the scene.
Now, there is considered a case of cutting out a template (template image) from the image shown in FIG. 3A, taken as a reference image. In this operation, a certain size is required for the template.
This is because the area around a point A in the sky or a point B on the door of the car in the image shown in FIG. 3A shows little change in the luminance, lacking conspicuous features and having similar luminance values therearound. Likewise, the area of superposed leaves of the tree at a point C is surrounded by areas of similar luminance distribution. For the extraction of exact matching points in such areas A, B and C, the template therefore needs a certain size for detecting the variation in the distribution of the luminance.
For example, if the template is defined in the illustrated size around the point B in FIG. 3A, the template 151 contains a tire in addition to the door of the car, thus showing a large variation in the luminance and allowing an exact determination of the matching point.
Now, let us consider a point D at an end of the car, which is the main object in the image shown in FIG. 3A. If the template 152 is prepared around the point D, with the same size as around the point B, the template 152 includes the house in the background, in addition to the car constituting the main object. The matching point extraction with such a template 152 results, as shown in FIG. 3C, in an erroneous matching point F instead of the correct matching point E.
Such an erroneous matching arises in the calculation of the evaluation function (similarity) in the example shown in FIGS. 3A to 3C as follows. With respect to the point E of the search image, the area corresponding to the car in the template image shows a high similarity because the image remains the same in the template image and in the search image, but the area corresponding to the background of the template image shows a low similarity because of the change in the image, whereby the point E is judged not to be the matching point. With respect to the point F of the search image, on the other hand, the area corresponding to the background of the template image shows a high similarity because the image remains the same in the template image and in the search image, and the area corresponding to the car of the template image also shows a high similarity because, despite the change in the image, a high proportion of that area happens to have similar luminance values, whereby the point F is finally judged to be the matching point.
Such an erroneous matching is a phenomenon generally resulting from a fact that the car constituting the main object is positioned in front of the house and the tree constituting the background, thus providing a difference in the parallax. Thus, in the case of taking an object at a finite distance as the main object, the accuracy of the matching point extraction is deteriorated in an area close to the boundary between the main object and the background, or, if the main object contains a portion showing a significant change in the object distance, in an area close to the boundary of such a portion.
As explained in the foregoing, in employing the conventional template matching method for the matching point extraction, the template is required to have a certain size, but the template of such a certain size results in a deterioration in the accuracy of matching point extraction in an area close to the boundary between the main object and the background.
On the other hand, the evaluation function (3) is effective, to a certain extent, for preventing the above-mentioned deterioration of the accuracy of the matching point extraction in the area close to the boundary between the main object and the background.
However, for example, when the background occupies a major portion of the template, the number of pixels in the template showing differences in luminance within the threshold value ε from a pixel in the search image becomes larger in a portion corresponding to the main object in the template image than in a portion corresponding to the background, whereby the accuracy of matching point extraction becomes deteriorated.
Another drawback lies in a fact that an occlusion area, which exists in the images prepared as shown in FIGS. 1A and 1B and which does not have the matching point, is difficult to detect.
Such a drawback will be explained with reference to FIGS. 4A and 4B, showing images obtained by the two cameras 133a, 133b shown in FIG. 1A.
As an example, the matching point for a point I in the image shown in FIG. 4A is hidden by the car in the image shown in FIG. 4B and does not exist, therefore, in this image. Such an area is called an occlusion area, and does not contain the matching point, so that the matching point extraction should be inhibited even if the similarity is evaluated as high.
For identifying such an occlusion area, there is known a method of at first preparing a template image around a point in the reference image, then searching, on the search image, a candidate matching point for a certain point (center point of the template image) of the reference image, preparing a new template image around the thus obtained candidate matching point in the search image, and searching a candidate matching point on the reference image which is now taken as a search image, utilizing the thus prepared new template image around the first-mentioned candidate matching point. If the candidate matching point searched in the reference image coincides with the center point of the initially prepared template, the candidate matching point in the search image is identified as the matching point for the center point of the initially prepared template image, but, in the case of absence of coincidence, it is identified that the matching point does not exist.
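The two-way search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the conventional apparatus itself: the function names are hypothetical, images are lists of rows of luminance values, the template is a square window of (2 × half + 1) pixels per side, the similarity is the squared-difference function (1), and the queried point is assumed to lie at least `half` pixels from the image border.

```python
def window_ssd(a, ax, ay, b, bx, by, half):
    """Squared-difference sum between the window centered at (ax, ay) in
    image a and the window centered at (bx, by) in image b."""
    return sum(
        (a[ay + dy][ax + dx] - b[by + dy][bx + dx]) ** 2
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
    )

def find_candidate(src, px, py, dst, half):
    """Center (x, y) in dst that best matches the template around (px, py) in src."""
    centers = [
        (x, y)
        for y in range(half, len(dst) - half)
        for x in range(half, len(dst[0]) - half)
    ]
    return min(centers, key=lambda c: window_ssd(src, px, py, dst, c[0], c[1], half))

def cross_check(reference, px, py, search, half=1):
    """Return the matching point in the search image, or None when the
    backward search does not return to the starting point (px, py)."""
    cand = find_candidate(reference, px, py, search, half)
    back = find_candidate(search, cand[0], cand[1], reference, half)
    return cand if back == (px, py) else None
```

A result of None marks the point as having no matching point, as for a point in an occlusion area. The check can only reject a candidate whose backward search lands elsewhere, so it cannot detect a forward error that is mirrored by the backward search.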
However, such a method may still produce erroneous matching in two cases. In the first case, a point J in the reference image shown in FIG. 4A erroneously extracts a candidate matching point K in the search image shown in FIG. 4B because of a window in the background, and the template image prepared around the point K again erroneously extracts the point J in the reference image as a candidate matching point, namely a case of doubled erroneous extractions. In the second case, a point L in the reference image shown in FIG. 4A correctly extracts a point M in the search image shown in FIG. 4B, but the point M erroneously extracts a candidate matching point N in the reference image shown in FIG. 4A because of the tree and the ground in the background.
An object of the present invention is to provide a matching point extracting method capable of resolving the above-mentioned drawbacks, and an apparatus therefor.
Another object of the present invention is to provide a matching point extracting method, enabling highly precise matching point extraction in the entire area of the image to be used, without distinction of the object and the background therein, and an apparatus therefor.
The above-mentioned objects can be attained, according to an embodiment of the present invention, by a matching point extracting method and an apparatus therefor for determining matching points between plural images based on the template matching method, comprising:
a template formation step of preparing mutually different plural templates from an arbitrary one of the plural images; and
a matching point determination step of determining the matching points of the plural images, utilizing, in an image area showing a specified condition, one of the plural templates prepared in the template formation step.
Still another object of the present invention is to provide a matching point extracting method, enabling matching point extraction within a short time in the entire area of the image to be used, without distinction of the object and the background therein, and an apparatus therefor.
The above-mentioned object can be attained, according to an embodiment of the present invention, by a matching point extracting method and an apparatus therefor for determining matching points between plural images based on the template matching method, comprising:
a first template formation step of preparing an initial template from one of plural input images;
a similarity operation step of determining a similarity utilizing the pixel value of each pixel in the template prepared in the first template formation step and the pixel value of each pixel in another of the input images;
a decision step of judging a specified condition based on the similarity determined in the similarity operation step; and
a second template formation step of preparing another template based on the result of the judgment in the decision step.
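The four steps above may be pictured with the following sketch, which is only one possible reading and not the embodiment itself. The specified condition and the form of the second template are not stated in this passage, so the sketch assumes, purely for illustration, that the condition is a best similarity below a threshold `min_ratio` and that the other template is a smaller square window. All names are hypothetical, and the similarity is the count of function (3) divided by the number of template pixels.

```python
def count_ratio(ref, px, py, search, cx, cy, half, eps):
    """Fraction of template pixels around (px, py) in ref that differ by less
    than eps from the pixels around (cx, cy) in search."""
    side = 2 * half + 1
    hits = sum(
        1
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
        if abs(ref[py + dy][px + dx] - search[cy + dy][cx + dx]) < eps
    )
    return hits / (side * side)

def best_candidate(ref, px, py, search, half, eps):
    """Similarity operation: best center in search and its similarity."""
    centers = [
        (x, y)
        for y in range(half, len(search) - half)
        for x in range(half, len(search[0]) - half)
    ]
    best = max(centers, key=lambda c: count_ratio(ref, px, py, search, c[0], c[1], half, eps))
    return best, count_ratio(ref, px, py, search, best[0], best[1], half, eps)

def match_with_fallback(ref, px, py, search, halves=(2, 1), eps=5, min_ratio=0.9):
    """Try the initial template first; when the decision (assumed here to be
    similarity < min_ratio) fails, prepare the next, smaller template."""
    result = None
    for half in halves:
        center, ratio = best_candidate(ref, px, py, search, half, eps)
        result = (center, ratio, half)
        if ratio >= min_ratio:
            break
    return result
```

The returned tuple holds the candidate center, its similarity, and the half-size of the template that produced it, so a caller can tell whether the initial template sufficed.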
Still another object of the present invention is to provide a matching point extracting method and an apparatus therefor capable of precisely determining an occlusion area or an area containing a large change in the distance in the image to be used, thereby enabling highly precise matching point extraction.
The above-mentioned object can be attained, according to an embodiment of the present invention, by a method for extracting, in synthesizing plural images, matching points of such plural images and an apparatus therefor, comprising:
a template image data formation step of entering plural sets of image data each corresponding to an image of an image frame, taking an image represented by the image data corresponding to an image frame among the thus entered image data of plural image frames as a reference image, extracting the image data, corresponding to a partial area in such a reference image, as a template image and outputting the template image data corresponding to the thus extracted template image;
a matching candidate extraction step of comparing the values of the pixel data constituting the template image data outputted by the template image data formation step with the pixel data constituting the image data corresponding to each of search images which are the images, other than the reference image, of the image data of the plural image frames entered in the template image data formation step, determining the number of pixel data showing differences smaller than a predetermined value, calculating the similarity of such a search image to the template image according to such a number of pixel data, extracting the search image showing a maximum similarity as a matching candidate for such a template image, and outputting such a matching candidate together with such a similarity;
a decision step of judging, based on the similarity data outputted by the matching candidate extraction step, whether the pixel data constituting the template image satisfy a predetermined condition; and
a template image changing step of, when the decision step identifies that the pixel data constituting the template image satisfy such a predetermined condition, changing the shape of the template image represented by the template image data outputted by the template image data formation step and outputting such a changed template image to the matching candidate extraction step.
Still other objects of the present invention, and the features thereof, will become fully apparent from the following detailed description of the embodiments, to be taken in conjunction with the attached drawings.