Hitherto known object recognition methods include feature extraction using the Karhunen-Loeve transform and similar methods. Known examples include “Visual Learning and Recognition of 3-D Objects from Appearance” by H. Murase and S. K. Nayar (International Journal of Computer Vision, 14, 1995), Japanese Laid-open Patent No. 8-271223, and Japanese Laid-open Patent No. 9-53915.
A conventional object recognition apparatus is explained with reference to a drawing. In FIG. 22, the conventional object recognition apparatus comprises an image input unit 11, such as a camera, for entering an image; a learning model memory unit 13 for preparing and storing, from learning images, local models of the target object for recognition; a feature extractor 12 for extracting the feature of an input image; a learning feature memory unit 14 for storing the feature (learning feature) of each model; a matching processor 15 for matching the feature of the input image with the feature of each model; and an object type estimator 16 for judging and issuing the type of the target object for recognition in the input image. Herein, the type refers to the individual or the kind.
The operation is described below. When an input image including a target object for recognition is entered in the feature extractor 12 through the image input unit 11, the feature extractor 12 extracts a feature from the input image and issues the feature to the matching processor 15. The matching processor 15 sequentially searches the models in the learning model memory unit 13 and, for each model, selects the corresponding learning feature from the learning feature memory unit 14. It calculates the similarity measure between the input image feature and the learning feature, and issues the similarity measure to the object type estimator 16. The matching processor 15 repeats this procedure of similarity measure calculation and output for each model in the learning model memory unit 13. The object type estimator 16 then determines that the target object for recognition included in the input image belongs to the type of the model for which the similarity measure is the maximum.
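The matching and type estimation steps described above can be illustrated with a minimal sketch. It is not taken from the cited references. The function names, the use of cosine similarity as the similarity measure, and the three-element feature vectors are all assumptions made for illustration, and feature extraction (for example, by the Karhunen-Loeve transform) is assumed to have been performed already.

```python
import math


def cosine_similarity(a, b):
    """Similarity measure between two feature vectors (an assumed choice)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as having no similarity
    return dot / (norm_a * norm_b)


def estimate_object_type(input_feature, learning_features):
    """Return the (type, similarity) pair with the maximum similarity.

    learning_features maps a hypothetical type label to its stored
    learning feature, standing in for memory units 13 and 14.
    """
    best_type, best_score = None, float("-inf")
    # Matching processor: visit every model and compute its similarity.
    for object_type, learning_feature in learning_features.items():
        score = cosine_similarity(input_feature, learning_feature)
        # Object type estimator: keep the model with the maximum similarity.
        if score > best_score:
            best_type, best_score = object_type, score
    return best_type, best_score


# Hypothetical learning features for three object types.
learning_features = {
    "car": [1.0, 0.0, 0.0],
    "truck": [0.0, 1.0, 0.0],
    "bus": [0.0, 0.0, 1.0],
}

best_type, best_score = estimate_object_type([0.9, 0.1, 0.0], learning_features)
print(best_type, round(best_score, 3))
```

Note that such a scheme returns the best-scoring stored model for any input, so an input feature that resembles none of the stored models is still assigned to one of them unless a threshold on the similarity measure is added.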
In this apparatus, the input image is overlapped with various learning images, and the degree of overlap is judged by using the similarity measure. An object identical to one in the learning images can therefore be recognized, but when the input image includes an object that has not been learned, it is difficult to estimate and recognize the object.
Moreover, even when recognizing the same object as in a learning image, recognition is difficult if there is no information about the distance to the position of the object. Obtaining the distance information with the imaging device alone requires a three-dimensional camera, and the signal processing becomes complicated.