1. Field of the Invention
The present invention relates to an image recognition method that identifies an object by searching for the position and orientation of the object included in image data.
2. Description of the Related Art
Various image recognition methods for identifying a specific object from image data are known. A common method stores multi-grayscale image data obtained by shooting with a CCD camera (hereafter “raw image data”) in a memory and uses binary images of this raw image data to distinguish the object from the background. In the case of a CCD where each pixel has a resolution of 256 grayscale levels, for example, the image light input to the CCD is sampled as a 256-level digital signal, and each sample is then compared with a threshold value and converted into one of two digital values (e.g. “0” or “1”).
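The threshold comparison described above can be sketched as follows. This is a minimal illustration, not the conventional method's actual implementation: the function name `binarize`, the sample image, and the choice of "at or above the threshold maps to 1" are assumptions made for the example.

```python
def binarize(image, threshold):
    """Convert each luminance value (0-255) to 1 if it is at or above
    the threshold, otherwise 0. The >= convention is an assumption."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


# Hypothetical 3x4 raw image: dark background (20), a mid-gray region (120)
# and a bright object (200).
raw = [
    [20, 200, 200, 20],
    [20, 200, 200, 120],
    [20, 20, 120, 120],
]

print(binarize(raw, threshold=100))  # the mid-gray region becomes 1 as well
print(binarize(raw, threshold=150))  # only the bright object becomes 1
```

The two calls differ only in the threshold value, which shows how strongly the binary image depends on that single setting.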
The image recognition method using binary images, however, has the following drawbacks.
When a raw image includes a shadow of an object or mirror reflection light, the shadow and the reflection may be amplified into noise during binarizing processing, depending on the setting of the threshold value, which may cause an object recognition error. For example, when a square object placed on a workbench is shot by a CCD camera (128×128 pixels) as shown in FIG. 23, the raw image has the luminance value distribution shown in the three-dimensional graph in FIG. 24. In the distribution shown in FIG. 24, the differences in luminance value among the object, a wall disposed at the lower right of the object, and the workbench are clearly visible. If the raw image shown in FIG. 23 is binarized, however, the binary image becomes like the one in FIG. 25, where the luminance value of the wall is amplified to the same level as the object and becomes noise, so that it is difficult to distinguish the wall from the object.
Image recognition methods used to remove such noise include a method of performing fine line processing on a binary image, and a method of performing fine line processing on a binary image and then recognizing the end points, branching points and lengths of skeletal lines, and the crossing points and angles between skeletal lines, so that geometric features of the object are extracted in order to identify the object. The problem with these methods is that the processing time is long. To decrease the influence of the above-mentioned noise, it is also possible to set two level values in advance and to binarize only the image signals between these level values. This method, however, is unsuitable for image processing that must run in real time, since the level values are difficult to select and the processing time is long. In the case of a visual servo-mechanism using a feedback loop, for example, the above-mentioned processing methods take too much time to recognize an object: by the time recognition of a moving object at a certain position is completed, the object has already moved to a distant position, which makes it difficult to track.
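The two-level variant mentioned above can be sketched as follows, under one possible reading of the text: a pixel is set to 1 only when its luminance lies between the two preset level values. The function name `binarize_band` and the sample values are hypothetical.

```python
def binarize_band(image, low, high):
    """Set a pixel to 1 only if low <= luminance <= high, otherwise 0.
    Treating both bounds as inclusive is an assumption."""
    return [[1 if low <= pixel <= high else 0 for pixel in row] for row in image]


# Hypothetical single-row image: background (20), wall (120), object (200).
row_image = [[20, 120, 200]]

print(binarize_band(row_image, low=150, high=255))  # keeps the object only
print(binarize_band(row_image, low=100, high=150))  # keeps the wall only
```

Both level values have to be chosen for the scene at hand, which is the selection difficulty the text points out.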
Also, in the case of the above-mentioned processing methods, the threshold level that was set becomes inappropriate when the lighting conditions on the moving object and the background image change as the object moves, so a recognition error of the object occurs. In such a case, an optimum threshold value must be calculated again as the conditions change, but deciding on an optimum threshold value is difficult, and the processing time tends to be long.
Also, when a CCD image sensing device is used, the characteristics of the CCD change according to the ambient temperature. That is, the luminance value of the output image changes because the amount of charge stored for a given amount of light received by the CCD changes, so a threshold value once set may become inappropriate. In such a case, the threshold value must be calculated again, and just as in the above case, deciding on an optimum threshold value is difficult, and the processing time tends to be long.
With the foregoing in view, it is an object of the present invention to provide an image recognition method with a high real-time characteristic where (1) the position and orientation information of an object is accurately determined, regardless of noise and of changes in the luminance value of the object caused by changes of the ambient environment, such as illumination conditions, without executing binary processing, and (2) the object can be accurately recognized at very high speed.
To achieve the above object, the inventors, while researching image processing methods that use raw image data without involving binary processing, intensively studied the genetic algorithm method (hereafter “GA method”) and arrived at the present invention.
The image recognition method in accordance with the present invention comprises the steps of storing search models, where all or a part of the shape and luminance value distribution of an object are modeled as image data in advance, obtaining input image data including the object, distributing the search models in the area of the input image data and assigning position and orientation information to each one of the search models as individual information, determining a function value that indicates correlation with the object for each one of the search models, and searching for a solution of at least the position information of the object from the above input image data by evolving the search models using a genetic algorithm based on the above function values.
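One way to picture the individual information assigned to each search model is sketched below. The text does not specify a data layout, so everything here is an assumption for illustration: the `Individual` class, the representation of a model as a list of point offsets, the function `place_model`, and the use of radians for orientation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    """Hypothetical individual: position and orientation of one search model."""
    x: float      # column of the model origin in image coordinates
    y: float      # row of the model origin in image coordinates
    theta: float  # orientation in radians


def place_model(model_points, ind):
    """Rotate the model's point offsets by theta, translate them to (x, y),
    and round to integer pixel coordinates (column, row)."""
    c, s = math.cos(ind.theta), math.sin(ind.theta)
    placed = []
    for u, v in model_points:
        px = ind.x + c * u - s * v
        py = ind.y + s * u + c * v
        placed.append((round(px), round(py)))
    return placed


# Hypothetical search model: the interior of a 3x3 square centred on its origin.
SQUARE = [(u, v) for u in (-1, 0, 1) for v in (-1, 0, 1)]

print(place_model(SQUARE, Individual(x=4, y=4, theta=0.0)))
```

With this encoding, evolving a search model amounts to changing only the three numbers in `Individual`; the stored model itself stays fixed.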
In other words, the search models are prepared in advance and a plurality of them are distributed in the area of the above input image data, so that some search models overlap with the object. A function value that evaluates the correlation between the object included in the input image data and each search model is then determined (in the genetic algorithm this function value is called the “goodness-of-fit”, and this phrase is used hereafter), and the degree to which the search model overlaps with the object is thereby evaluated. The search models are evolved using the GA method based on this goodness-of-fit, and the position information and orientation information of the selected search model are regarded as the position information and orientation information of the object, whereby the object is recognized. Since image processing can be performed directly on the multi-grayscale input image data, the amplification of noise that occurs in the conventional method during binary processing, when a change of the luminance value causes the threshold value to deviate from its optimum, does not occur. Image processing can therefore be executed smoothly, and the position information and orientation information of the object can be recognized accurately.
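The overall flow described above (distribute, evaluate goodness-of-fit, evolve, take the best model's position and orientation) can be sketched as follows. This is an illustrative sketch only: the text does not specify the genetic operators, so the elitist selection, uniform crossover, Gaussian mutation, population size, and the plain luminance sum used as goodness-of-fit are all assumptions, as are the names `fitness`, `ga_search`, `IMAGE` and `MODEL`.

```python
import math
import random


def fitness(image, model_points, ind):
    """Assumed goodness-of-fit: sum of luminance under the placed model.
    Model points that fall outside the image contribute 0."""
    x, y, theta = ind
    c, s = math.cos(theta), math.sin(theta)
    height, width = len(image), len(image[0])
    total = 0
    for u, v in model_points:
        px = round(x + c * u - s * v)
        py = round(y + s * u + c * v)
        if 0 <= px < width and 0 <= py < height:
            total += image[py][px]
    return total


def ga_search(image, model_points, pop_size=30, generations=40, seed=0):
    """Evolve (x, y, theta) individuals; return the best one and the best
    goodness-of-fit recorded at each generation."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])

    def score(ind):
        return fitness(image, model_points, ind)

    # Random initial distribution of search models over the image area.
    population = [
        (rng.uniform(0, width - 1), rng.uniform(0, height - 1),
         rng.uniform(0, 2 * math.pi))
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        elite = population[: pop_size // 2]  # selection: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene is taken from one of the parents.
            cx, cy, ct = (rng.choice(pair) for pair in zip(a, b))
            # Gaussian mutation of position (pixels) and orientation (radians).
            children.append((cx + rng.gauss(0, 1.0),
                             cy + rng.gauss(0, 1.0),
                             ct + rng.gauss(0, 0.1)))
        population = elite + children
    best = max(population, key=score)
    history.append(score(best))
    return best, history


# Hypothetical 16x16 raw image: background 10, a 4x4 object of luminance 200.
IMAGE = [[10] * 16 for _ in range(16)]
for row in range(6, 10):
    for col in range(6, 10):
        IMAGE[row][col] = 200
MODEL = [(u, v) for u in range(-2, 2) for v in range(-2, 2)]

best, history = ga_search(IMAGE, MODEL)
print(best, history[-1])
```

Because the better half of each generation is carried over unchanged, the best goodness-of-fit never decreases from one generation to the next. No thresholding step appears anywhere: the search works on the grayscale values directly.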
When a plurality of search models are distributed in the area of the above input image data, it is preferable to distribute the search models randomly by generating the position information and orientation information of each search model. By this method, all input image data can be efficiently searched.
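The random distribution described above can be sketched as follows. The function name `distribute_models`, the tuple layout `(x, y, theta)`, the uniform distribution, and the orientation range of 0 to 2π radians are assumptions made for this illustration; the 128×128 image size is taken from the CCD example earlier in the text.

```python
import math
import random


def distribute_models(count, width, height, rng):
    """Generate `count` search-model placements with uniformly random
    position inside the image area and uniformly random orientation."""
    return [
        (rng.uniform(0, width), rng.uniform(0, height), rng.uniform(0, 2 * math.pi))
        for _ in range(count)
    ]


rng = random.Random(42)  # fixed seed only so that the sketch is repeatable
models = distribute_models(count=50, width=128, height=128, rng=rng)
print(models[0])
```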
For the above function value, the sum of values determined from the luminance value of the above input image data in the internal area of the search model can be used. By this method, the function value can be determined using simple and quick calculation.
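A minimal sketch of such a function value is given below, assuming the simplest reading: the value is the plain sum of the raw luminance values inside an axis-aligned rectangular search model. The name `internal_sum`, the rectangular model, and the sample image are hypothetical, and the text leaves open what "values determined from the luminance value" are exactly.

```python
def internal_sum(image, top, left, height, width):
    """Sum the luminance values inside a height x width rectangle whose
    top-left pixel is at (row=top, column=left)."""
    return sum(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


# Hypothetical raw image: background 10, a 2x2 object of luminance 200.
image = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 200, 200, 10, 10],
    [10, 10, 10, 10, 10],
]

print(internal_sum(image, top=1, left=1, height=2, width=2))  # → 800 (on the object)
print(internal_sum(image, top=0, left=0, height=2, width=2))  # → 230 (partial overlap)
print(internal_sum(image, top=2, left=3, height=2, width=2))  # → 40 (background only)
```

The value grows as the model overlaps more of the object, which is what lets it serve as a measure of correlation in this sketch.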
It is also preferable to assign constraints for the evolution of the position and orientation information of the above search models, in order to limit the search range of the object included in the above input image data. By this method, a local search can be very efficiently executed, so that the recognition processing time for the object can be decreased considerably.
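One possible form of such a constraint is sketched below: after each evolution step, the position and orientation of a search model are clamped to a window around a reference estimate (for example, the previously recognized pose). The function names, the tuple layout `(x, y, theta)`, and the window parameters are assumptions; angle wrap-around is ignored for brevity.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def constrain(ind, center, max_shift, max_turn):
    """Keep an individual within max_shift pixels and max_turn radians
    of the reference pose `center`."""
    x, y, theta = ind
    cx, cy, ctheta = center
    return (
        clamp(x, cx - max_shift, cx + max_shift),
        clamp(y, cy - max_shift, cy + max_shift),
        clamp(theta, ctheta - max_turn, ctheta + max_turn),
    )


# A mutated individual that drifted far from the reference pose (20, 12, 0.0).
print(constrain((50, 10, 1.0), center=(20, 12, 0.0), max_shift=5, max_turn=0.2))
# → (25, 10, 0.2)
```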
It is also preferable to regard the search model having the highest function value at a desired control timing as the optimum solution without waiting for the evolution to converge, since the object can then be recognized at ultra-high speed and a moving object can be tracked.
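The idea of taking the best model at a control timing, rather than waiting for convergence, can be sketched as follows. The control timing is modeled here as a fixed generation budget, which is an assumption; a wall-clock deadline could be used instead. The names `evolve_until`, `score` and `step` are hypothetical, and the one-dimensional toy problem stands in for the image search.

```python
def evolve_until(population, score, step, budget):
    """Run the evolution step at most `budget` times and return the
    individual with the highest score seen so far, converged or not."""
    best = max(population, key=score)
    for _ in range(budget):
        population = step(population)
        candidate = max(population, key=score)
        if score(candidate) > score(best):
            best = candidate
    return best


# Toy stand-in: individuals are numbers, the optimum is at 7, and each
# "evolution" step simply moves every individual down by 1.
def toy_score(x):
    return -(x - 7) ** 2


def toy_step(population):
    return [x - 1 for x in population]


print(evolve_until([1, 4, 9], toy_score, toy_step, budget=1))  # → 8
print(evolve_until([1, 4, 9], toy_score, toy_step, budget=2))  # → 7
```

A smaller budget returns a usable but less accurate answer sooner, which is the trade-off that lets recognition keep pace with a moving object.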