1. Field of the Invention
The present invention relates to image processing for generating a training image, which is used to generate a dictionary for image recognition processing that detects an object in an input image.
2. Description of the Related Art
Various kinds of research and development have been carried out on image recognition, which detects the image of a target object in an image obtained by capturing objects. Image recognition techniques are applied in various fields and used for many practical problems such as face recognition and part recognition in a factory.
Image recognition can be viewed as a pattern recognition problem. In pattern recognition as well, research has been conducted on classifiers, that is, on how to classify input information. Various methods have been proposed, such as neural networks, support vector machines (SVMs), and randomized trees (RTs).
These methods require a dictionary for image recognition to be generated, and generating the dictionary requires training images. In image recognition for recent industrial robots, there is also a need to recognize an object whose three-dimensional orientation has a high degree of freedom, as in part picking, where a desired part is detected from a pile of parts of a plurality of kinds. Detecting a three-dimensional orientation requires training images corresponding to various orientations of the object.
In image recognition aimed at part picking by a robot and the like, orientation information of an object is very important. The orientation corresponding to a training image is expressed by parameters such as Euler angles or a quaternion. It is, however, difficult to prepare a photographed image of an object in exactly such a specified orientation as a training image. In general, therefore, a computer graphics (CG) image in an arbitrary orientation is generated using computer-aided design (CAD) data and used as a training image.
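The two orientation parameterizations mentioned above can be related by a short conversion. The following is a minimal sketch, assuming an intrinsic Z-Y-X (yaw, pitch, roll) Euler convention and a (w, x, y, z) quaternion ordering; the function name and the convention are illustrative assumptions and are not specified in this description.

```python
import math

def euler_zyx_to_quaternion(yaw, pitch, roll):
    """Convert Z-Y-X Euler angles (radians) to a unit quaternion (w, x, y, z).

    Assumed convention: rotate about Z by yaw, then Y by pitch, then X by roll.
    """
    cy, sy = math.cos(yaw / 2.0), math.sin(yaw / 2.0)
    cp, sp = math.cos(pitch / 2.0), math.sin(pitch / 2.0)
    cr, sr = math.cos(roll / 2.0), math.sin(roll / 2.0)

    w = cr * cp * cy + sr * sp * sy
    x = sr * cp * cy - cr * sp * sy
    y = cr * sp * cy + sr * cp * sy
    z = cr * cp * sy - sr * sp * cy
    return (w, x, y, z)

# Example: a 90-degree rotation about the Z axis only.
q = euler_zyx_to_quaternion(math.pi / 2.0, 0.0, 0.0)
```

Either representation could label a rendered training image; a quaternion avoids the gimbal-lock singularity that Euler angles have at a pitch of plus or minus 90 degrees.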
The method of generating a training image from CAD data generally treats the joints of the polygons in the CAD data as edges and generates a binary edge image. In object detection processing, edge extraction is performed on a photographed image of the parts, and edge-based matching is executed to identify the position and orientation of an object. In this method, the result of edge extraction on the photographed image greatly influences object detection performance. In general, the result of edge extraction varies greatly depending on the material of the object, the influence of ambient light, and the like, and requires very cumbersome adjustment by an operator.
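The sensitivity of edge extraction to its settings can be illustrated with a small example. The following is a minimal sketch, assuming a grayscale image stored as a list of rows and a simple central-difference gradient with a fixed threshold; the function name, the gradient operator, and the threshold values are illustrative assumptions, not the edge extractor of any particular system.

```python
def binary_edge_image(gray, threshold):
    """Return a binary edge image (1 = edge) from a grayscale image.

    gray is a list of rows of numeric luminance values. A pixel is marked
    as an edge when its central-difference gradient magnitude reaches
    the threshold. Border pixels are left as 0.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges

# A 5x5 image with a vertical step from luminance 0 to luminance 100.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
found = binary_edge_image(step, threshold=50)    # the step is detected
missed = binary_edge_image(step, threshold=150)  # the same step is lost
```

The same physical edge is detected or lost depending on the threshold, which is why a change in material or ambient light that alters image contrast tends to require readjustment.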
In contrast, a method of generating a training image close to a photographed image by rendering is also used. This method requires estimating the luminance value of each surface of an object. If the bidirectional reflectance distribution function (BRDF) of the object and the state of ambient light are known, luminance values estimated from them can be assigned to the object surfaces to generate a CG image. However, accurately determining the BRDF of an object requires measurement with special equipment. In addition, work is required to accurately acquire the ambient light conditions of the actual environment as numerical values.
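As an illustration of estimating a surface luminance value from a known BRDF and a known light condition, the following is a minimal sketch that assumes the simplest case: an ideal diffuse (Lambertian) BRDF, which is the constant albedo / pi, and a single directional light. Real objects generally have more complex BRDFs; the function names and parameters here are hypothetical.

```python
import math

def _normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambertian_luminance(normal, light_dir, albedo, irradiance):
    """Estimate outgoing radiance of a surface under one directional light.

    normal      surface normal (any length)
    light_dir   direction from the surface toward the light (any length)
    albedo      diffuse reflectance in [0, 1]
    irradiance  light irradiance measured perpendicular to light_dir
    """
    n = _normalize(normal)
    l = _normalize(light_dir)
    # Surfaces facing away from the light receive no direct illumination.
    cos_theta = max(0.0, sum(a * b for a, b in zip(n, l)))
    brdf = albedo / math.pi  # Lambertian BRDF is constant over all directions
    return brdf * irradiance * cos_theta

# A surface facing the light directly, and one tilted 60 degrees away from it.
facing = lambertian_luminance((0, 0, 1), (0, 0, 1), albedo=0.5, irradiance=math.pi)
tilted = lambertian_luminance((0, 0, 1), (0, math.sin(math.pi / 3), 0.5),
                              albedo=0.5, irradiance=math.pi)
```

Even in this simplified model, both the albedo and the irradiance must be known as numerical values, which reflects the measurement burden described above.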
There is also a method of generating a training image by environment mapping, in which a sphere is placed in the environment. For example, to generate a training image of a mirror object, an image of the ambient environment (environment map) is texture-mapped onto a mirror sphere placed in the environment, thereby generating an image. However, for an object made of plastic or the like, the reflection characteristic varies depending on the mold or the surface treatment even if the material is the same. It is therefore difficult to prepare a sphere having the same reflection characteristic as the object.
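For a mirror surface, environment mapping amounts to reflecting the viewing direction about the surface normal and looking up the environment map in the reflected direction. The following is a minimal sketch, assuming a latitude-longitude (equirectangular) environment map with +Y as up and +Z at the horizontal center; the function names and the coordinate convention are illustrative assumptions.

```python
import math

def reflect(view_dir, normal):
    """Mirror-reflect an incoming view direction about a unit surface normal."""
    d_dot_n = sum(d * n for d, n in zip(view_dir, normal))
    return tuple(d - 2.0 * d_dot_n * n for d, n in zip(view_dir, normal))

def direction_to_latlong_uv(direction):
    """Map a 3D direction to (u, v) in [0, 1] on a latitude-longitude map.

    Assumed convention: +Y is up (v = 0 at the top), +Z maps to u = 0.5.
    """
    x, y, z = direction
    length = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / length, y / length, z / length
    u = 0.5 + math.atan2(x, z) / (2.0 * math.pi)
    v = math.acos(max(-1.0, min(1.0, y))) / math.pi
    return u, v

# A camera looking along -Z at a surface whose normal points along +Z
# sees the environment located in the +Z direction.
r = reflect((0.0, 0.0, -1.0), (0.0, 0.0, 1.0))
u, v = direction_to_latlong_uv(r)
```

This lookup models only ideal specular reflection. A plastic surface with a glossy or diffuse component would need the environment map to be filtered according to its own reflection characteristic, which is the difficulty noted above.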