1. Field of the Invention
The present invention relates to a recognition apparatus and method thereof, and a computer program and, more particularly, to a technique for recognizing a target object from an image captured by an image capturing apparatus in an environment in which illumination variations often occur between the times of learning and recognition.
2. Description of the Related Art
In recent years, demand has been increasing for robots that execute, for example, assembly jobs in factories. In such jobs, when robots handle target objects whose positions and orientations are not always constant, a visual sensor is generally used as a mechanism for measuring the positions and orientations of the target objects.
In order to cause the robots to perform, for example, more advanced assembly jobs, the components to be assembled must be recognized by the visual sensor. Conventionally, studies have been made to recognize the types, positions, and orientations of components by collating shape information such as CAD data of the components with two- or three-dimensional information obtained by, for example, the visual sensor. As one of these recognition methods, an approach has been studied extensively in which a computer learns feature amounts extracted from an image of a target object obtained by an image capturing apparatus, and recognizes the type of an object included in an input image.
However, in an object recognition technique using an image, the recognition ratio drops when illumination variations occur between the times of learning and recognition. Such variations arise from observed specular reflection and gloss, from changes in illumination direction and intensity on objects, or from changes in the positional relationships among the illumination, the image capturing apparatus, and the object.
Hence, studies have been made on recognition that copes with illumination variations between the times of learning and recognition. As a method that is robust against illumination variations, a method is known which uses, in learning and recognition, image features such as edges that are less affected by illumination variations. Alternatively, a method of obtaining a three-dimensional structure using, for example, a rangefinder and simulating variations at the time of recognition, and a method using learning data that include various illumination variations, are known.
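Why edge-like features tend to be less sensitive to illumination can be illustrated with a minimal sketch. It assumes, purely for illustration, that an illumination change acts on a row of pixel values as a positive gain plus an offset. The function name `normalized_edge_profile` is hypothetical and is not taken from any of the known methods mentioned above. Real illumination changes such as specular reflection or shadows do not follow this simple model, so the sketch covers only the simplest case.

```python
def normalized_edge_profile(row):
    """Return neighbor differences of a pixel row, scaled by the strongest edge.

    Assumption: illumination changes are modeled as gain * value + offset
    with gain > 0. The offset cancels in the differences and the gain
    cancels in the scaling, so the profile is unchanged under that model.
    """
    # Differences between horizontally adjacent pixels (a crude edge response).
    diffs = [b - a for a, b in zip(row, row[1:])]
    # Magnitude of the strongest edge; 0 when the row is flat or too short.
    peak = max((abs(d) for d in diffs), default=0)
    if peak == 0:
        return [0.0 for _ in diffs]
    return [d / peak for d in diffs]


# A row of pixel values and the same row under a brighter illumination
# (hypothetical gain of 2 and offset of 30).
row = [10, 10, 50, 50, 20]
brighter = [2 * p + 30 for p in row]

# Raw pixel values differ, but the normalized edge profiles coincide.
profiles_match = normalized_edge_profile(row) == normalized_edge_profile(brighter)
```

Under this assumed model the raw values of `row` and `brighter` differ at every pixel, while their normalized edge profiles are identical, which is the sense in which an edge-based feature is less affected by such a variation.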
Japanese Patent Laid-Open No. 2008-65378 describes an arrangement for recognizing an image even when the illumination condition at the time of learning is different from that at the time of recognition. In the arrangement of Japanese Patent Laid-Open No. 2008-65378, the captured images of a target object to be recognized, which have been successfully recognized, are stored as registered images. During actual recognition, when recognition of a captured image has failed, one feature point of a target object to be recognized in a region that expresses the target object to be recognized in the captured image is detected. Then, a mapping function that represents a relationship between the pixel value of the detected feature point and those of feature points of registered images at the same position as the detected feature point is calculated, and pixel values of the captured image are corrected using the mapping function, thereby correcting illumination variations.
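The correction idea described above can be sketched as follows. This is only an illustrative sketch, not the method of the cited publication: the form of the mapping function is not given here, so the sketch assumes a pure multiplicative gain that maps the pixel value at the detected feature point onto the mean of the pixel values at the same position in the registered images. The names `fit_gain` and `correct_image` and the 8-bit value range are likewise assumptions.

```python
def fit_gain(captured_value, registered_values):
    """Estimate a gain mapping the captured feature-point value onto the
    mean of the registered values at the same position.

    Assumption: the mapping function is a single multiplicative gain.
    """
    if captured_value <= 0:
        raise ValueError("captured feature-point value must be positive")
    if not registered_values:
        raise ValueError("at least one registered value is required")
    target = sum(registered_values) / len(registered_values)
    return target / captured_value


def correct_image(image, gain, max_value=255):
    """Apply the gain to every pixel and clamp to the range [0, max_value]."""
    return [
        [min(max_value, max(0, round(gain * pixel))) for pixel in row]
        for row in image
    ]


# Hypothetical example: the feature point reads 50 in the captured image
# but 100 in each of three registered images, so the estimated gain is 2.0.
gain = fit_gain(50, [100, 100, 100])
corrected = correct_image([[10, 50], [200, 0]], gain)
```

The sketch makes the dependency explicit: the whole correction rests on the pixel value at one corresponding position, so an error in locating that position changes the gain applied to every pixel of the captured image.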
In a recognition technique that learns an identification function by extracting a certain feature amount from an image and mapping that feature amount onto a feature space composed of feature amount vectors, it is difficult to identify a target object with high precision when illumination variations occur between the times of learning and recognition.
In the arrangement of Japanese Patent Laid-Open No. 2008-65378, when recognition has failed, one feature point is detected, and a correction is executed based on a relationship between the pixel value of that feature point and those of feature points obtained at the time of learning. In order to calculate the mapping function used in the correction, feature point correspondences have to be obtained. Japanese Patent Laid-Open No. 2008-65378 describes that feature point correspondences are obtained by extracting eyes, eyebrows, and flesh color regions of human faces, or by using markers. This method is effective when specific portions and feature points of target objects can be detected, and when corresponding points are uniquely determined based on markers. However, this method requires finding the corresponding points used in the correction, and cannot attain precise correction when the correspondence becomes indefinite or when correspondence errors occur due to illumination variations.