1. Field of the Invention
The present invention relates to a technique for identifying a target object with high accuracy.
2. Description of the Related Art
As one identification method, extensive study has been made of having a computer learn feature amounts extracted from images of a target object obtained from an image capturing device, and identifying the type of an object which appears in an input image. Studies have also been made of estimating the position and orientation of an object simultaneously with its type, using model information or the like of the object. As an application of that technique, position/orientation identification (recognition) of parts, which allows a robot to execute operations such as advanced assembly, is known.
Non-patent literature 1 (B. Leibe, “Robust Object Detection with Interleaved Categorization and Segmentation”, IJCV Special Issue on Learning for Vision for learning, August 2007.) has proposed a method (implicit shape model) of estimating the central position of an object by probabilistic voting, in which features code-booked from learning images are associated with detected features. With this method, not only the type but also the position of an object can be estimated.
In patent literature 1 (Japanese Patent Laid-Open No. 2008-257649), feature points are extracted from an input image to calculate their feature amounts, and feature points whose feature amounts are similar to those in learning images are set as corresponding points. Then, for each corresponding point in the input image, a vote is cast for a reference point based on the feature amount (including position information) of the matching feature point in the learning images, whereby a target object is identified and its position is estimated.
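The voting step described above can be illustrated with a minimal sketch. It is not the implementation of patent literature 1 or non-patent literature 1. It assumes that each corresponding point has already been paired with the offset, learned from the learning images, from the matched feature point to the reference point. The function name `vote_reference_point`, the `cell` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def vote_reference_point(matches, cell=4):
    """Accumulate votes for a reference point on a coarse grid.

    matches: iterable of ((x, y), (dx, dy)) pairs, where (x, y) is a
    corresponding point in the input image and (dx, dy) is the offset to
    the reference point stored with the matched learning feature
    (assumed to be given).
    Returns the center of the winning grid cell and its vote count.
    """
    votes = Counter()
    for (x, y), (dx, dy) in matches:
        # Each corresponding point votes for where it believes the
        # reference point lies; nearby votes fall into the same cell.
        cx = int((x + dx) // cell)
        cy = int((y + dy) // cell)
        votes[(cx, cy)] += 1
    if not votes:
        return None, 0
    (cx, cy), count = votes.most_common(1)[0]
    return ((cx + 0.5) * cell, (cy + 0.5) * cell), count


# Hypothetical data: three consistent matches and one outlier.
matches = [
    ((10, 10), (20, 15)),
    ((50, 40), (-19, -14)),
    ((30, 60), (1, -34)),
    ((5, 5), (80, 80)),
]
center, count = vote_reference_point(matches)
```

In this sketch the three consistent matches vote into the same cell, and the outlier is outvoted. The cell size trades localization precision against tolerance to small offset errors.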
Also studied are techniques for speeding up processing and enhancing accuracy in a system in which information about the state of target objects, such as a pile of parts, is acquired using a sensor such as a camera, the positions and orientations of the respective target objects are estimated from the acquired information such as an image, and the target objects are sequentially gripped and picked up by a robot.
Patent literature 2 (Japanese Patent No. 4238256) has proposed a method of generating a virtual pile and simulating robot operations based on a virtual captured image of that pile. A pile state is assumed by randomly generating orientations of a plurality of target objects using model data such as CAD data of the target objects, and operations in which a robot handles a target object are simulated.
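As an illustration of randomly generating orientations for a plurality of target objects, the following minimal sketch draws uniformly distributed rotations as unit quaternions. It does not reproduce the method of patent literature 2, which is not detailed here. The uniform distribution, the quaternion representation, and the names `random_unit_quaternion` and `generate_pile_orientations` are assumptions made for this example.

```python
import math
import random


def random_unit_quaternion(rng):
    """Draw a rotation uniformly at random as a unit quaternion (x, y, z, w).

    Uses the standard three-uniform-variates construction (Shoemake).
    """
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a = math.sqrt(1.0 - u1)
    b = math.sqrt(u1)
    return (
        a * math.sin(2.0 * math.pi * u2),
        a * math.cos(2.0 * math.pi * u2),
        b * math.sin(2.0 * math.pi * u3),
        b * math.cos(2.0 * math.pi * u3),
    )


def generate_pile_orientations(num_objects, seed=None):
    """Return one random orientation per target object in the virtual pile."""
    rng = random.Random(seed)
    return [random_unit_quaternion(rng) for _ in range(num_objects)]


# Hypothetical usage: orientations for a virtual pile of five parts.
orientations = generate_pile_orientations(5, seed=0)
```

A fixed seed makes the generated pile reproducible. A fuller simulation would also need object positions and contact handling, which this sketch omits.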
In patent literature 3 (Japanese Patent No. 3300092), values that can be assumed by the parameters defining the position and orientation of a target object are stochastically predicted. According to the prediction result, either the region (ROI) of the screen in which features defining the target object exist, or the region of the parameter space in which those features exist, is limited.
Patent literature 4 (Japanese Patent Laid-Open No. 2007-245283) shortens processing time by restricting, when the orientation of a workpiece is estimated, the candidate orientations to a plurality of stable orientations. Patent literature 5 (Japanese Patent Laid-Open No. 2010-186219) shortens processing time by calculating degrees of stability for respective orientations of a workpiece, and inhibiting use of templates which express orientations with low degrees of stability.
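The idea of inhibiting templates for orientations with low degrees of stability can be sketched as a simple filter. This is only an illustration under stated assumptions, not the procedure of patent literature 5. The degrees of stability are taken as given numbers, and the threshold, the template names, and the function `select_stable_templates` are hypothetical.

```python
def select_stable_templates(templates, min_stability):
    """Keep only templates whose orientation is stable enough to be matched.

    templates: list of (template_name, degree_of_stability) pairs, with the
    degree of stability assumed to be precomputed.
    """
    return [name for name, stability in templates if stability >= min_stability]


# Hypothetical templates and stability values for one workpiece.
templates = [
    ("flat_on_base", 0.62),
    ("on_side", 0.30),
    ("on_edge", 0.05),
    ("upside_down", 0.03),
]
kept = select_stable_templates(templates, min_stability=0.10)
```

Matching against the two retained templates instead of all four is what shortens processing time in this sketch.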
When the positions and orientations of respective target objects, such as those in a pile of parts, are to be estimated, the positions and orientations vary, so feature amounts obtained from images of the target objects viewed from various viewpoints have to be learned. However, images of the target objects obtained from certain viewpoints are often difficult to identify, and it is difficult to raise the identification accuracy for images obtained from all viewpoints.
Identification accuracy for the respective positions and orientations differs depending on the target object because a feature portion helpful for identifying a target object does not always appear in the image when that target object is captured from all viewpoints. For this reason, when the positions and orientations of target objects such as a pile of parts in a factory or the like are to be estimated, the position and orientation of a camera must be determined so as to improve the identification accuracy.
In patent literature 2, CG models of target objects are virtually piled up so as to simulate teaching of robot operations. However, patent literature 2 does not describe any way of improving identification accuracy.
In patent literature 3, the position and orientation of a target object are stochastically predicted. However, patent literature 3 does not describe determining the position and orientation of a camera so as to improve identification accuracy.
In patent literature 4, the orientations to be estimated for a target object are limited to those around stable orientations. However, patent literature 4 does not describe determining the position and orientation of a camera so as to improve identification accuracy.
In patent literature 5, templates are generated using degrees of stability of orientations, but the degrees of stability are not used to obtain an accurate estimation result of an orientation. That is, patent literature 5 does not consider using degrees of stability to reduce orientation estimation errors.