A technique for use in production lines of factories or the like has been developed in recent years. The technique identifies an individual object from within a pile of objects using a vision system, measures a three-dimensional position and orientation of the identified object, and allows a hand attached to a robot to grasp the object.
As a technique for measuring a three-dimensional position and orientation of an object, there is a model fitting method that detects an approximate position and orientation of an individual object from a captured image of target objects, and fits a three-dimensional shape model of the object to image data by using the detected position and orientation as an initial value. With this method, if an appropriate position and orientation cannot be determined by the detection in the first stage, a correct position and orientation cannot be determined by the model fitting in the second stage. For example, consider an object whose front and back sides are similar in shape, and assume that, while the front side is being observed, the detection in the first stage erroneously returns the position and orientation corresponding to the back side. In this case, model fitting using the detected position and orientation as an initial value converges to an incorrect solution, and a correct position and orientation cannot be calculated.
PTL 1 discloses a method which performs fitting from two orientations symmetrically placed and confusable with each other, and compares the results of the fitting to reduce erroneous recognition. Specifically, fitting is performed between a three-dimensional shape model and an image, and an axis which greatly contributes to orientation convergence (facilitates orientation convergence) in the fitting process is determined. Then, for a position and orientation calculated by the fitting, a position and orientation obtained by rotating the target object such that the orientation of the determined axis is reversed is generated as a new initial value to perform another fitting. The fitting results are compared, and the best result is selected to reduce erroneous recognition.
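The candidate generation and comparison described above can be illustrated by the following minimal sketch. It is not the implementation of PTL 1. It assumes that an orientation is held as a 3x3 rotation matrix (a list of rows), that the determined axis is a unit vector in the object coordinate system, and that the axis is reversed by a 180-degree rotation about an arbitrarily chosen perpendicular axis. The names `reversed_axis_candidate`, `select_best`, and the `fit` callable are hypothetical and introduced only for illustration.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def half_turn(axis):
    """Rotation by 180 degrees about a unit axis a: 2*a*a^T - I."""
    return [[2.0 * axis[i] * axis[j] - (1.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def perpendicular_unit(axis):
    """Return some unit vector perpendicular to `axis` (the choice is arbitrary)."""
    # Cross with the coordinate axis that is least aligned with `axis`.
    idx = min(range(3), key=lambda i: abs(axis[i]))
    e = [0.0, 0.0, 0.0]
    e[idx] = 1.0
    c = cross(axis, e)
    norm = math.sqrt(sum(x * x for x in c))
    return [x / norm for x in c]

def reversed_axis_candidate(rotation, axis):
    """New initial orientation in which the direction of `axis` is reversed."""
    return mat_mul(rotation, half_turn(perpendicular_unit(axis)))

def select_best(candidates, fit):
    """Run fitting from every initial value and keep the lowest residual.

    `fit` is a hypothetical callable: initial pose -> (refined pose, residual).
    """
    results = [fit(candidate) for candidate in candidates]
    return min(results, key=lambda result: result[1])
```

In this sketch the first fitting result and the output of `reversed_axis_candidate` would both be passed to `select_best`. Using the residual as the comparison criterion is an assumption; any evaluation value of the fitting result could be substituted.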
However, for an object whose two confusable positions and orientations are obtained by 180-degree rotation about an axis that facilitates orientation convergence (e.g., an object whose front and back sides are similar in shape, such as that illustrated in FIG. 2), a correct position and orientation cannot be determined by the method described in PTL 1.
When the method of PTL 1 is applied to this object, orientations obtained by reversing the axis in the longitudinal direction are generated as candidates. However, if the detection in the first stage returns the position and orientation of the back side while the front side of the target object is being observed, a correct position and orientation cannot be calculated by fitting, even when a new initial value is generated by the method of PTL 1.
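The geometric reason can be checked with a small sketch under the same assumptions as the discussion above: orientations are 3x3 rotation matrices, the longitudinal axis is taken (hypothetically) as the x axis of the object coordinate system, and the front/back confusion is modeled as a 180-degree rotation about that axis. A rotation about the axis leaves the axis direction unchanged, whereas every candidate generated by reversing the axis points it the opposite way, so no such candidate can coincide with the correct orientation.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def half_turn(axis):
    """Rotation by 180 degrees about a unit axis a: 2*a*a^T - I."""
    return [[2.0 * axis[i] * axis[j] - (1.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def axis_alignment(pose_a, pose_b, axis):
    """Dot product of the directions in which two poses point `axis`."""
    da = mat_vec(pose_a, axis)
    db = mat_vec(pose_b, axis)
    return sum(x * y for x, y in zip(da, db))

# Assumed setup: the true (front) orientation is the identity, and the
# longitudinal axis is the object's x axis.
front = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
axis = [1.0, 0.0, 0.0]

# Erroneous first-stage result: front and back swapped by a 180-degree
# rotation about the longitudinal axis itself.
back = mat_mul(front, half_turn(axis))

# Candidate generated from the erroneous result by reversing the axis
# (180-degree rotation about a perpendicular axis, here the z axis).
candidate = mat_mul(back, half_turn([0.0, 0.0, 1.0]))

same_direction = axis_alignment(front, back, axis)       # axis unchanged
opposite_direction = axis_alignment(front, candidate, axis)  # axis reversed
```

Here `same_direction` is positive and `opposite_direction` is negative, which shows that the erroneous result already shares the axis direction with the correct orientation, while the newly generated initial value moves away from it.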
Also, when an object has orientations that are confusable with each other and each of those orientations converges in substantially the same manner in the fitting process, it is difficult to determine such an axis for the object, and it is therefore difficult to apply the method of PTL 1 to the object.