In recent years, with the development of robot technology, robots have begun to perform complicated tasks that were until now performed by humans, such as the assembly of manufactured products. Such a robot grasps a component with an end effector such as a hand and performs the assembly. For the robot to grasp the component, the relative position and orientation between the component to be grasped and the robot (hand) must be estimated accurately. Such estimation of position and orientation is used not only for a robot to grasp a component, but also for various other purposes, such as a robot estimating its own position in order to move autonomously and the registration of a real space with a virtual object in augmented reality.
Methods for estimating the position and orientation include those using a two-dimensional image captured by a camera and those using a range image acquired from a distance sensor. Among these, estimation by model fitting is generally used, in which a three-dimensional geometric model of an object is fitted to image features extracted from a captured image or to point cloud data acquired from a range image. For example, there is a method that estimates the position and orientation of an object such that a projected image of a wire frame model of the object fits the edges detected in a gray-scale image. There is also a method that estimates the position and orientation of an object by fitting a three-dimensional geometric model such as a mesh model to point cloud data acquired from a range image.
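As an illustration of the point cloud case, the following is a minimal sketch of a single rigid fitting step. It assumes that each model point has already been paired with a measured point; practical systems usually have to search for these correspondences iteratively, for example with an ICP-style loop, and that search is omitted here. The function name fit_rigid_transform and the use of the SVD-based Kabsch solution are choices made for this sketch and are not taken from the methods cited in this document.

```python
import numpy as np


def fit_rigid_transform(model_pts, measured_pts):
    """Least-squares rotation R and translation t with R @ model + t ~= measured.

    Assumption for this sketch: row i of model_pts corresponds to row i of
    measured_pts. Both arrays have shape (N, 3).
    """
    model_pts = np.asarray(model_pts, dtype=float)
    measured_pts = np.asarray(measured_pts, dtype=float)

    # Centre both point sets so that only the rotation remains to be solved.
    model_centroid = model_pts.mean(axis=0)
    measured_centroid = measured_pts.mean(axis=0)

    # 3x3 cross-covariance between the centred model and measured points.
    H = (model_pts - model_centroid).T @ (measured_pts - measured_centroid)

    # Kabsch solution: R = V * diag(1, 1, d) * U^T, where d guards against
    # returning a reflection instead of a proper rotation.
    U, _, Vt = np.linalg.svd(H)
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # The translation maps the rotated model centroid onto the measured one.
    t = measured_centroid - R @ model_centroid
    return R, t
```

When the measured component deviates in shape from the model, the residuals after this fit do not vanish, and the estimated R and t are biased by the mismatch.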
In general, the shape and size of mass-produced industrial components vary because of limits on machining accuracy and cost. It is unrealistic to produce a three-dimensional geometric model for each individual component, so the position and orientation are generally estimated by using a single three-dimensional geometric model that represents the standard shape of the component. In other words, the three-dimensional geometric model does not always agree in shape with the actual component. When the position and orientation of the component are estimated by model fitting, an accurate estimate therefore cannot be obtained if the difference between the model and the actual component is large.
Patent Literature 1 discusses a method for absorbing variation in the shape of an object when recognizing the position of the object by using a model. In this method, the position of the object is recognized based on a standard model representing the object and an image (measurement data), and the measurement data is statistically processed to sequentially update the standard model.
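The idea of sequentially updating a standard model from measurements can be sketched as follows. The statistical processing actually used in Patent Literature 1 is not described here, so the incremental mean below is purely an assumption made for illustration, as are the class name StandardModel and the premise that each measurement has already been aligned to the model and sampled at the same points.

```python
import numpy as np


class StandardModel:
    """Hypothetical standard model kept as a running mean of aligned point sets."""

    def __init__(self, points):
        # The initial model counts as the first sample.
        self.mean = np.asarray(points, dtype=float).copy()
        self.count = 1

    def update(self, aligned_points):
        """Fold one aligned measurement into the model (incremental mean).

        Assumption: aligned_points has the same shape as the model and its
        rows correspond to the same model points.
        """
        x = np.asarray(aligned_points, dtype=float)
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self.mean
```

A model updated in this way drifts toward the average of the observed components, which is consistent with its being useful for recognition rather than for fitting any one individual precisely.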
Non Patent Literature 1 discusses a method for absorbing the difference between an individual face and a three-dimensional model of a face when estimating the position and orientation of the face. In this method, the deviation of each feature point for the individual is obtained from the distribution of previously acquired deviations and from actual measurement data, and model fitting is performed on the feature points to which the deviations have been applied. Because a model is produced for each individual, the position and orientation can be estimated independently of variation in shape.
In Patent Literature 1, variation in the shape of the object is absorbed in order to improve the recognition rate of the object. Because the method updates the standard model so as to absorb the variation among actual objects, it is suited to recognizing the object and roughly estimating its position and orientation, but it is not suited to estimating the position and orientation accurately.
The method discussed in Non Patent Literature 1 calculates the deviation explicitly, and the acquired deviation itself may contain errors caused by false detection of features in the measurement data. The method is therefore not suited to estimating the position and orientation accurately.