Field of the Invention
The present disclosure generally relates to information processing and, more particularly, to an information processing apparatus, an information processing method, and a storage medium, and to a technique for measuring the position and orientation of an object whose three-dimensional shape is known.
Description of the Related Art
With the development of robotics in recent years, robots are now performing complicated tasks that have conventionally been performed by a human, such as assembly of industrial products. The robots use an end effector such as a hand to hold and assemble parts. The assembly necessitates measuring relative positions and orientations between the parts to be held and the robot (hand).
The position and orientation can be measured by a method using model fitting, in which a three-dimensional model of an object is fitted to features detected from a two-dimensional image or to a range image. When the model fitting is performed on a two-dimensional image, the position and orientation of the object is estimated so that the image obtained by projecting the three-dimensional model of the object onto the two-dimensional image, based on that position and orientation, fits the detected features. When the model fitting is performed on a range image, the points in the range image are converted into a three-dimensional point group having three-dimensional coordinates, and the position and orientation of the object is then estimated so that the three-dimensional model of the object fits the three-dimensional point group in a three-dimensional space.
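The two geometric operations mentioned above, projecting a model point onto the image and converting range-image pixels into a three-dimensional point group, can be illustrated with a minimal sketch. It assumes an ideal pinhole camera without lens distortion and a range image that stores depth along the optical axis; the disclosure does not specify a camera model, and the function names and the intrinsic parameters fx, fy, cx, cy are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def project_point(p: Point3, fx: float, fy: float,
                  cx: float, cy: float) -> Point2:
    """Project a 3D point in camera coordinates to pixel coordinates.

    Assumes an ideal pinhole camera (no distortion) and z > 0.
    """
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def back_project(u: float, v: float, depth: float, fx: float, fy: float,
                 cx: float, cy: float) -> Point3:
    """Convert one range-image pixel (u, v, depth) to a 3D point.

    Assumes depth is measured along the optical axis (the z coordinate).
    """
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def range_image_to_points(range_image: Sequence[Sequence[float]],
                          fx: float, fy: float,
                          cx: float, cy: float) -> List[Point3]:
    """Convert a range image (rows of depth values) to a 3D point group.

    Pixels with non-positive depth are treated as missing measurements
    and skipped; this convention is an assumption of the sketch.
    """
    points: List[Point3] = []
    for v, row in enumerate(range_image):
        for u, depth in enumerate(row):
            if depth > 0:
                points.append(back_project(u, v, depth, fx, fy, cx, cy))
    return points
```

Under these assumptions the two functions are inverses of each other: projecting a back-projected pixel returns the original pixel coordinates. Model fitting on a two-dimensional image compares projected model points with detected features, whereas model fitting on a range image compares the model with the back-projected point group.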
However, the position of a feature detected in the two-dimensional image and the three-dimensional coordinates of the point group contain errors due to pixel quantization, blur, the limited accuracy of the feature detection algorithm, correspondence between cameras, and the like.
To overcome such an issue, efforts have been made to improve the accuracy of position and orientation measurement, for example, by averaging out the effect of the measurement errors contained in a plurality of pieces of measurement information (features on an image or a point group).
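As a brief illustration of why averaging helps, consider N pieces of measurement information whose errors e_1, ..., e_N are assumed to be independent, zero-mean, and of common variance σ². These symbols and assumptions are introduced here for illustration only and are not part of the disclosure.

```latex
\bar{e} = \frac{1}{N} \sum_{i=1}^{N} e_i ,
\qquad
\operatorname{Var}(\bar{e}) = \frac{\sigma^{2}}{N}
```

Under these assumptions, the standard deviation of the averaged error decreases in proportion to 1/√N, so using more measurement information reduces the effect of the individual errors. If the errors are correlated or biased, the reduction is smaller.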
As a method for measuring the position and orientation with high accuracy, there is a method in which the position and orientation is estimated using gradients of an intensity image and a range image without explicit feature detection (Hiura, Yamaguchi, Sato, Inokuchi, “Real-Time Tracking of Free-Form Objects by Range and Intensity Image Fusion”, Denshi Joho Tsushin Gakkai Ronbunshi, D-II, vol. J80-DII, no. 11, pp. 2904-2911, 1997). In this method, based on the assumption that the brightness change and the range change are smooth when an object moves, an orientation parameter of the object is calculated by a gradient method from the brightness change of the intensity image and the range change of the range image. However, since the intensity image, which is a two-dimensional image, and the range image, which is a three-dimensional image, differ in dimension, it has been difficult to combine the two images effectively. Thus, manual tuning has been required.
According to an exemplary embodiment of the present disclosure, the position and orientation of a target object is estimated using measurement information acquired from a two-dimensional image in combination with measurement information acquired from range data so that the position and orientation of the target object can be measured with high accuracy.