1. Field of the Invention
The present invention relates to a technique for improving the precision and stability of viewpoint position and orientation measurement.
2. Description of the Related Art
In recent years, mixed reality (MR), which aims at seamlessly merging physical and virtual spaces, has been studied extensively. An MR image is generated by rendering virtual space images, generated according to the position and orientation of an image sensing device such as a video camera, and superimposing them on a physical space image sensed by that image sensing device. An image display apparatus used in an MR system is implemented by, e.g., a video-see-through system. Note that the virtual space images include a virtual object rendered by computer graphics, text information, and the like.
To implement MR, the accuracy of registration between the physical space and the virtual space is important, and many approaches have conventionally been tried. The registration problem in MR reduces to the problem of calculating the relative position and orientation between a target object on which virtual information is to be superimposed and the image sensing device (to be referred to as the position and orientation of the image sensing device hereinafter).
As a method of solving this problem, the following approach has been attempted (see Sato and Tamura: “A Review of Registration Techniques in Mixed Reality”, Meeting on Image Recognition and Understanding (MIRU2002) Transactions I, IPSJ Symposium Series, vol. 2002, no. 11, pp. I.61-I.68, July 2002). That is, a plurality of indices whose arrangement on a target coordinate system is known are placed or set in an environment or on a target object. Then, the position and orientation of the image sensing device with respect to the target coordinate system are calculated from two kinds of information: the three-dimensional (3D) coordinates of the indices on the target coordinate system, which are known, and the coordinates of the projected images of the indices in an image sensed by the image sensing device.
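The relationship between the known 3D coordinates of the indices and their projected image coordinates can be sketched as follows. This is a minimal illustration only, not the method of the cited reference: it assumes an ideal pinhole camera with no lens distortion, a principal point at the image origin, and a camera looking along its positive Z axis. The names `project`, `reprojection_errors`, `rotation`, `translation`, and `focal_length` are hypothetical and are introduced here for illustration. A position and orientation calculation of the kind described above searches for the rotation and translation that make these errors small.

```python
import math

def project(point_3d, rotation, translation, focal_length):
    """Project a 3D point given on the target coordinate system into the image.

    Assumes an ideal pinhole model (illustrative assumption):
    camera coordinates = rotation * point + translation.
    """
    cam = [
        sum(rotation[r][c] * point_3d[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    # Perspective division by depth along the optical axis.
    return (focal_length * cam[0] / cam[2], focal_length * cam[1] / cam[2])

def reprojection_errors(indices_3d, observed_2d, rotation, translation, focal_length):
    """Distance in pixels between each observed index and its projection."""
    errors = []
    for point, (u, v) in zip(indices_3d, observed_2d):
        pu, pv = project(point, rotation, translation, focal_length)
        errors.append(math.hypot(u - pu, v - pv))
    return errors

# Hypothetical pose: no rotation, target origin 5 units in front of the camera.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
TRANSLATION = (0.0, 0.0, 5.0)
FOCAL_LENGTH = 800.0

indices_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# The third observation is displaced by (3, 4) pixels from its ideal position.
observed_2d = [(0.0, 0.0), (160.0, 0.0), (3.0, 164.0)]

errors = reprojection_errors(indices_3d, observed_2d, IDENTITY, TRANSLATION, FOCAL_LENGTH)
```

With the hypothetical values above, the first two indices project exactly onto their observations, while the third shows a non-zero error that a pose calculation would try to reduce.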
Attempts have also been made to attach an inertial sensor to an image sensing device and to use the sensor measurement value to achieve more stable registration than in a case using only image information. For example, a method that uses the position and orientation of the image sensing device estimated based on the sensor measurement value in index detection processing has been proposed. Also, a method that uses the estimation results as initial values for the position and orientation calculation based on an image has been proposed. Furthermore, a method that uses the estimation results as a rough position and orientation even in a situation in which indices are not observed has been proposed (see Japanese Patent Laid-Open No. 2005-33319, and Hirofumi Fujii, Masayuki Kanbara, Hidehiko Iwasa, Haruo Takemura, and Naokazu Yokoya, “A Registration Method Using Stereo Cameras with an Inertial Sensor for Augmented Reality”, Technical report of IEICE PRMU99-192 (Technical Report of IEICE, vol. 99, no. 574, pp. 1-8)).
The conventional registration technique using image information presupposes that all index detection results are correct. Furthermore, all index detection results are used in the calculation with uniform weights. For this reason, correct position and orientation measurement often fails because erroneously detected indices, or indices with low detection precision, exert a large influence on the result. Hence, the following technique has been proposed in recent years. That is, a statistical estimation method such as M estimation is adopted, and errors (re-projection errors) are calculated between the observation coordinates of the detected indices (feature points) on an image and the image coordinates (re-projected coordinates) of the indices estimated from the position and orientation of the image sensing device and the positions of the indices. Then, the reliabilities of the detected indices are calculated based on these errors so as to eliminate erroneously detected indices or to reduce their influence (see Sato, Kanbara, Yokoya, and Takemura, “Camera Movement Parameter Estimation from a Long Image Sequence by Tracking Markers and Natural Feature Points”, Transactions of IEICE, D-II, vol. J86-D-II, no. 10, pp. 1431-1440, 2003).
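The conversion of re-projection errors into reliabilities can be sketched as follows. The text above names only "M estimation"; the Tukey biweight function used here is one common choice and is an assumption, not necessarily the function used in the cited reference. The names `tukey_weight` and `index_reliabilities` and the cutoff value are hypothetical.

```python
def tukey_weight(error, cutoff):
    """Tukey biweight: 1 at zero error, falling smoothly to 0 at the cutoff.

    The choice of this weight function is an illustrative assumption.
    """
    if abs(error) >= cutoff:
        return 0.0  # treated as an erroneous detection and eliminated
    ratio = error / cutoff
    return (1.0 - ratio * ratio) ** 2

def index_reliabilities(reprojection_errors, cutoff):
    """Map each index's re-projection error (pixels) to a weight in [0, 1]."""
    return [tukey_weight(e, cutoff) for e in reprojection_errors]

# Hypothetical re-projection errors in pixels for three detected indices.
errors_px = [0.0, 2.0, 5.0]
weights = index_reliabilities(errors_px, cutoff=4.0)
```

In this sketch the index with zero error keeps full weight, the index with a moderate error is down-weighted, and the index whose error exceeds the cutoff receives zero weight, which corresponds to eliminating it from the calculation.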
However, the technique that calculates the reliabilities based on the re-projection errors and weights the indices to eliminate detection errors determines erroneously detected indices by statistically identifying outliers. For this reason, the technique assumes that the detection errors of indices detected on an image conform to an error model such as a Gaussian distribution, and is valid only when there are many indices that are likely to be detected correctly under that assumption. Therefore, when there are only a small number of indices, the above technique is readily influenced by detection errors, and is not satisfactory as a detection error elimination technique. Moreover, in a situation in which the index detection result changes slightly from frame to frame depending on an illumination condition and the like (for example, an index which was detected in a given frame is not detected in the next frame), the position and orientation measurement results become inconsistent between frames, which disturbs stable position and orientation measurement.
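The dependence on the number of indices can be illustrated with a small numerical sketch. It assumes, purely for illustration, a Tukey biweight whose cutoff is derived from the median of the absolute errors (a median-based robust scale with the commonly used constants 1.4826 and 4.685); the name `robust_weights` and all error values are hypothetical.

```python
import statistics

def robust_weights(errors, tuning=4.685):
    """Tukey biweight weights with a cutoff estimated from the errors themselves.

    The scale estimate (1.4826 * median of absolute errors) and the tuning
    constant are conventional choices and are assumptions of this sketch.
    """
    scale = 1.4826 * statistics.median(abs(e) for e in errors)
    cutoff = tuning * scale
    return [
        (1.0 - (e / cutoff) ** 2) ** 2 if abs(e) < cutoff else 0.0
        for e in errors
    ]

# Many indices: eight good detections and one erroneous detection (12 pixels).
many = [0.4, 0.5, 0.6, 0.5, 0.4, 0.6, 0.5, 0.5, 12.0]
# Few indices: one good detection and the same erroneous detection.
few = [0.5, 12.0]

weights_many = robust_weights(many)
weights_few = robust_weights(few)
```

With many indices, the scale is set by the correctly detected majority and the 12-pixel detection receives zero weight. With only two indices, the erroneous detection itself inflates the estimated scale, so it keeps a weight close to 1 and is not eliminated.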