1. Field of the Invention
The present invention relates to a technology for directly measuring a position and an orientation of an image capturing device or an object on the basis of a measured value of an inclination sensor.
2. Description of the Related Art
In recent years, research on mixed reality (MR) technologies has been progressing. MR technologies refer to technologies for seamlessly integrating a physical space with a virtual space generated by a computer. Among MR technologies, an augmented reality (AR) technology for superimposing a virtual space on a physical space is attracting particular attention.
An image display device using AR technology is realized by a video see-through method or an optical see-through method. According to the video see-through method, a synthetic image is displayed which is formed by superimposing, on an image of a physical space captured by an image capturing device such as a video camera, an image of a virtual space (such as a virtual object drawn by computer graphics or text information) which is generated on the basis of a position or an orientation of the image capturing device. Meanwhile, according to the optical see-through method, the image of a virtual space generated on the basis of a position of a viewer's eye and an orientation of the viewer's head is displayed on an optical see-through display mounted to the head of the viewer.
Applications of AR technology are now anticipated in various fields such as surgery assistance for superimposing and displaying a condition of a patient's internal body on the body surface, architectural simulation for superimposing and displaying a virtual building on an empty construction site, and assembly assistance for superimposing and displaying an operational flow during the assembly work.
One of the most important issues in AR technology is how to perform accurate registration between the physical space and the virtual space, and a large number of approaches have been proposed to date. In the case of using the video see-through method, the problem of registration in AR technology is in essence a problem of obtaining a position and an orientation of the image capturing device in the scene (more specifically, in the world coordinate system defined in the scene). Similarly, in the case of using the optical see-through method, the problem of registration in AR technology reduces to a problem of obtaining the position of the viewer's eye and the orientation of the viewer's head, or the position and the orientation of the display, in the scene.
As a method of solving the former problem, a plurality of indices are arranged or set in the scene, and the projected positions of the indices on an image captured by the image capturing device are detected, whereby the position and the orientation of the image capturing device in the scene are obtained. As a method of solving the latter problem, the image capturing device is mounted to a measurement target (that is, the head of the viewer or the display), the position and the orientation of the image capturing device are obtained in a manner similar to the former method, and then a position and an orientation of the measurement target are obtained on the basis of the position and the orientation of the image capturing device.
Methods of obtaining the position and the orientation of the image capturing device on the basis of the correspondence between the projected positions of the indices on the captured image and the three-dimensional positions of the indices have been proposed in the field of photogrammetry for many years. In addition, the paper by Kato, M. Billinghurst, Asano, and Tachibana: “An Augmented Reality System and its Calibration based on Marker Tracking”, Journal of The Virtual Reality Society of Japan, Vol. 4, No. 4, pp. 607-616, (December 1999) describes a procedure in which the position and the orientation of the image capturing device, which are obtained on the basis of the above-mentioned projected positions of the indices on the captured image, are set as initial values. Iterative calculations are then performed to optimize the position and the orientation of the image capturing device by minimizing the positional error between the actually observed positions of the indices on the captured image and the calculated positions of the projected indices, that is, the positions calculated from the three-dimensional positions of the indices and the position and the orientation of the image capturing device.
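The positional error that such iterative calculations minimize can be illustrated with a minimal sketch. The sketch assumes an ideal pinhole camera without lens distortion; the function names, the focal length, and the index coordinates are hypothetical and are not taken from the cited paper.

```python
import math

def rotation_z(angle_rad):
    """Rotation matrix for a rotation of angle_rad about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def project(point_w, rot, trans, focal):
    """Project a world point with an ideal pinhole camera (assumption).

    Camera coordinates are x_c = rot * point_w + trans, and the image
    position is (focal * x / z, focal * y / z).
    """
    xc = [sum(rot[i][j] * point_w[j] for j in range(3)) + trans[i]
          for i in range(3)]
    return (focal * xc[0] / xc[2], focal * xc[1] / xc[2])

def reprojection_error(points_w, observed, rot, trans, focal):
    """Sum of squared distances between observed and calculated positions."""
    total = 0.0
    for p, (u_obs, v_obs) in zip(points_w, observed):
        u, v = project(p, rot, trans, focal)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total

# Hypothetical scene: four indices at known three-dimensional positions.
indices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.5)]
true_rot, true_trans, focal = rotation_z(0.1), (0.2, -0.1, 5.0), 800.0
observed = [project(p, true_rot, true_trans, focal) for p in indices]

# The error vanishes at the true pose and grows for a perturbed pose.
error_at_truth = reprojection_error(indices, observed, true_rot, true_trans, focal)
error_perturbed = reprojection_error(
    indices, observed, rotation_z(0.15), (0.25, -0.1, 5.0), focal)
```

An iterative optimizer would start from the initial pose and repeatedly adjust the position and the orientation so that this error decreases.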
Alternatively, instead of using only the image captured by the image capturing device, a gyro sensor is mounted on the image capturing device, and the position and the orientation of the image capturing device are obtained using both image information and gyro sensor information. (To be exact, the sensor is a three-axis orientation sensor structured by combining a plurality of gyro sensors for measuring angular rates in three axis directions with a plurality of acceleration sensors in three axis directions; it is, however, referred to herein as a gyro sensor for convenience.)
US Patent Application No. 2004-0090444 discloses a method of using a measured value of an orientation output from a gyro sensor as the orientation of an image capturing device and solving linear simultaneous equations, on the basis of the orientation of the image capturing device and the correspondence between the projected positions of indices on a captured image and the three-dimensional positions of the indices, to obtain a position of the image capturing device. Furthermore, the paper by K. Satoh, S. Uchiyama, and H. Yamamoto: “A head tracking method using bird's-eye view camera and gyroscope,” Proc. 3rd IEEE/ACM Int'l Symp. on Mixed and Augmented Reality (ISMAR 2004), pp. 202-211, 2004 proposes a procedure in which an inclination component of the measured value output from a gyro sensor is used as an inclination component of an orientation of the image capturing device. After that, iterative calculation is performed to optimize a position and an azimuth component of the orientation of the image capturing device so as to minimize the positional error between the calculated positions and the observed positions of the projected indices on the captured image described above.
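To illustrate why a known orientation turns the position calculation into a linear problem, the following is a minimal sketch under an ideal pinhole model. It is one possible formulation and is not necessarily the one used in the cited application; the function names, the focal length, and the synthetic data are hypothetical. With the rotation R fixed, each index contributes two equations that are linear in the translation t = (tx, ty, tz), so two or more indices suffice for a least-squares solution.

```python
import math

def mat_vec(rot, p):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(rot[i][j] * p[j] for j in range(3)) for i in range(3)]

def solve_3x3(a, b):
    """Solve a 3x3 system a * x = b by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate index configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def position_from_known_orientation(points_w, observed, rot, focal):
    """Least-squares translation t for x_c = rot * X + t, with rot known.

    From u = focal * (r1.X + tx) / (r3.X + tz) and the analogous v equation:
        focal * tx - u * tz = u * (r3.X) - focal * (r1.X)
        focal * ty - v * tz = v * (r3.X) - focal * (r2.X)
    """
    rows, rhs = [], []
    for p, (u, v) in zip(points_w, observed):
        rx = mat_vec(rot, p)
        rows.append([focal, 0.0, -u])
        rhs.append(u * rx[2] - focal * rx[0])
        rows.append([0.0, focal, -v])
        rhs.append(v * rx[2] - focal * rx[1])
    # Normal equations (A^T A) t = A^T b of the stacked linear system.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return solve_3x3(ata, atb)

# Synthetic check with a hypothetical orientation, translation, and indices.
c, s = math.cos(0.1), math.sin(0.1)
rot = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
true_t, focal = [0.2, -0.1, 5.0], 800.0
indices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.5)]
observed = []
for p in indices:
    x, y, z = (r + t for r, t in zip(mat_vec(rot, p), true_t))
    observed.append((focal * x / z, focal * y / z))

estimated_t = position_from_known_orientation(indices, observed, rot, focal)
# Camera position in world coordinates: -R^T t.
camera_center = [-sum(rot[j][i] * estimated_t[j] for j in range(3))
                 for i in range(3)]
```

Because the equations are built directly from the supplied orientation, any error in that orientation propagates into the recovered position.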
With the above-mentioned method that uses only the projected positions of the indices on the image captured by the image capturing device, the position and the orientation of the image capturing device thus obtained may be inaccurate or may exhibit jitter, because detection errors of the indices occur on the captured image or because the resolution of the image is finite.
In addition, the gyro sensor is subject to drift error. As time elapses, an error therefore gradually accumulates in the azimuth direction of the measured value. For this reason, when the measured orientation value output from the gyro sensor is used as the orientation of the image capturing device as it is, an inaccurate position of the image capturing device may be obtained in some cases.
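The accumulation of azimuth error over time can be illustrated with a minimal sketch that integrates an angular rate containing a small constant bias. The bias value, the sampling interval, and the function name are hypothetical; an actual sensor exhibits more complex error behavior.

```python
def integrate_azimuth(rates_deg_per_s, dt_s, bias_deg_per_s=0.0):
    """Integrate measured yaw rates (true rate plus a constant bias)."""
    azimuth = 0.0
    for rate in rates_deg_per_s:
        azimuth += (rate + bias_deg_per_s) * dt_s
    return azimuth

dt = 0.01                    # assumed 100 Hz sampling
true_rates = [0.0] * 6000    # sensor held stationary for 60 s

# Difference between the biased and the unbiased integration result.
drift_deg = (integrate_azimuth(true_rates, dt, bias_deg_per_s=0.01)
             - integrate_azimuth(true_rates, dt))
```

With these assumed values the azimuth error grows linearly and reaches about 0.6 degrees after 60 seconds, even though the sensor never moved.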
Moreover, in order to obtain the position and the orientation of the image capturing device, or the position and the azimuth component of the orientation of the image capturing device, through iterative calculation, an initial value is required. When a result of the previous frame is used as the initial value, the initial value cannot be obtained if the position and the orientation of the image capturing device were not obtained in the previous frame (for example, in the case where no indices were captured in the previous frame). When the position and the orientation of the image capturing device calculated only from the captured image are used as the initial value, the initial value may be inaccurate from the beginning, as described above. Thus, even when the iterative calculations are carried out thereafter, the optimization cannot always be achieved.