1. Field of the Invention
The present invention relates to a technique for correcting the position and orientation of an image sensing device.
2. Description of the Related Art
In recent years, mixed reality (MR), which aims at seamless merging of a physical space and a virtual space, has been studied extensively. An image display apparatus which presents MR is implemented by an apparatus having the following arrangement. That is, the apparatus generates an image of the virtual space (for example, a virtual object rendered by computer graphics, or text information) according to the position and orientation of an image sensing device such as a video camera, and displays that image superimposed on an image of the physical space sensed by the image sensing device.
As applications of such an image display apparatus, navigation that superimposes the names of and guides to well-known buildings and the like on an image obtained by sensing an urban area is expected. Furthermore, scenery simulation that superimposes a computer graphics image of a building planned for construction on an image of its planned construction site is also expected.
A requirement common to these applications is precise registration between the physical and virtual spaces, and many approaches have conventionally been proposed. To attain precise registration between the physical and virtual spaces, the camera parameters (intrinsic and extrinsic parameters) used to generate an image of the virtual space need only match those of the image sensing device at all times. If the intrinsic parameters of the image sensing device are given, the problem of registration in MR reduces to that of calculating the extrinsic parameters of the image sensing device, i.e., the position and orientation of the image sensing device on a world coordinate system set on the physical space.
As a method of calculating the position and orientation of the image sensing device on the world coordinate system set on the physical space, non-patent reference 1, for example, has proposed obtaining the position and orientation of the image sensing device by combining orientation measurement of the image sensing device using an orientation sensor with position measurement of the image sensing device by means of a global positioning system or the like.
Typical orientation sensors used in such a method include TISS-5-40 available from TOKIMEC, INC., and InertiaCube2 available from InterSense, Inc., U.S.A. Each of these orientation sensors mainly comprises a gyro sensor which detects an angular velocity in the three-axis directions, and an acceleration sensor which detects an acceleration in the three-axis directions, and measures the orientation (azimuth, pitch angle, and roll angle) about the three axes by combining their measured values. In general, angle information obtained by the gyro sensor alone is only a relative change in orientation with respect to an orientation at a certain time. However, since these orientation sensors measure the direction of gravitational force of the earth using the acceleration sensor, absolute angles with reference to the direction of gravitational force can be obtained for the tilt angles (i.e., pitch and roll angles).
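The following minimal Python sketch illustrates why the acceleration sensor yields absolute tilt angles but no azimuth. It is an illustration only, not the algorithm of either commercial sensor. The function name `tilt_from_gravity`, the body-axis convention (X forward, Y right, Z down), and the assumption that the input is the direction of gravity expressed in the sensor body frame are all assumptions introduced here.

```python
import math

def tilt_from_gravity(gx, gy, gz):
    """Return (pitch, roll) in radians from the gravity direction
    expressed in the sensor body frame.

    Assumed convention (not from the source): X forward, Y right,
    Z down, ZYX Euler angles. A level sensor sees gravity as (0, 0, 1).
    Azimuth is not returned because rotating the sensor about the
    gravity axis leaves this vector unchanged.
    """
    pitch = math.atan2(-gx, math.hypot(gy, gz))
    roll = math.atan2(gy, gz)
    return pitch, roll

# Level sensor: no tilt.
level = tilt_from_gravity(0.0, 0.0, 1.0)

# Sensor pitched up by 30 degrees under the assumed convention.
theta = math.radians(30.0)
pitched = tilt_from_gravity(-math.sin(theta), 0.0, math.cos(theta))
```

Because the gravity vector carries no information about rotation about the vertical, the azimuth has to come from integrating the gyro output (or from another reference), which is where drift enters.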
The orientation measured value output from the orientation sensor represents the orientation of the sensor itself on a sensor coordinate system defined by the sensor itself independently of the world coordinate system. For example, in the case of TISS-5-40 above, the sensor coordinate system is defined to have the direction of gravitational force (downward direction) as a Z-axis, and the frontal direction of the sensor at the time of sensor initialization on an X-Y plane specified by this Z-axis as an X-axis. In the case of InertiaCube2, the sensor coordinate system is defined to have the direction of gravitational force (downward direction) as a Z-axis, and the north direction indicated by an internal geomagnetic sensor at the time of sensor initialization on an X-Y plane specified by this Z-axis as an X-axis. Thus, the orientation measured value from the orientation sensor is generally not the information to be acquired, namely the orientation, on the world coordinate system, of the object to be measured (the image sensing device in the case of the image display apparatus that presents MR).
That is, the orientation measured value from the orientation sensor cannot be used as-is as the orientation of the object to be measured on the world coordinate system, and some coordinate transform is required. More specifically, a coordinate transform (Local Transform) that transforms the orientation of the sensor itself into that of the object to be measured, and a coordinate transform (World Transform) that transforms the orientation on the sensor coordinate system into that on the world coordinate system are needed.
The World Transform is a transform defined by the orientation of the sensor coordinate system with respect to the world coordinate system.
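As an illustration, the roles of the two transforms can be written as a chain of rotation matrices. The symbols below are introduced here for clarity and are assumptions, not notation taken from this description: R_TS denotes the orientation measured value (sensor orientation on the sensor coordinate system), R_WT the World Transform, R_SC the Local Transform, and R_WC the orientation of the image sensing device on the world coordinate system.

```latex
% Assumed notation: W = world, T = sensor coordinate system,
% S = sensor, C = image sensing device (camera).
R_{WC} = R_{WT}\, R_{TS}\, R_{SC}
```

Under this notation, the sensor supplies only R_TS, so R_WT and R_SC must be obtained by other means before R_WC can be computed.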
As described above, the sensor coordinate system is determined according to the direction of gravitational force. Therefore, the direction, on the world coordinate system, of the gravity axis of the sensor coordinate system (the Z-axis in the case of TISS-5-40 and InertiaCube2) can be uniquely determined based on the definitions of the direction of gravitational force on the sensor coordinate system and the world coordinate system. Using this information, the World Transform can be calculated while leaving indefiniteness in the rotation angle about the gravity axis. More specifically, a three-dimensional (3D) vector l that represents the vertically upward direction on the world coordinate system, and a 3D vector g that represents the vertically upward direction on the sensor coordinate system are prepared, and the angle β that the two vectors make is calculated based on the inner product g·l of g and l. Furthermore, a normal vector n=g×l to the plane defined by the two vectors is calculated based on the outer product of g and l. A rotation matrix R*WC that implements a coordinate transform having the vector n as a rotation axis and the angle β as a rotation angle is then calculated; this matrix represents the orientation of the axis corresponding to the direction of gravitational force with respect to the world coordinate system. This calculation can be implemented by a known method (see patent reference 1). Hence, only the rotation angle about the gravity axis of the World Transform is unknown.
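The calculation above can be sketched in Python as follows. This is a minimal illustration using Rodrigues' rotation formula, not the method of patent reference 1. The function name `rotation_aligning`, the world-up argument name `l`, and the handling of parallel and antiparallel vectors are assumptions introduced here.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def _unit(a):
    length = math.sqrt(_dot(a, a))
    return [x / length for x in a]

def rotation_aligning(g, l):
    """Return a 3x3 rotation matrix (nested lists) that rotates g onto l.

    g: vertically upward direction on the sensor coordinate system.
    l: vertically upward direction on the world coordinate system.
    The rotation about the gravity axis itself is left undetermined.
    """
    g, l = _unit(g), _unit(l)
    cos_beta = max(-1.0, min(1.0, _dot(g, l)))   # inner product -> angle
    beta = math.acos(cos_beta)
    n = _cross(g, l)                              # outer product -> axis
    if math.sqrt(_dot(n, n)) < 1e-12:
        if cos_beta > 0.0:                        # already aligned
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        # Antiparallel: pick any axis perpendicular to g (an assumption).
        helper = [1.0, 0.0, 0.0] if abs(g[0]) < 0.9 else [0.0, 1.0, 0.0]
        n = _cross(g, helper)
    x, y, z = _unit(n)
    c, s = math.cos(beta), math.sin(beta)
    t = 1.0 - c
    # Rodrigues' formula for a rotation by beta about the unit axis (x, y, z).
    return [[c + t * x * x, t * x * y - s * z, t * x * z + s * y],
            [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
            [t * x * z - s * y, t * y * z + s * x, c + t * z * z]]

def apply(R, v):
    return [_dot(row, v) for row in R]

# Example: sensor Z-axis points down, so sensor "up" is (0, 0, -1);
# the world "up" is assumed to be the Y-axis here.
R = rotation_aligning([0.0, 0.0, -1.0], [0.0, 1.0, 0.0])
up_in_world = apply(R, [0.0, 0.0, -1.0])
```

Any further rotation about the world-up axis applied after this matrix still maps g onto l, which is the indefiniteness about the gravity axis noted above.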
This unknown parameter is geometrically equivalent to an “azimuth drift error”, i.e., an error in the orientation measured values about the gravity axis that accumulates with the elapse of time when the orientation sensor is a gyro sensor. Thus, if the rotation angle about the gravity axis left as an unknown value is interpreted as an “initial value of the azimuth drift error”, this parameter can be considered as a part of the azimuth drift error of the sensor, which changes dynamically. Accordingly, the World Transform can be considered as a given value (which can be derived based on the relationship of the gravity axis). Also, automatic measurement (automatic correction) of the azimuth drift error can be implemented by a known method using image information if the Local Transform is given (see patent reference 2).
A method of calculating the Local Transform to be used as a given value is disclosed in patent reference 3. With this method, a plurality of indices whose layout positions on the world coordinate system are given are placed or set on the physical space or a target object. Then, the Local Transform is calculated using the 3D coordinates of these indices as given information on the world coordinate system, the coordinates of projected images of the indices in an image sensed by an image sensing device, and the output information of the orientation sensor at that time.
[Patent Reference 1] Japanese Patent Laid-Open No. 2005-107248
[Patent Reference 2] Japanese Patent Laid-Open No. 2003-203252
[Patent Reference 3] Japanese Patent Laid-Open No. 2005-326275
[Non-patent Reference 1] T. Hollerer, S. Feiner, and J. Pavlik, Situated documentaries: embedding multimedia presentations in the real world, Proc. International Symposium on Wearable Computers '99, pp. 79-86, 1999.
However, the method disclosed in patent reference 3 is not always effective when the orientations that the image sensing device mounting the orientation sensor takes at the time of use are taken into consideration. This is because the method disclosed in patent reference 3 handles every set of image information sensed by the image sensing device and output information of the orientation sensor at that time with equal weight in the calculation, so the calibration result contains nearly equal estimation errors with respect to every orientation. The above method is effective when the image sensing device that mounts the orientation sensor takes every orientation equally often. However, when the image sensing device takes some orientations more frequently than others at the time of use, the obtained calibration result still contains nearly equal errors independently of the orientation of the image sensing device.