1. Field of the Invention
The present invention relates to a method and apparatus for measuring the position and orientation of an imaging apparatus or an object.
2. Description of the Related Art
Augmented Reality (AR) is a technology that realizes a combined display including information of a virtual space superimposed on a real space (physical space). A video see-through head-mounted display (HMD) is a representative apparatus capable of presenting AR-based information to a user. The video see-through HMD has a built-in camera that captures an image of the real space. The video see-through HMD can render an image of a virtual object based on computer graphics techniques by referring to position/orientation information of the camera in the real space. The video see-through HMD displays a composite image of a rendered virtual object superimposed on an image of the real space on its display device (e.g., a liquid crystal panel). Thus, this type of information presenting apparatus enables a user to feel as if a virtual object is actually present in the real space.
To realize AR technology successfully, “positioning” is a key technique. “Positioning” in AR technology generally refers to accurately maintaining the geometric consistency between a virtual object and the real space. If the “positioning” is sufficiently accurate and the virtual object is constantly displayed at the correct position in the real space, the user can feel as if the virtual object is actually present in the real space.
In a system using a video see-through HMD, the “positioning” is realized by accurately measuring the position/orientation of a built-in camera in a coordinate system set in the real space. For example, a physical sensor (a magnetic sensor, an ultrasonic sensor, etc.) can be used to accurately measure the position/orientation of a camera having six degrees of freedom.
Alternatively, a system using a video see-through HMD can use image information from the built-in camera to measure the position/orientation of the camera. A measuring method using image information from a camera is simpler and less expensive than methods using information from physical sensors.
As discussed in T. Drummond and R. Cipolla, “Real-time visual tracking of complex structures”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 932-946, 2002 (hereinafter, referred to as “literature 1”), a three-dimensional geometric model composed of line segments representing a real space or a real object and edge information obtained from an image captured by a camera can be used to measure the position/orientation of the camera.
A point on an image can be regarded as an “edge” if the density (pixel intensity) changes abruptly at that point. The method discussed in literature 1 includes calculating the position/orientation of a camera so that the numerous edges detected on an image coincide with the line segments of a three-dimensional geometric model projected onto the image based on the position/orientation of the camera.
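As an illustration of the edge definition above, the following minimal sketch marks the positions along a one-dimensional scan of pixel densities where the difference between neighboring pixels exceeds a threshold. The function name `find_edges_1d`, the threshold value, and the sample scan are assumptions made for illustration only and are not taken from literature 1.

```python
import numpy as np

def find_edges_1d(densities, threshold=50.0):
    """Return indices i where |densities[i + 1] - densities[i]| exceeds threshold.

    Hypothetical helper: the threshold is an illustrative assumption.
    """
    values = np.asarray(densities, dtype=float)
    # The difference between neighboring pixels approximates the density gradient.
    gradient = np.diff(values)
    # An edge is reported wherever the magnitude of the gradient is large.
    return np.flatnonzero(np.abs(gradient) > threshold).tolist()

# Sample scan (assumed values): a bright region between two dark regions.
scan = [10, 12, 11, 200, 205, 203, 20, 18]
edges = find_edges_1d(scan)
```

In this sample scan, the density rises sharply between indices 2 and 3 and falls sharply between indices 5 and 6, so those two positions are reported as edges.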
More specifically, the method includes virtually projecting line segments of a three-dimensional geometric model on an image based on predicted camera position/orientation data that are input beforehand, and performing edge detection in the vicinity of the projected line segments. Furthermore, the method includes calculating the position/orientation of the camera by repetitively correcting the predicted camera position/orientation data so that detected edges are present on the projected line segments.
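The projection and correction steps described above can be sketched as follows, assuming a simple pinhole camera whose principal point is at the image origin and whose pose maps world coordinates to camera coordinates as `rotation @ point + translation`. The function names, the pose values, and the focal length are hypothetical and are not taken from literature 1. The sketch only computes the signed distance between a detected edge and a projected line segment; the repetitive correction would adjust the predicted position/orientation so that such distances approach zero.

```python
import numpy as np

def project_point(point_world, rotation, translation, focal_length):
    """Pinhole projection of a 3D point (principal point assumed at the origin)."""
    x, y, z = rotation @ np.asarray(point_world, dtype=float) + translation
    return np.array([focal_length * x / z, focal_length * y / z])

def edge_residual(edge_point, seg_start_2d, seg_end_2d):
    """Signed distance from a detected edge point to the projected segment's line."""
    direction = seg_end_2d - seg_start_2d
    # Unit normal of the projected line in the image plane.
    normal = np.array([-direction[1], direction[0]]) / np.linalg.norm(direction)
    return float(normal @ (np.asarray(edge_point, dtype=float) - seg_start_2d))

# Predicted pose (assumed values): camera axes aligned with the world axes,
# with the model segment 5 units in front of the camera.
rotation = np.eye(3)
translation = np.array([0.0, 0.0, 5.0])
focal_length = 500.0

# Project both end points of one model line segment onto the image.
start_2d = project_point([-1.0, 0.0, 0.0], rotation, translation, focal_length)
end_2d = project_point([1.0, 0.0, 0.0], rotation, translation, focal_length)

# An edge detected 3 pixels away from the projected segment (assumed value).
residual = edge_residual([0.0, 3.0], start_2d, end_2d)
```

A nonzero residual indicates that the predicted position/orientation does not yet place the projected segment on the detected edge.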
The above-described camera position/orientation measuring method based on edge detection uses camera position/orientation information obtained from a previous frame as predicted camera position/orientation data. Therefore, if the position/orientation measurement has failed in a frame, the position/orientation measurement in the subsequent frame cannot be performed accurately. For example, the measurement may fail when the camera moves at high speed and motion blur is generated in the image, or when edges are not present in the vicinity of the line segments projected based on predicted camera position/orientation data. Furthermore, camera position/orientation information from a previous frame is unavailable when the position/orientation measurement is performed for the first frame.
Therefore, to use camera position/orientation measurement based on edge detection in practice, “initialization processing” for measuring the position/orientation of the camera without using predicted camera position/orientation data is required for the first frame and immediately after a measurement failure.
To this end, one method includes setting a predetermined initial position/orientation, moving the camera to that initial position/orientation beforehand, and initializing the position/orientation of the camera based on edge detection while using the initial position/orientation information as predicted values.
There is another method for measuring the position/orientation of a camera based on features detected from an image without using predicted camera position/orientation data. The present invention is directed to a method for performing camera position/orientation measurement without using predicted camera position/orientation data. For example, as discussed in Y. Liu, T. S. Huang, and O. D. Faugeras, “Determination of camera location from 2-D to 3-D line and point correspondences”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 28-37, 1990 (hereinafter, referred to as “literature 2”), there is a method for performing camera position/orientation measurement according to correspondences between straight lines detected on an image and corresponding straight lines in a three-dimensional space.
The method discussed in literature 2 includes calculating the position/orientation of a camera by solving a set of linear equations based on correspondences of at least eight straight lines. However, literature 2 does not describe a method for correlating straight lines detected on an image with corresponding straight lines in a three-dimensional space.
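One way in which linear equations can arise from straight line correspondences is sketched below. It is not necessarily the exact formulation of literature 2. The sketch assumes that each detected image line, together with the camera center, defines a plane with normal n_i in camera coordinates, and that the corresponding three-dimensional line has direction d_i in world coordinates. Because the line lies in that plane, n_i^T R d_i = 0, which is linear in the nine entries of the rotation matrix R. Eight such equations determine R up to scale. The function name and the synthetic data are hypothetical, and the translation is not estimated here.

```python
import numpy as np

def rotation_from_line_constraints(plane_normals, line_directions):
    """Estimate R from constraints n_i^T R d_i = 0 (at least 8 pairs needed)."""
    n = np.asarray(plane_normals, dtype=float)
    d = np.asarray(line_directions, dtype=float)
    # Each pair gives one linear equation in the 9 row-major entries of R.
    A = np.einsum("ij,ik->ijk", n, d).reshape(len(n), 9)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    M = vt[-1].reshape(3, 3)
    if np.linalg.det(M) < 0:
        M = -M  # resolve the sign ambiguity of the null vector
    # Project onto the nearest rotation matrix to remove the unknown scale.
    u, _, wt = np.linalg.svd(M)
    return u @ wt

# Synthetic check with an assumed ground-truth rotation and 10 lines.
rng = np.random.default_rng(0)
true_rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(true_rotation) < 0:
    true_rotation[:, 0] *= -1.0  # make it a proper rotation (det = +1)
directions = rng.normal(size=(10, 3))
# Any vector perpendicular to R d_i serves as a plane normal for this check.
normals = np.cross(directions @ true_rotation.T, rng.normal(size=(10, 3)))
estimated_rotation = rotation_from_line_constraints(normals, directions)
```

The sketch presupposes that the pairing between `normals` and `directions` is already known, which is exactly the information that is missing when correspondences are unknown.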
Therefore, to measure the position/orientation of a camera using the method discussed in literature 2 in a state where correspondences between straight lines on an image and straight lines in a three-dimensional space are unknown, it is necessary to obtain these correspondences beforehand. In this case, a general method includes calculating the position/orientation of a camera by correlating straight lines on an image with straight lines in a three-dimensional space at random, and outputting the calculation result obtained from the correspondence having the highest consistency as the final camera position/orientation.
As discussed in Japanese Patent Application Laid-Open No. 2006-292417, to stabilize the measurement, an inclination sensor can be attached to a camera and the position/orientation of the camera can be measured based on a measurement value obtained by the inclination sensor and image information (information of points).
As discussed in Japanese Patent Application Laid-Open No. 2004-108836, an inclination sensor can be attached to a camera and an azimuth angle of the camera can be calculated based on a measurement value obtained by the inclination sensor and straight line information obtained from an image.
According to the camera position/orientation measuring method discussed in literature 2, the position/orientation calculation may not be performed stably if a straight line is erroneously detected or if the image resolution is poor. On the other hand, the method discussed in Japanese Patent Application Laid-Open No. 2006-292417 is capable of stabilizing the position/orientation measurement. However, this method uses correspondences between points on an image and points in a three-dimensional space and, therefore, cannot be used with correspondences between straight lines.
If three-dimensional coordinates of both ends of a line segment detected from an image are known beforehand, a line segment can be regarded as two points, so that the method discussed in Japanese Patent Application Laid-Open No. 2006-292417 can be used.
However, it is generally difficult to accurately detect end points of a line segment. Therefore, it is impractical to apply the method discussed in Japanese Patent Application Laid-Open No. 2006-292417 to line segments. On the other hand, the method discussed in Japanese Patent Application Laid-Open No. 2004-108836 calculates only the azimuth angle of a camera from one image and cannot calculate the position of the camera.
The method applicable to a case where correspondences between straight lines on an image and straight lines in a three-dimensional space are unknown includes correlating the straight lines at random, obtaining a plurality of position/orientation data, and selecting the position/orientation values highest in consistency. The method, however, requires a large amount of calculation to ensure that the calculation results include correct position/orientation values.
Furthermore, the method requires consistency evaluation for each of the calculated position/orientation values, so a long time is required to obtain a final solution. If the method is used for initializing the camera position/orientation measurement based on edge detection, the initialization takes a long time and a user may be kept waiting each time the position/orientation measurement fails. Thus, usability may be significantly impaired.
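The amount of calculation required by random correlation can be estimated with the following rough sketch. It assumes that eight image lines are each paired with a distinct model line chosen uniformly at random, that every image line has exactly one correct partner, and that the trials are independent. The function names and the example of 20 model lines are assumptions made for illustration.

```python
import math

def correct_sample_probability(num_model_lines, sample_size=8):
    """Probability that sample_size image lines, each paired with a distinct
    randomly chosen model line, are all paired correctly."""
    # Only one of the ordered selections of model lines is fully correct.
    return 1.0 / math.perm(num_model_lines, sample_size)

def trials_needed(success_probability, confidence=0.99):
    """Number of independent random trials so that at least one trial is
    fully correct with the given confidence."""
    # Solve 1 - (1 - p)^k >= confidence for k; log1p keeps precision for tiny p.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-success_probability))

# Example (assumed value): a model composed of 20 straight lines.
probability = correct_sample_probability(20)
trials = trials_needed(probability)
```

Under these assumptions, a model of only 20 lines already calls for more than ten billion random trials, each followed by a consistency evaluation, which illustrates why the initialization takes a long time.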