1. Field of the Invention
The present invention relates to an apparatus and method for measuring the position and orientation of an object.
2. Description of the Related Art
Recently, research concerning mixed reality for the purpose of seamless linkage between a real space and a virtual space has been actively conducted. An image display apparatus for displaying mixed reality is realized by so-called “video see-through” in which a virtual space image (e.g., a virtual object drawn by computer graphics, text information, etc.), generated in response to the position and orientation of an imaging device such as a video camera, is drawn so as to be superimposed on a real space image captured by the imaging device, whereby a superimposition image is displayed.
In addition, the image display apparatus can also be realized by so-called “optical see-through” in which a virtual space image, generated in response to the position and orientation of an observer's viewpoint, is displayed on an optical see-through display mounted on the head of the observer.
New fields different from those in virtual reality of the related art, such as operation assistance in which an internal state is displayed in superimposed form on the surface of a patient's body, and a mixed reality game in which a player fights with virtual enemies floating in real space, are expected as applications of such image display apparatuses.
What these applications require in common is accurate registration between the real space and the virtual space, and many attempts have been made to achieve it. In the case of video see-through, the problem of registration in mixed reality reduces to the problem of finding the position and orientation of the imaging device in a scene (i.e., in a world coordinate system). Similarly, in the case of optical see-through, the problem of registration in mixed reality reduces to the problem of finding, in a scene, the position and orientation of the display or the observer's viewpoint.
As a method for solving the former problem, it is common to find the position and orientation of the imaging device in the scene by disposing or setting a plurality of indices in the scene and detecting the coordinates of projected images of the indices in an image captured by the imaging device. In addition, there are attempts to realize registration that is more stable than registration using only image information by using inertial sensors mounted on the imaging device. More specifically, the position and orientation of the imaging device, estimated based on values measured by the inertial sensors, are used for index detection. The estimated position and orientation are also used as initial values for calculation of a position and orientation based on an image, or as a rough position and orientation even when no indices are found (e.g., Hirofumi FUJII, Masayuki KANBARA, Hidehiko IWASA, Haruo TAKEMURA, Naokazu YOKOYA, “Kakuchogenjitsu-notameno Jairosensa-wo Heiyoshita Sutereokamera-niyoru Ichiawase (registration with a stereo camera by jointly using a gyro sensor for augmented reality)”, Denshi Joho Tsushin Gakkai (Institute of Electronics, Information and Communication Engineers) Gijutsu Hokoku (Technical report) PRMU99-192 (Shingaku Giho (Technical Report of IEICE), vol. 99, no. 574, pp. 1-8)).
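The relationship between a candidate position and orientation of the imaging device and the image coordinates of the indices can be illustrated with a minimal sketch. It assumes an ideal pinhole camera without lens distortion; the function names, the focal length `f`, and the principal point `(cx, cy)` are hypothetical values chosen for illustration and do not come from the cited reference. The sketch only evaluates how well a candidate pose explains the detected index coordinates; it does not perform the pose search itself.

```python
def world_to_camera(R, t, p_world):
    """Express a world point in camera coordinates: R^T (p - t).

    R (3x3 nested lists) and t (length 3) are the assumed orientation and
    position of the imaging device in the world coordinate system.
    """
    d = [p_world[i] - t[i] for i in range(3)]
    return [sum(R[r][c] * d[r] for r in range(3)) for c in range(3)]


def project(R, t, p_world, f=800.0, cx=320.0, cy=240.0):
    """Pinhole projection of one index; returns None if it is behind the camera.

    f, cx, cy are illustrative intrinsics (in pixels), not values from the source.
    """
    x, y, z = world_to_camera(R, t, p_world)
    if z <= 0.0:
        return None
    return (f * x / z + cx, f * y / z + cy)


def reprojection_error(R, t, indices, detections):
    """Sum of squared pixel distances between projected and detected indices."""
    total = 0.0
    for p_world, (u_det, v_det) in zip(indices, detections):
        uv = project(R, t, p_world)
        if uv is None:
            continue  # index not visible under this candidate pose
        total += (uv[0] - u_det) ** 2 + (uv[1] - v_det) ** 2
    return total


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Three indices with known world coordinates (hypothetical layout).
indices = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 4.0)]

# Simulate detections from a "true" pose at the world origin.
true_t = [0.0, 0.0, 0.0]
detections = [project(IDENTITY, true_t, p) for p in indices]

# The true pose explains the detections exactly; a shifted pose does not.
err_true = reprojection_error(IDENTITY, true_t, indices, detections)
err_shifted = reprojection_error(IDENTITY, [0.1, 0.0, 0.0], indices, detections)
```

An image-based calculation of the kind described above would typically adjust the candidate position and orientation, for example starting from the values estimated by the inertial sensors, until an error of this kind becomes small.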
As a method for solving the latter problem, it is common to mount an imaging device (and inertial sensors) on the targeted object to be measured (i.e., the head of an observer or a display), to find the position and orientation of the imaging device in a manner similar to the former case, and then to find the position and orientation of the targeted object from the known relative position and orientation between the imaging device and the targeted object.
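The step of deriving the position and orientation of the targeted object from those of the imaging device amounts to composing two rigid transformations. The following is a minimal sketch under the assumption that each pose is given as a rotation matrix and a translation vector; the helper names and the numeric values are hypothetical and serve only as an illustration.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec3(A, v):
    """Product of a 3x3 matrix and a length-3 vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def compose(pose_ab, pose_bc):
    """Compose the pose of b in a with the pose of c in b to get c in a.

    Each pose is (R, t): R_ac = R_ab * R_bc and t_ac = R_ab * t_bc + t_ab.
    """
    R_ab, t_ab = pose_ab
    R_bc, t_bc = pose_bc
    R_ac = matmul3(R_ab, R_bc)
    t_ac = [x + y for x, y in zip(matvec3(R_ab, t_bc), t_ab)]
    return R_ac, t_ac


# Pose of the imaging device in the world, as found from the indices
# (hypothetical: rotated 90 degrees about the z axis, located at x = 1).
camera_in_world = ([[0.0, -1.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0]],
                   [1.0, 0.0, 0.0])

# Known, fixed pose of the targeted object relative to the imaging device
# (hypothetical: same orientation, offset 0.1 along the camera x axis).
target_in_camera = ([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]],
                    [0.1, 0.0, 0.0])

target_in_world = compose(camera_in_world, target_in_camera)
```

Because the relative pose is fixed by the mounting, any error in the estimated pose of the imaging device carries over directly to the pose of the targeted object.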
However, in the above methods of the related art, the accuracy and stability of the obtained solution may become insufficient in a situation in which a subjective viewpoint image (the image captured by the imaging device) does not include image information sufficient for realizing stable registration, for example, when the observed indices are concentrated in one portion of the image, or when only three indices are observed and index detection includes an error. In addition, when two or fewer indices are observed, no solution can be found. To avoid these problems, a large number of indices need to be set evenly in the scene. This causes problems in that it becomes difficult to identify the indices and in that the real space image is deformed. In addition, there is a problem in that registration is completely impossible in a situation in which the images of the indices on the subjective viewpoint image are covered by an observer's hand.