1. Field of the Invention
The present invention relates to a method and apparatus for measuring the position and orientation of an object and, more particularly, those of an image capture device.
2. Description of the Related Art
Measurement of the position and orientation of an image capture unit (to be referred to as a camera hereinafter as needed) such as a camera or the like used to capture an image of a physical space is required in, e.g., a mixed reality system that integrally presents the physical space and a virtual space.
[Related Art 1]
Japanese Patent Laid-Open No. 11-084307, Japanese Patent Laid-Open No. 2000-041173, and A. State, G. Hirota, D. T. Chen, B. Garrett, and M. Livingston: Superior augmented reality registration by integrating landmark tracking and magnetic tracking, Proc. SIGGRAPH '96, pp. 429-438, July 1996. disclose schemes for measuring the position and orientation of a camera using a position and orientation sensor. These related arts also disclose schemes for correcting measurement errors of the position and orientation sensor using markers whose allocation positions on the physical space are known or feature points whose positions on the physical space are known (markers and feature points will be collectively referred to as indices hereinafter).
[Related Art 2]
As disclosed in Kato, Billinghurst, Asano, and Tachibana: Augmented Reality System and its Calibration based on Marker Tracking, Journal of the Virtual Reality Society of Japan, vol. 4, no. 4, pp. 607-616, December 1999., and X. Zhang, S. Fronz, and N. Navab: Visual marker detection and decoding in AR systems: A comparative study, Proc. of International Symposium on Mixed and Augmented Reality (ISMAR'02), 2002., a method for estimating the position and orientation of a camera based on information of markers captured by the camera, without using any position and orientation sensor, is known. In these references, square indices are used, and the position and orientation of the camera are estimated based on the coordinates of the four vertices of each square. However, a square has 90° rotational symmetry about the axis that passes through its central point (the intersection of its diagonals) and is perpendicular to its plane, so the directionality (top, bottom, right, and left) cannot be determined from the vertex coordinates in the image alone. For this reason, a graphic image having directionality or the like is drawn inside a square index to determine the top, bottom, right, and left from an image obtained by capturing the index. Furthermore, when a plurality of indices are used, they need to be identified based only on the image captured by the camera, so graphic information such as unique patterns or symbols, which differ from index to index, is drawn inside each index.
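The role of the interior graphic can be illustrated with a minimal sketch. It is not the procedure of the cited references: it assumes, purely for illustration, that the interior of a detected square has already been sampled into a small binary grid, and the names `rotate90`, `has_directionality`, and `find_orientation` are hypothetical. A pattern resolves the 90° ambiguity only if it differs from each of its own rotations.

```python
def rotate90(grid):
    """Return a copy of a square grid rotated 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def has_directionality(pattern):
    """True if the pattern differs from its 90, 180, and 270 degree rotations."""
    original = [list(row) for row in pattern]
    rotated = rotate90(original)
    for _ in range(3):
        if rotated == original:
            return False  # some rotation reproduces the pattern: ambiguous
        rotated = rotate90(rotated)
    return True


def find_orientation(observed, reference):
    """Return how many clockwise 90 degree turns map `observed` onto `reference`.

    Returns None when no rotation matches (the pattern is not this index).
    """
    candidate = [list(row) for row in observed]
    target = [list(row) for row in reference]
    for turns in range(4):
        if candidate == target:
            return turns
        candidate = rotate90(candidate)
    return None


# Hypothetical 3x3 pattern with a single marked corner cell.
reference = [[1, 0, 0],
             [0, 0, 0],
             [0, 0, 0]]
observed = rotate90(reference)  # the marker seen turned by 90 degrees
turns = find_orientation(observed, reference)
```

A pattern such as a uniformly filled square fails `has_directionality`, which is the situation the text describes for vertex coordinates alone.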
[Related Art 3]
As disclosed in US 2004/176925 A1, the contents of which are incorporated in the present specification by reference, a method is known that enhances the estimation precision and stability of the position and orientation of a camera by capturing indices using a stereo camera, so as to increase the total number of indices detected compared to a case using only one camera. In US 2004/176925 A1, the respective cameras which form the stereo camera have the same resolution and angle of view, and their optical axes point in nearly the same direction.
[Related Art 4]
Japanese Patent Laid-Open No. 2004-205711 discloses a method of estimating the position and orientation of a camera in a system using a plurality of cameras having different angles of view. In Japanese Patent Laid-Open No. 2004-205711, a camera having one angle of view is used to estimate the position and orientation, and another camera having another angle of view is used to capture an image of an external world. Images captured by the cameras of the respective angles of view are composited and displayed on a head-mounted display.
[Related Art 5]
On the other hand, Shigezumi Kuwajima: 2Way Stereo System miniBEE, the Journal of Virtual Reality Society of Japan, vol. 10, no. 3, pp. 50-51, September 2005. discloses an apparatus which performs three-dimensional (3D) measurement of an object by mutually complementing the measurement results of two different 2-lens stereo cameras for short and long distances. In the apparatus described in this reference, the baseline lengths and focal lengths are selected in correspondence with the short and long distances, and objects at the short and long distances can be simultaneously measured.
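Why the baseline length and focal length would be chosen per distance range can be seen from the standard relation for a rectified two-lens stereo pair, Z = f·B/d. The sketch below uses that textbook relation, not anything taken from the cited reference; the function names and all numeric values are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair (f and d in pixels)."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical short-baseline and long-baseline pairs observing the same 2 m target.
narrow = depth_error(focal_px=800.0, baseline_m=0.1, depth_m=2.0)
wide = depth_error(focal_px=800.0, baseline_m=0.4, depth_m=2.0)
```

Because the error grows with the square of the distance and shrinks with the product of focal length and baseline, a pair intended for long distances benefits from a longer baseline or focal length than a pair intended for short distances.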
In the method of Related Art 1, a small circular, sheet-like object with a specific color can be allocated as an index. In this case, the index has information including a 3D position (coordinates) and color. The 3D position of the index is projected onto the image plane of the camera using the measurement values of the position and orientation sensor, while color region detection processing for detecting the color of the index from an image is executed, thus calculating the barycentric position of the color region in the image. The 3D position of the index projected onto the image plane is compared with the barycentric position of the color region calculated from the image, and the two are determined to be an identical index when these positions are close to each other, thereby identifying the index in the image.
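The identification step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Related Art 1: it assumes an ideal pinhole camera without lens distortion, a world-to-camera rotation and translation supplied by the sensor, and a fixed pixel threshold for "close". The names `project_point`, `identify_indices`, and `max_dist_px` are hypothetical.

```python
import math


def project_point(point_world, rotation, translation, focal_px, center_px):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    rotation is a 3x3 world-to-camera matrix (list of rows) and translation
    a 3-vector, both assumed to come from the position and orientation sensor.
    Returns None if the point lies behind the camera.
    """
    cam = [sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
           for r in range(3)]
    if cam[2] <= 0:
        return None
    u = focal_px * cam[0] / cam[2] + center_px[0]
    v = focal_px * cam[1] / cam[2] + center_px[1]
    return (u, v)


def identify_indices(indices, centroids, rotation, translation,
                     focal_px, center_px, max_dist_px):
    """Map each index id to the nearest detected centroid within max_dist_px."""
    matches = {}
    for index_id, position in indices.items():
        projected = project_point(position, rotation, translation,
                                  focal_px, center_px)
        if projected is None:
            continue
        best, best_dist = None, max_dist_px
        for j, (cu, cv) in enumerate(centroids):
            dist = math.hypot(projected[0] - cu, projected[1] - cv)
            if dist <= best_dist:
                best, best_dist = j, dist
        if best is not None:
            matches[index_id] = best
    return matches


# Hypothetical data: camera at the world origin looking along +Z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
indices = {"A": (0.0, 0.0, 2.0), "B": (0.4, 0.0, 2.0), "C": (0.0, 0.0, -1.0)}
centroids = [(421.0, 239.0), (318.5, 241.0), (50.0, 50.0)]
matches = identify_indices(indices, centroids, identity, [0.0, 0.0, 0.0],
                           500.0, (320.0, 240.0), 5.0)
```

In this example, index "C" lies behind the camera and is skipped, and the centroid at (50, 50) is matched to no index because it is farther than the threshold from every projected position.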
In Related Art 1, the camera whose position and orientation are to be measured is a device for capturing an image to be presented to the observer. Therefore, the resolution, angle of view, orientation, and the like of the camera cannot be freely changed to suit the detection of an index. That is, the measurement precision of the position and orientation of the camera is determined by the spatial resolution of the camera, which is not always optimal for detecting the index.
On the other hand, a method is available that measures the position and orientation of a camera using an index, such as the square marker used in Related Art 2, whose image yields many kinds of detectable information such as vertices and a drawn pattern. In Related Art 2, since each individual marker needs to be identified from the image alone without using any position and orientation sensor, the index must include code information, symbol information which can serve as a template, and the like.
FIGS. 6A to 6C show examples of practical square markers used in the system described in the reference presented as an example of Related Art 2.
Since an index having such a complicated feature must be detected from a captured image, it often cannot be recognized unless it occupies a sufficiently large area in the captured image frame. In other words, either a broad region of the physical space must be secured to allocate the index, or the camera must come sufficiently close to the index. That is, the index allocation conditions are strict.
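The trade-off between index size, camera distance, and camera resolution can be made concrete with a pinhole approximation. The sketch below is illustrative only: the minimum pixel size needed to recognize a marker is an assumed figure, not one stated in the references, and the function names are hypothetical.

```python
import math


def focal_px_from_fov(image_width_px, horizontal_fov_deg):
    """Focal length in pixels for a pinhole camera with the given angle of view."""
    return (image_width_px / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)


def marker_size_px(focal_px, marker_side_m, distance_m):
    """Approximate on-image side length of a marker facing the camera."""
    return focal_px * marker_side_m / distance_m


def max_distance_m(focal_px, marker_side_m, min_side_px):
    """Largest distance at which the marker still spans min_side_px pixels."""
    return focal_px * marker_side_m / min_side_px


# Hypothetical camera: 640 pixels wide with a 90 degree horizontal angle of view.
focal = focal_px_from_fov(640, 90.0)
side_at_2m = marker_size_px(focal, marker_side_m=0.1, distance_m=2.0)
limit = max_distance_m(focal, marker_side_m=0.1, min_side_px=40.0)  # assumed threshold
```

Under these assumed numbers, a 10 cm marker spans about 16 pixels at 2 m, so either the marker must be enlarged or the camera must come within roughly 0.8 m, which is the strictness of the allocation conditions noted above.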
In Related Art 2 as well, the camera whose position and orientation are to be measured is a device used to capture an image to be presented to the observer, as in Related Art 1. Therefore, the measurement precision of the position and orientation of the camera is determined by the spatial resolution of the camera, which is not always optimal for detecting the index, as in Related Art 1.
In Related Art 3 as well, the cameras which form the stereo camera are devices used to capture an image to be presented to the observer. Therefore, the measurement precision of the position and orientation of the camera and the detectable index size are determined by the spatial resolution of the camera, which is not always optimal for detecting the index, as in Related Arts 1 and 2.
In Related Art 4, when the position and orientation of the camera are calculated, the plurality of cameras having different angles of view are not used simultaneously; only the camera of one angle of view is used. Therefore, the estimation precision of the position and orientation of the camera is lower than that obtained using a plurality of cameras.
On the other hand, in Related Art 5, the plurality of cameras having different angles of view are used at the same time, but this technique is limited to a method that performs stereo measurement using cameras of the same angle of view and then combines a plurality of stereo measurement results. Applying the scheme of Related Art 5 therefore requires four or more cameras.