[Prior Art 1]
In a mixed reality system that merges and displays a real space and a virtual space, the position and orientation of an image sensing unit such as a camera for capturing the real space (to be referred to as a camera as needed hereinafter) must be measured. As prior art for this purpose, a method is available that corrects measurement errors of a position and orientation sensor, which measures the position and orientation of the camera, by using a marker which is arranged on the real space and has a known position, or a feature point whose position on the real space is known (the marker and feature point will be generically referred to as an index hereinafter), as disclosed in Japanese Patent Laid-Open No. 11-084307, Japanese Patent Laid-Open No. 2000-041173, and A. State, G. Hirota, D. T. Chen, B. Garrette, and M. Livingston: Superior augmented reality registration by integrating landmark tracking and magnetic tracking, Proc. SIGGRAPH '96, pp. 429-438, July 1996 (reference 1).
In other words, the method of these prior arts estimates the position and orientation of the camera using both the position and orientation sensor for measuring the position and orientation of the camera and an index sensed by the camera. Known indices for such a method include the center of gravity of a color region, a concentric circle, and the like. A plurality of indices are normally used at the same time. One means for determining which of the plurality of indices arranged on the real space corresponds to an index detected from an image sensed by the camera is to exploit the relationship between the coordinate position of the index detected from the image and the coordinate position of the index on the image plane obtained by projecting the absolute position of the index on the basis of the measurement value of the position and orientation sensor.
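For illustration only, the projection of the absolute position of an index onto the image plane may be sketched as follows in Python. The pinhole camera model, the function name project_point, and the parameters rotation, translation, focal, and center are assumptions introduced for this sketch and do not appear in the cited references; rotation and translation stand for the world-to-camera transform that would be derived from the measurement value of the position and orientation sensor.

```python
def project_point(world_pt, rotation, translation, focal, center):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    rotation (3x3, list of rows) and translation (3 values) are assumed to
    map world coordinates to camera coordinates: p_cam = R * p_world + t.
    focal is the focal length in pixels; center is the principal point.
    """
    x, y, z = (
        sum(rotation[r][c] * world_pt[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        return None  # the point lies behind the camera and cannot be imaged
    return (focal * x / z + center[0], focal * y / z + center[1])
```

Any error in the sensor measurement enters through rotation and translation, so the projected position deviates from the position at which the index is actually detected in the image.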
[Prior Art 2]
On the other hand, a method of estimating the position and orientation of a camera using only an index sensed by the camera, without using any position and orientation sensor, is known, as disclosed in Kato, Billinghurst, Asano, and Tachibana: Augmented reality system and its calibration based on marker tracking, Journal of Virtual Reality Society of Japan, vol. 4, no. 4, pp. 607-616, Dec. 1999 (reference 2), and X. Zhang, S. Fronz, and N. Navab: Visual marker detection and decoding in AR systems: A comparative study, Proc. of International Symposium on Mixed and Augmented Reality (ISMAR'02), 2002 (reference 3). In these non-patent references, a square index is used, and the position and orientation of the camera are estimated on the basis of the coordinate positions of the four vertices of the square. Since a square is rotationally symmetric in increments of 90° about an axis which passes through its central point (the intersection of its diagonal lines) and is perpendicular to its plane, up, down, right, and left cannot be determined from the coordinate positions of the vertices alone. For this reason, another image feature is formed inside the square index to determine up, down, right, and left. Furthermore, when a plurality of indices are used, which of the plurality of indices is currently captured must be identified based only on the image sensed by the camera, so graphic information such as a unique pattern or code, which differs for each index, is embedded in each index.
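The 90° ambiguity described above can be illustrated with a short Python sketch. The function names rotate90 and candidate_orderings are hypothetical and are not taken from references 2 and 3; the sketch only shows that a square maps onto itself under a 90° rotation, so that four vertex orderings are equally consistent with the detected vertices.

```python
def rotate90(pt, center):
    """Rotate pt by 90 degrees about center (counterclockwise in a y-up frame)."""
    dx, dy = pt[0] - center[0], pt[1] - center[1]
    return (center[0] - dy, center[1] + dx)


def candidate_orderings(corners):
    """Return the four cyclic orderings of the detected vertices.

    Each ordering corresponds to one of the four possible orientations
    of the square; the vertex coordinates alone cannot select among them.
    """
    return [corners[k:] + corners[:k] for k in range(4)]
```

For a square with vertices (0, 0), (2, 0), (2, 2), and (0, 2), rotating every vertex about the center (1, 1) yields the same set of vertices, which is why an additional feature inside the square is needed to select one of the four orderings.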
In the method of estimating the position and orientation of the camera of prior art 1, when a point or concentric marker is used as an index, each index carries only one coordinate value. The amount of geometric information per index is therefore small, and a relatively large number of indices are used simultaneously to attain accurate estimation of the position and orientation and to broaden the observation view.
As described above, when a plurality of indices are used at the same time, a method must be devised for identifying which of the indices arranged on the real space corresponds to an index captured in an image. In particular, when the image features of the indices (colors, shapes, or the like, which can be identified by an image process) are identical or differ only slightly, and a large number of indices are arranged, identification errors may occur.
The possibility of identification errors will be described in detail below using FIG. 5. Referring to FIG. 5, reference numeral 500 denotes an image sensing range (image region) of a camera; and 501 and 502, point markers which are arranged on the real space and are detected from a sensed image. Reference numerals 503 and 504 denote points which are obtained by projecting the absolute positions of two point markers sensed by the camera onto the image sensing plane of the camera using measurement values of a position and orientation sensor attached to the camera.
If the position and orientation sensor has no error, the points 501 and 503 and the points 502 and 504 must respectively match on the image plane. In practice, however, due to the influence of errors of the position and orientation sensor, the coordinate positions of the points 503 and 504 on the image plane are calculated as positions which deviate from the points 501 and 502. In this example, the point 504 is projected at a position which falls outside the image region. An index is identified by comparing the positions of the indices projected onto the image plane with those of the indices detected from the sensed image, and determining a pair of indices separated by a small distance to be an identical index. In this case, since the point 503 is closer to the point 502 than to the point 501, it is determined that the points 503 and 502 correspond to each other, and the index “502” detected from the image is identified as the index “503” arranged on the real space. As in this example, when a plurality of indices whose image features are identical or differ only slightly are used as in prior art 1, identification errors may occur.
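The identification error of FIG. 5 can be reproduced with a minimal Python sketch of nearest-distance matching. The function name match_by_nearest, the image size, and all pixel coordinates used in the example that follows the code are hypothetical values chosen for illustration; they are not taken from FIG. 5.

```python
import math


def match_by_nearest(projected, detected, width, height):
    """Pair each projected index with the nearest detected marker.

    projected and detected map an identifier to an (x, y) pixel position.
    Projected indices that fall outside the image region are skipped.
    """
    matches = {}
    for index_id, (px, py) in projected.items():
        if not (0 <= px < width and 0 <= py < height):
            continue  # e.g., the point 504 projected outside the image
        nearest = min(
            detected,
            key=lambda d: math.hypot(detected[d][0] - px, detected[d][1] - py),
        )
        matches[index_id] = nearest
    return matches
```

Assume a 640 x 480 image with the markers detected at 501 = (200, 240) and 502 = (450, 240), and a sensor error that shifts both projections by 220 pixels, giving 503 = (420, 240) and 504 = (670, 240). The point 504 falls outside the image and is skipped, and the point 503 is paired with the marker 502 (distance 30) instead of the correct marker 501 (distance 220).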
In the method of estimating the position and orientation of the camera of prior art 1, a small circular sheet-like object of a specific color can be used as an index. In this case, the information that the index has includes a three-dimensional (3D) position (coordinates) and a color. Using the measurement values of the position and orientation sensor, the 3D position of the index is projected onto the image plane of the camera, while a color region detection process for detecting the color of that index from the image is executed to calculate a barycentric position in the image. The 3D position projected onto the image plane is compared with the barycentric position calculated from the image, and, e.g., the closest pair is determined to be an identical index, thus identifying the index in the image.
In this manner, when an index is detected from the image by color region detection, the real space to be sensed by the camera must not include, other than the index itself, any object having the same color as that index.
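As a rough illustration of the color region detection step, the following Python sketch computes the barycentric position of the pixels whose color passes a threshold test. The function names is_index_color and barycenter_of_color and the threshold values are assumptions made for this sketch. For brevity it treats all matching pixels as a single region, whereas a practical detector would first separate connected regions.

```python
def is_index_color(rgb):
    """Hypothetical test for a red index pixel; the thresholds are illustrative."""
    r, g, b = rgb
    return r > 200 and g < 80 and b < 80


def barycenter_of_color(image):
    """Return the (x, y) barycenter of all pixels passing is_index_color.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Returns None when no pixel matches.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, rgb in enumerate(row):
            if is_index_color(rgb):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

The barycenter obtained in this way is the position that is compared with the projected 3D position of the index.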
For example, consider a case in which various objects are present in the space as a background, as shown in FIG. 13. Referring to FIG. 13, an index 1203 to be used is set on a real object, and has, e.g., a red circular region. The real space to be sensed by a camera 101 includes a ballpoint pen 1204 with a red cap together with the index. Assume that the image sensed by the camera 101 in this state is that shown in FIG. 14. FIG. 14 also illustrates a point 1302 obtained by projecting the 3D coordinate position of the index 1203 onto the image sensing plane of the camera 101 on the basis of the measurement values of a position and orientation sensor 102.
At this time, when an index 1301 is detected from the image using color region detection, as described above, not only the circular region of the index 1203 but also a red region 1303 of the ballpoint pen 1204 is likely to be detected as a red region. If the region 1303 is detected as a red region, since the barycentric position of the red region 1303 of the ballpoint pen 1204 is closer to the projected position 1302 than that of the index 1301, the region 1303 is identified as the index corresponding to the projected position 1302. In this manner, if the space to be sensed includes an object whose color is the same as or similar to that of an index, that object is erroneously recognized as an index.
In order to prevent such a problem, a method is available that uses an index formed as a combination of different colors arranged in a concentric pattern, checks the combination of colors after color region detection, and detects only a region with the correct combination as an index. In this case, since a part of the background is less likely to be detected as an index than when a monochrome index is used, no identification errors occur in the case of FIG. 13.
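The color combination check can be sketched in Python as follows. The function names ring_sequence and is_concentric_index and the single-character color labels are hypothetical; the sketch assumes that each pixel has already been classified into a color label and, as a simplification, inspects only one scan line running from the region center to the right.

```python
def ring_sequence(labels, cx, cy):
    """Collapse the color labels found from (cx, cy) rightwards along row cy.

    labels is a list of equal-length strings, one character per pixel.
    Consecutive identical labels are merged into a single entry.
    """
    seq = []
    for x in range(cx, len(labels[cy])):
        label = labels[cy][x]
        if not seq or seq[-1] != label:
            seq.append(label)
    return seq


def is_concentric_index(labels, cx, cy, expected):
    """Accept the region only if its colors appear in the expected order,
    listed from the center outwards (e.g., ("R", "B"))."""
    seq = ring_sequence(labels, cx, cy)
    return seq[:len(expected)] == list(expected)
```

Under these assumptions, a red disc surrounded by a blue ring is accepted for the expected combination ("R", "B"), whereas a plain red region such as the cap of the ballpoint pen is rejected because no blue ring follows the red center.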
However, in order to perform stable index detection using color region detection, index colors are often set to be striking colors. Furthermore, when different colors are combined in a concentric pattern, the index must be sensed with a sufficiently large size in the image so that the concentric pattern can be detected stably. That is, a large, unattractive index must be placed on the real space. However, setting such an index on the real space is often not allowed, or such an index spoils the appearance of the real space.
On the other hand, a method of utilizing an index with a geometrical spread, like the square index used in prior art 2, is available. However, with prior art 2, since each individual marker must be identified from the image alone, code information unique to each marker, symbol information that can serve as a template, or the like must be embedded in the marker, in addition to a feature for identifying up, down, right, and left. FIGS. 7A to 7C show examples of practical square indices, which are used in prior art 2 disclosed in references 2 and 3.
Since an index with such a complicated structure must be detected from the image, it cannot be recognized unless it is captured so as to occupy a sufficiently large area in the sensed image. In other words, a broad region on the real space must be secured for setting the index, or the camera must come sufficiently close to the index. Alternatively, strict conditions must be imposed on the layout of the indices.