1. Field of the Invention
The present invention relates to a technique for detecting and identifying an index located in physical space from an image obtained by sensing the physical space.
2. Description of the Related Art
[Prior Art 1]
For example, a mixed reality (MR) system that combines and displays physical and virtual space requires position and orientation measurement of an image sensing unit (to be also referred to as a camera hereinafter) that senses an image of physical space. Conventionally, upon measuring the position and orientation of the image sensing unit by a position and orientation sensor, there is a known technique for correcting the measurement result using indices (for example, objects having specific shapes and colors) which are located in advance in the physical space and whose positions are known (see Japanese Patent Laid-Open No. 11-084307, Japanese Patent Laid-Open No. 2000-041173, and A. State, G. Hirota, D. T. Chen, B. Garrett, and M. Livingston: Superior augmented reality registration by integrating landmark tracking and magnetic tracking, Proc. SIGGRAPH '96, pp. 429-438, July 1996).
In other words, these methods estimate the position and orientation of the camera using a position and orientation sensor that measures the position and orientation of the camera and an image of indices sensed by the camera. As indices used in such methods, the barycenter of a region of a specific color, a concentric circle, and the like are known. Since a plurality of indices are often used simultaneously, the correspondence between the indices detected from the image sensed by the camera (to be referred to as detected indices hereinafter) and the plurality of indices located in the physical space must be identified. As one of the conventional index identification methods, it is known to use the relationship between:
    the estimated coordinates of an index on an image (image sensing) plane, which are obtained from a projection calculation based on the known absolute position of the index and the measurement value of the position and orientation sensor; and
    the image coordinates of the index actually detected from the image.
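The comparison between projected and detected image coordinates described above can be sketched as follows. This is only a minimal illustration, not the method of the cited references: it assumes an ideal pinhole camera, a world-to-camera pose (rotation matrix and translation vector) supplied by the sensor, and a simple nearest-neighbor rule with a pixel threshold. The names project, identify_by_position, focal, center, and max_dist are hypothetical.

```python
import math

def project(point_world, rotation, translation, focal, center):
    """Project a 3D world point to pixel coordinates (ideal pinhole model, assumed)."""
    # Camera coordinates: x_c = R * x_w + t, with R and t taken from the sensor.
    xc = [sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
          for r in range(3)]
    if xc[2] <= 0:
        return None  # behind the camera, cannot appear in the image
    return (focal * xc[0] / xc[2] + center[0],
            focal * xc[1] / xc[2] + center[1])

def identify_by_position(known_indices, detections, rotation, translation,
                         focal, center, max_dist):
    """Label each detection with the ID of the nearest projected index.

    A detection farther than max_dist pixels from every projection gets None.
    """
    projected = {}
    for index_id, position in known_indices.items():
        uv = project(position, rotation, translation, focal, center)
        if uv is not None:
            projected[index_id] = uv

    labels = []
    for (u, v) in detections:
        best_id, best_dist = None, max_dist
        for index_id, (pu, pv) in projected.items():
            dist = math.hypot(u - pu, v - pv)
            if dist < best_dist:
                best_id, best_dist = index_id, dist
        labels.append(best_id)
    return labels

# Hypothetical scene: two indices 2 m in front of a camera at the world origin.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
KNOWN = {"A": (0.0, 0.0, 2.0), "B": (0.4, 0.0, 2.0)}
DETECTED = [(322.0, 241.0), (418.0, 239.0), (100.0, 100.0)]
LABELS = identify_by_position(KNOWN, DETECTED, IDENTITY, (0.0, 0.0, 0.0),
                              500.0, (320.0, 240.0), 20.0)
```

With these assumed values, the first two detections fall within the threshold of the projections of A and B, and the third matches nothing. When many indices with similar image features lie close together, the nearest projection is not necessarily the correct one.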
[Prior Art 2]
On the other hand, a method of estimating the position and orientation of the camera using only an image of indices sensed by the camera, without using any position and orientation sensor, is also known, as disclosed in:
    Kato, Billinghurst, Asano, and Tachibana: Augmented Reality System and its Calibration based on Marker Tracking, Transactions of the Virtual Reality Society of Japan vol. 4, no. 4, pp. 607-616, December 1999;
    X. Zhang, S. Fronz, and N. Navab: Visual marker detection and decoding in AR systems: A comparative study, Proc. of International Symposium on Mixed and Augmented Reality (ISMAR'02), 2002; and
    Junichi Rekimoto, “Matrix: A Realtime Object Identification and Registration Method for Augmented Reality”, Proc. of Asia Pacific Computer Human Interaction (APCHI '98), 1998.
These references use a square index and measure (estimate) the position and orientation of the camera based on the coordinates of the four vertices of the square. However, since a square has 90° rotational symmetry about the axis that passes through its center (the intersection of the diagonal lines) and is perpendicular to its plane, the directionality of each index cannot be discriminated from the vertex coordinates in the image alone. For this reason, another feature (e.g., a directional pattern) used to discriminate the directionality of the index is provided in the index. Furthermore, when a plurality of indices are used, since they need to be identified based only on the image sensed by the camera, graphic information such as unique patterns, symbols, or the like, which are different for respective indices, is embedded in each index.
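The 90° ambiguity of a square index can be illustrated with a small sketch. It works in the plane of the index rather than in a sensed image, and the model SQUARE, the off-center mark FEATURE, and the function names are hypothetical; none of them is taken from the cited references.

```python
def rotate90(point):
    """Rotate a 2D point by 90 degrees counterclockwise about the origin."""
    x, y = point
    return (-y, x)

def rotate(points, quarter_turns):
    """Rotate a list of 2D points by quarter_turns * 90 degrees."""
    for _ in range(quarter_turns % 4):
        points = [rotate90(p) for p in points]
    return points

# Hypothetical index model: a square centered at the origin, plus an
# off-center mark standing in for a directional pattern.
SQUARE = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
FEATURE = (0, 0.5)

def consistent_rotations(observed_vertices, observed_feature=None):
    """Return every quarter-turn count that reproduces the observation."""
    matches = []
    for k in range(4):
        if set(rotate(SQUARE, k)) != set(observed_vertices):
            continue  # vertex set does not match this rotation
        if observed_feature is not None and rotate([FEATURE], k)[0] != observed_feature:
            continue  # the directional mark rules this rotation out
        matches.append(k)
    return matches

# Observation: the model rotated by one quarter turn.
OBSERVED_VERTICES = rotate(SQUARE, 1)
OBSERVED_FEATURE = rotate([FEATURE], 1)[0]
AMBIGUOUS = consistent_rotations(OBSERVED_VERTICES)
RESOLVED = consistent_rotations(OBSERVED_VERTICES, OBSERVED_FEATURE)
```

From the vertices alone, all four rotations remain consistent with the observation; adding the asymmetric mark leaves exactly one.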
[Prior Art 3]
When the camera position and orientation estimation method of prior art 1 uses a dot marker or concentric circle marker as an index, the information of one index is only one coordinate value. For this reason, the geometric information volume is small, and a method of simultaneously using a plurality of indices is adopted.
As described above, when a plurality of indices are used simultaneously, a method of identifying the correspondence between the sensed detected indices and those which are located in physical space must be devised. Especially, if the image features (features such as colors, shapes, and the like, which can be identified by image processing) of the indices are the same or have small differences, and a large number of indices exist, misidentification is likely to occur.
On the other hand, the square index used in prior art 2 must be embedded with code information unique to a marker or information which serves as a template so as to identify the upper, lower, left, and right directions. Since an index having such a complicated structure must be detected from the image, it cannot be identified unless the index occupies a sufficiently large area in the sensed image plane.
In order to solve the aforementioned problems, such as misidentification due to the small information volume per index of prior art 1 and restrictions on the location conditions due to the complexity of the index of prior art 2, the following method may be adopted.
When indices of an identical type having directionality are located in physical space as in prior art 2, the directionality of each detected index on the image plane is compared with the estimated directionality of each index on the image plane, which is obtained by making a projection calculation using the position and orientation sensor. As a result, each index can be identified more simply and stably than in the prior art.
For example, as shown in FIG. 7, the directionality of an index 202A, which is obtained by a projection calculation using the position and orientation sensor, is compared with the directionalities of indices 201A and 201B on the image, which become candidates detected from the sensed image. In the example of FIG. 7, it is determined that the index 201A having the closer directionality corresponds to the index 202A. With this method, the index size can be reduced compared to that of prior art 2 and the restrictions on the location conditions can be relaxed even though an index having a larger information volume than prior art 1 is used.
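The comparison of FIG. 7 can be sketched as follows. This is only an illustrative assumption of how such a comparison might be coded: directionality is represented as a 2D direction vector on the image plane, and the candidate with the smallest angular difference from the projected directionality is selected, subject to a threshold. The function names, the vectors, and the 45° threshold are hypothetical.

```python
import math

def angle_between(u, v):
    """Unsigned angle in radians between two 2D direction vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(math.atan2(cross, dot))

def identify_by_direction(projected_dir, candidates, max_angle):
    """Return the candidate ID whose directionality is closest to projected_dir.

    Returns None when no candidate lies within max_angle radians.
    """
    best_id, best_angle = None, max_angle
    for candidate_id, direction in candidates.items():
        angle = angle_between(projected_dir, direction)
        if angle < best_angle:
            best_id, best_angle = candidate_id, angle
    return best_id

# Hypothetical values: directionality of projected index 202A and of the
# two detected candidates 201A and 201B on the image plane.
DIR_202A = (1.0, 0.0)
CANDIDATES = {"201A": (0.98, 0.17), "201B": (0.0, 1.0)}
MATCH = identify_by_direction(DIR_202A, CANDIDATES, math.radians(45))
```

Under these assumed vectors, candidate 201A differs from the projected directionality by roughly 10° and 201B by 90°, so 201A is selected. If two candidates had nearly the same directionality on the image, this rule alone could not separate them reliably.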
The method of prior art 3 compares directionalities by projecting an index at a position on the image, which is calculated based on the value of the position and orientation sensor attached to the camera and the absolute coordinates of the index, which are registered in advance. For this reason, a plurality of indices which may be sensed at the same time must be located so that they have different directionalities.
For example, when one index each is located on the floor and wall, as shown in FIG. 3, since indices 201A and 201B located on the floor and wall, respectively, have the same directionality in the sensed image, they may be misidentified depending on errors of the position and orientation sensor. For this reason, in order to avoid such locations, the directionality of either index must be changed.
That is, the method of prior art 3 has the limitation that when a plurality of indices are sensed close to each other, they must be located to allow sensing with different directionalities. In this case, even when indices are located on surfaces having different slopes in physical space as in the example of FIG. 3, a case wherein their directionalities are hard to distinguish on the sensed image must be taken into consideration.
In particular, when a large number of indices are densely located in a small region having a plurality of inclined surfaces (the floor and wall, or a solid body having a large number of inclined surfaces), it is not easy to locate them in consideration of their directionalities upon image sensing. For this reason, the number of indices which can be located in a region where they are likely to be sensed at the same time is limited.
Assume that an index 201A is located on the floor, and printed matter 601 is located on the wall side, as shown in FIG. 5. In such a situation, a partial region of the printed matter 601 on the wall may be erroneously detected as a detected index 602 due to factors such as camera noise and the like. In the sensed image shown in FIG. 5, the index 201A is not detected as an index, since it is cut off at the edge of the sensed image. As a result, when the directionality of a projected image 202A of the index 201A calculated by projection is similar to that of the erroneously detected index 602 on the image, the detected index 602 is misidentified as the index 201A. In this way, conventionally, the detected index 602, which is erroneously detected on an inclined surface different from that of the actual index, is misidentified.