1. Field of the Invention
The present invention relates to an information processing apparatus and an information processing method, and specifically to a technique for detecting and identifying indices located in a physical space from an image obtained by sensing the physical space.
2. Description of the Related Art
For example, a mixed reality (MR) system that combines and displays a physical space and virtual space requires position and orientation measurement of an image sensing unit (to be also referred to as a “camera” hereinafter) that senses an image of the physical space. Conventionally, as camera position and orientation measurement techniques which use indices (e.g., objects having specific shapes and/or colors) that are located in the physical space, the techniques disclosed in the following references are known.
D1: Kato, Billinghurst, Asano, and Tachibana: Augmented Reality System and its Calibration based on Marker Tracking, Transactions of the Virtual Reality Society of Japan vol. 4, no. 4, pp. 607-616, December 1999.
D2: X. Zhang, S. Fronz, and N. Navab: Visual marker detection and decoding in AR systems: A comparative study, Proc. of International Symposium on Mixed and Augmented Reality (ISMAR '02), pp. 97-106, 2002.
D3: Junichi Rekimoto: “Augmented Reality System using the 2D matrix code”, Interactive System & Software IV, Kindai kagaku sha, 1996.
D4: Japanese Patent Laid-Open No. 2000-82108
In references D1 to D3, a square index is located in advance at a known position in the physical space, and the index is detected from an image obtained by sensing, using a camera, the physical space including the index. Then, the position and orientation of the camera are measured (estimated) based on the coordinates in the image of the four vertices of the square index and their known absolute coordinates. However, since a square has rotational symmetry every 90° about the axis that passes through its center (the intersection of its diagonals) and is perpendicular to its plane, the directionality of each index cannot be discriminated based only on the vertex coordinates in the image. For this reason, another feature (e.g., a directional pattern) is provided in the index to discriminate its directionality.
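The rotational ambiguity described above can be illustrated with a minimal sketch. The function names, the unit-square coordinates, and the position of the directional mark are hypothetical and are not taken from references D1 to D3; the sketch only checks which 90° rotations leave a set of feature points unchanged.

```python
import math

def rotate(points, degrees):
    """Rotate 2D points about the origin, rounding away floating-point noise."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(round(c * x - s * y, 9), round(s * x + c * y, 9)) for x, y in points]

def candidate_orientations(points):
    """Return the rotations (in degrees) that map the point set onto itself."""
    original = set(points)
    return [deg for deg in (0, 90, 180, 270) if set(rotate(points, deg)) == original]

# Four vertices of a square index centered at the origin (assumed layout).
square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]

# Hypothetical directional mark placed off-center inside the index.
direction_mark = (0.5, 0.0)

print(candidate_orientations(square))                     # four candidates
print(candidate_orientations(square + [direction_mark]))  # one candidate
```

With the vertices alone, all four rotations are indistinguishable; adding a single asymmetric feature leaves only one consistent orientation.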
Furthermore, when a plurality of indices are used, they need to be identified based only on the image sensed by the camera, so graphic information such as a unique pattern or symbol, which differs from index to index, is embedded in each index. Since such a square index carries information for identification (to be referred to as identification information hereinafter), unlike an index whose only feature is the barycentric point of a specific color area on an image, square indices are unlikely to be misidentified even if many of them are located. However, since the quantity and precision of the identification information depend only on the sensed image, the index is likely to be misidentified due to quantization errors, camera noise, a shadow cast on the index, and the like.
The probability of misrecognition of the index will be described below using FIGS. 3A to 3C. FIG. 3A is a front view of an index 301. The index 301 has a planar form having a square outer shape. In the index 301, a two-dimensional (2D) barcode part 201b is located as identification information inside a black frame 201a. Therefore, the index 301 is characterized in that identification information can be obtained by recognizing the 2D barcode part 201b from the image of the index 301.
Assume that this index 301 is sensed in a state in which the angle that the visual axis of the camera makes with the direction of a normal 201c to the index 301 is close to 90°, as shown in FIG. 3B, and the 2D barcode part inside the index is recognized based on the sensed image. In this case, since the 2D barcode part 201b is distorted considerably, code recognition may fail due to factors such as quantization errors and the like. If recognition of the 2D barcode has failed, the index is misidentified.
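As a rough illustration of why quantization errors dominate at such angles, the following sketch estimates the on-image extent of one barcode cell along the foreshortened direction. It assumes a simplified orthographic model in which the extent scales with the cosine of the angle between the visual axis and the normal; the function name and the example sizes (an 80-pixel barcode part divided into 4 cells per side) are hypothetical.

```python
import math

def cell_extent_px(frontal_width_px, cells_per_side, angle_deg):
    """Approximate extent, in pixels, of one barcode cell along the tilted direction.

    Assumes an orthographic approximation: foreshortening scales the frontal
    extent by cos(angle), where angle is measured between the camera's visual
    axis and the normal to the index.
    """
    frontal_cell = frontal_width_px / cells_per_side
    return frontal_cell * math.cos(math.radians(angle_deg))

for angle in (0, 60, 80, 87, 89):
    print(angle, round(cell_extent_px(80, 4, angle), 2))
```

Under these assumptions, a cell that spans 20 pixels when viewed head-on shrinks to about one pixel or less as the angle approaches 90°, at which point adjacent cells can no longer be separated reliably.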
When unexpected color information 302, such as a shadow or the like, appears as noise on the 2D barcode part 201b of the index 301, as shown in FIG. 3C, recognition of the 2D barcode fails, resulting in misidentification of the index.
This misidentification influences the camera position and orientation measurement result in two ways. First, since the information cannot be recognized as a conventionally defined index, it is not used as information for calculating the position and orientation of the camera, resulting in a drop in measurement precision. In particular, in a situation in which only one index is sensed, the position and orientation of the camera cannot be obtained.
Second, the recognized identification information may coincide with that of another index located at another place. In this case, since the index is misrecognized (misidentified) as the one located at the other place, the position and orientation of the camera cannot be correctly obtained (the wrong position and orientation are obtained).
In references D3 and D4, in order to prevent such misidentification, the 2D barcode includes a code for error discrimination, and when the recognition result includes an error, the recognized information is not used as the index.
In addition to error detection of the identification information of the index, when an error correction code, such as a Hamming code or the like, is used as a 2D barcode of the index, an error can be detected, and the identification result can be corrected if the error is 1 bit. Error correction of a plurality of bits can be made from only image information if a 2D barcode adopts a Reed-Solomon code with an enhanced error correction function, such as a QR code standardized by JIS X 0510 (1999).
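The single-bit correction mentioned above can be sketched with a Hamming(7,4) code. This particular code size and the function names are assumptions chosen for illustration; the cited references do not specify the code parameters used in an index.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(received):
    """Return (data bits, error position); position 0 means no error was detected."""
    c = list(received)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # syndrome read as a 1-based bit position
    if position:
        c[position - 1] ^= 1         # flip the single bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], position

# Identification bits of a hypothetical index, with one cell misread (position 5).
codeword = hamming74_encode([1, 0, 1, 1])
sensed = list(codeword)
sensed[4] ^= 1
print(hamming74_decode(sensed))  # ([1, 0, 1, 1], 5)
```

The three syndrome bits locate the flipped cell directly, which is why the correction step is inexpensive compared with decoding a Reed-Solomon code.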
In the case of an index which has, as identification information, a code such as a Hamming code that uses a relatively simple error correction function, errors often cannot be detected and corrected if a predetermined number of errors or more occur upon code recognition. For example, in the case of an index which has a Hamming-coded 2D code as identification information, if errors of a plurality of bits have occurred, they cannot be corrected properly. In this case, correct identification information cannot be obtained, and the index is identified using wrong identification information.
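The failure mode for errors of a plurality of bits can be sketched as follows, assuming a Hamming(7,4) code that maps 16 hypothetical index identifiers to 7-bit codewords and a decoder that picks the nearest codeword. The code size, the identifier range, and all names are assumptions for illustration only.

```python
from itertools import combinations

def encode(ident):
    """Map an identifier 0..15 to a Hamming(7,4) codeword (p1 p2 d1 p4 d2 d3 d4)."""
    d1, d2, d3, d4 = [(ident >> k) & 1 for k in range(4)]
    return (d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d1, d2 ^ d3 ^ d4, d2, d3, d4)

CODEBOOK = {encode(i): i for i in range(16)}

def distance(a, b):
    """Number of bit positions in which two words differ."""
    return sum(x != y for x, y in zip(a, b))

def identify(received):
    """Nearest-codeword decoding: return (identifier, number of bits 'corrected')."""
    best = min(CODEBOOK, key=lambda cw: distance(cw, received))
    return CODEBOOK[best], distance(best, received)

def flip(word, positions):
    """Return a copy of the word with the given 0-based bit positions inverted."""
    return tuple(b ^ 1 if i in positions else b for i, b in enumerate(word))

true_id = 5
wrong = 0
for pair in combinations(range(7), 2):
    decoded_id, corrected = identify(flip(encode(true_id), pair))
    if decoded_id != true_id:
        wrong += 1
print(wrong, "of 21 two-bit error patterns yield a wrong identifier")
```

Because any two codewords of this code differ in at least three bits, a word with two flipped bits always lies one bit away from a different codeword. The decoder therefore reports a plausible single-bit correction while returning the identifier of another index, which corresponds to the second kind of misidentification described earlier.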
An index like a QR code which adopts a coding scheme with an enhanced error correction function can correct errors even when errors of a plurality of bits have occurred. However, compared to an index which uses a relatively simple error correction code, such as a Hamming code, the former index suffers the following problems:
recognition processing takes a long time since error correction operations are complicated;
the index must have a broad 2D barcode area since advanced coding is not possible unless the code has a large number of bits; and so forth.
In the method of calculating the position and orientation of the camera on the coordinate system having a four-corner feature which is detected from a camera image, as described in references D3 and D4, these problems bring about the following limitations:
real-time processing of the camera position and orientation calculation might be difficult; and
the size of the overall index required to calculate the camera position and orientation cannot be reduced.
Upon calculating the camera position and orientation, the smaller the index, the more indices can be located, since location flexibility increases. Furthermore, a plurality of small indices can be simultaneously sensed more easily than large ones, and improvement of the camera position and orientation measurement precision can be expected. For these reasons, smaller indices are preferably used, and an index whose identification information is a code that enhances the error correction function by increasing the code size inside the index is not suited for use in calculating the camera position and orientation.