1. Field of the Invention
The present invention generally relates to an information processing apparatus and an information processing method thereof, particularly to a technique for measuring the position and orientation of an image capturing apparatus.
2. Description of the Related Art
Mixed reality, in which text and CG pictures are superimposed on a physical space and the result is presented to the user, has been studied extensively. An image display apparatus that presents mixed reality can be implemented as an apparatus which renders an image generated according to the position and orientation of an image capturing apparatus, superimposes it on an image captured by that image capturing apparatus, and displays the result.
In order to implement such a technique, the relative position and orientation between a reference coordinate system defined on the physical space and a camera coordinate system need to be measured in real time. For example, consider a case wherein a virtual object is superimposed at a predetermined position in a physical environment, such as in a room or on a table. In this case, a reference coordinate system is defined at an appropriate place in that environment (e.g., the floor surface of the room or the table surface), and the position and orientation of a camera on the reference coordinate system are measured.
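As an illustrative sketch only, the relative position and orientation between the two coordinate systems can be written as a rigid transformation. The symbols below are assumptions introduced for illustration and do not appear in the original text: R is a 3x3 rotation matrix (the orientation), t is a translation vector, x_w is a point on the reference coordinate system, and x_c is the same point on the camera coordinate system.

```latex
\mathbf{x}_{c} = \mathbf{R}\,\mathbf{x}_{w} + \mathbf{t},
\qquad \mathbf{R} \in SO(3),\quad \mathbf{t} \in \mathbb{R}^{3}
% Under this convention, the camera position on the reference
% coordinate system is the point that maps to the camera origin:
\mathbf{c}_{w} = -\mathbf{R}^{\top}\mathbf{t}
```

Under this convention, measuring the position and orientation of the camera amounts to estimating R and t for each captured frame.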
In order to realize such measurement, it is a common practice to sequentially measure the position and orientation of a camera using time series images successively captured by the camera (see non-patent references 1, 2, and 3). For example, the position and orientation of the camera on the reference coordinate system can be calculated by the following sequence.
(1) A plurality of indices, whose positions (reference coordinates) on the reference coordinate system are given, are allocated or set on the floor, wall, table surface, or the like in a room.
(2) The image coordinates of the indices in an image captured by the camera are detected.
(3) The position and orientation of the camera are calculated based on the correspondence between the detected image coordinates of the indices and their reference coordinates.
Note that the indices may be artificial markers which are intentionally set for the purpose of measurement, or may be features (natural features) or the like which originally exist in that environment.
In order to calculate the position and orientation of the camera by the above sequence, the allocation information of each index needs to be acquired in advance as a preparation process. Note that the allocation information of each index represents the position of that index on the reference coordinate system. A process for calculating the allocation information of an index will be referred to as calibration of that index as needed hereinafter.
Normally, the calibration of an index is performed manually using a surveying instrument, protractor, or the like. Also, a method of estimating the allocation information of an index based on a plurality of images captured in advance by the camera is used (see non-patent reference 4). In addition, a method of calibrating an index while sequentially executing the measurement of the position and orientation of the camera has been proposed (see non-patent reference 5).
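Steps (1) to (3) above can be illustrated with a minimal sketch. It does not reproduce the methods of the cited references. It only shows the quantity that a pose calculation in step (3) typically drives toward zero, namely the reprojection error between the detected image coordinates and the image coordinates predicted from the reference coordinates. The pinhole camera model, the focal length, the image center, the index positions, and all function names are assumptions made for illustration.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix about the z axis (row-major nested lists)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def to_camera(R, t, p_ref):
    """Map a point from reference coordinates to camera coordinates: R * p + t."""
    return [sum(R[i][j] * p_ref[j] for j in range(3)) + t[i] for i in range(3)]

def project(R, t, p_ref, focal=800.0, center=(320.0, 240.0)):
    """Pinhole projection of a reference-frame point to pixel coordinates."""
    x, y, z = to_camera(R, t, p_ref)
    if z <= 0.0:
        raise ValueError("point is behind the camera")
    return (focal * x / z + center[0], focal * y / z + center[1])

def rms_reprojection_error(R, t, ref_points, image_points):
    """RMS pixel distance between detected and predicted index positions."""
    total = 0.0
    for p_ref, (u, v) in zip(ref_points, image_points):
        pu, pv = project(R, t, p_ref)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(total / len(ref_points))

# Step (1): hypothetical reference coordinates of four indices on a table (metres).
ref_points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.5, 0.5, 0.0), (0.0, 0.5, 0.0)]

# Step (2): stand-in for detection. The image coordinates are synthesised
# from a known pose instead of being detected in a real captured image.
true_R, true_t = rotation_z(0.1), [-0.2, -0.2, 2.0]
image_points = [project(true_R, true_t, p) for p in ref_points]

# Step (3): a pose solver would search for the pose with the smallest error.
# Here two candidate poses are simply compared.
error_true = rms_reprojection_error(true_R, true_t, ref_points, image_points)
error_wrong = rms_reprojection_error(rotation_z(0.0), true_t, ref_points, image_points)
print(error_true, error_wrong)
```

The correct pose reproduces the detected image coordinates, whereas the candidate with the wrong orientation leaves a clearly non-zero residual. The same residual grows if the reference coordinates of an index no longer match its actual position.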
An apparatus disclosed in non-patent reference 5 has an auto mapping mode for measuring the position and orientation of the camera while calibrating an unknown index, and a tracking mode for measuring only the position and orientation of the camera while the allocation information of each index is given. The user selectively uses these modes as needed.
[Non-patent Reference 1] Sato, Uchiyama, and Yamamoto: “UG+B: A Registration Framework Using User's View, Gyroscope, and Bird's-Eye View”, Transactions of the Virtual Reality Society of Japan, vol. 10, no. 3, pp. 391-400, 2005.
[Non-patent Reference 2] Sato, Uchiyama, and Tamura: “Registration method in augmented reality”, Transactions of the Virtual Reality Society of Japan, vol. 8, no. 2, pp. 171-180, 2003.
[Non-patent Reference 3] M. A. Fischler and R. C. Bolles: “Random Sample Consensus: A paradigm for model fitting with applications to image analysis and automated cartography”, Comm. of the ACM, vol. 24, no. 6, pp. 381-395, June 1981.
[Non-patent Reference 4] Kotake, Uchiyama, and Yamamoto: “A Marker Calibration Method Utilizing A Priori Knowledge on Marker Arrangement”, Transactions of the Virtual Reality Society of Japan, vol. 10, no. 3, pp. 401-410, 2005.
[Non-patent Reference 5] Leonid Naimark and Eric Foxlin: “Circular data matrix fiducial system and robust image processing for a wearable vision-inertial self-tracker”, Proc. 1st International Symposium on Mixed and Augmented Reality (ISMAR 2002), pp. 27-36, 2002.
[Non-patent Reference 6] R. Hartley and A. Zisserman: “Multiple view geometry in computer vision”, Second Edition, Cambridge University Press.
When the allocation information of an index is measured in advance, the index must not have moved between the time of its calibration and the current time. For example, if only markers adhered to a stationary structural object such as a wall, floor, or ceiling are used, this condition is basically satisfied.
However, when indices whose allocations are likely to change are included and the allocation information measured in advance is used as is, the position and orientation of the camera cannot be accurately measured once the allocation of such an index has changed. For example, when markers are allocated on a movable object such as the doors of a wall cabinet or a desk with wheels, the user needs to visually confirm that the situation at the time of calibration of the index has not changed at the current time.
On the other hand, when the method of sequentially updating the allocation information of each index simultaneously with the position and orientation measurement of the camera (disclosed in non-patent reference 5) is used, a change in the allocation information of each index can be coped with in principle. However, since the allocation information of every index within the visual field that is not handled as a fixed index is always kept updated, the computation load is heavy. Therefore, the apparatus disclosed in non-patent reference 5 uses the auto mapping mode only when the user decides that the mode is necessary. As a result, the user needs to decide whether or not re-calibration of the indices is necessary.
That is, the related arts provide no means of knowing whether or not re-calibration of indices should be executed. For this reason, the user needs to decide whether or not to execute re-calibration based on his or her own empirical knowledge. Moreover, although only some of the indices are moved in most actual scenes, no means is provided for calibrating only those indices that require re-calibration.