In recent years, systems that display images according to an Augmented Reality (AR) technique have become widespread (see, for example, patent document 1 and non-patent document 1). In an exemplary AR technique, an image of an object is captured using a camera installed in, for example, a Personal Computer (PC) or a portable terminal apparatus, and the position and posture of the camera within a three-dimensional space are estimated from the image of the object. Content information is superimposition-displayed at an arbitrary position within the image with reference to the determined camera position and posture.
FIG. 1 illustrates an exemplary AR technique for supporting a user inspection. A user 101 shoots an image of an inspection target 103 and a marker 104 using a camera installed in a portable terminal apparatus 102 such as a tablet.
The portable terminal apparatus 102 displays the captured image on a screen 105, and reads identification information indicated by the marker 104 so as to estimate a camera position and posture within a three-dimensional space. With reference to the estimated camera position and posture, the portable terminal apparatus 102 superimposition-displays content information 106 that corresponds to the marker 104 using, for example, Computer Graphics (CG). Content information 106 indicates procedures for the inspection of the inspection target 103, and the user 101 can perform the inspection efficiently by referring to content information 106.
Another method has also been proposed for determining a camera position and posture using feature points included in a captured image, without providing a marker on the subject (see, for example, patent document 2). For example, a feature point may be detected from an image on the basis that the position of a point of interest within the image can be uniquely defined when the grayscale fluctuation at positions close to that point is large.
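The grayscale-fluctuation criterion can be illustrated with a minimal sketch. The cited documents do not name a specific detector, so the code below assumes one common choice: the smaller eigenvalue of the gradient structure tensor over a 3x3 window. The names corner_score and detect_feature_points, the window size, and the threshold are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Image = List[List[float]]


def corner_score(img: Image, r: int, c: int) -> float:
    """Smaller eigenvalue of the 2x2 gradient structure tensor around (r, c).

    A large value means the grayscale level fluctuates strongly in every
    direction near the point, so its position is well defined.
    """
    sxx = sxy = syy = 0.0
    for i in range(r - 1, r + 2):
        for j in range(c - 1, c + 2):
            gx = (img[i][j + 1] - img[i][j - 1]) / 2.0  # horizontal change
            gy = (img[i + 1][j] - img[i - 1][j]) / 2.0  # vertical change
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    half_trace = (sxx + syy) / 2.0
    root = ((sxx - syy) ** 2 / 4.0 + sxy * sxy) ** 0.5
    return half_trace - root


def detect_feature_points(img: Image, threshold: float) -> List[Tuple[int, int]]:
    """Return (row, column) positions whose score exceeds the threshold."""
    rows, cols = len(img), len(img[0])
    # A 2-pixel border is skipped so that every gradient stays inside the image.
    return [(r, c)
            for r in range(2, rows - 2)
            for c in range(2, cols - 2)
            if corner_score(img, r, c) > threshold]


# Synthetic 10x10 image: a bright square occupying the lower-right quadrant.
image = [[1.0 if (i >= 5 and j >= 5) else 0.0 for j in range(10)]
         for i in range(10)]
feature_points = detect_feature_points(image, threshold=0.6)
```

In this synthetic image only the corner of the bright square exceeds the threshold. Points on a straight edge score zero because the grayscale level changes in one direction only, so their position along the edge is ambiguous.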
FIG. 2 illustrates an exemplary method of determining a camera position and posture using feature points. In this method, a three-dimensional map 201 is generated in advance that represents a set of three-dimensional coordinates of map points 211 on an object.
When an image 202 has been captured, the map points 211 are projected onto the image 202 using a transformation matrix M that transforms the three-dimensional coordinate system 203 into the camera coordinate system 204, thereby determining projection points 212. The camera position and posture within the three-dimensional coordinate system 203 are estimated by correlating the projection points 212 with the feature points 213 detected from the image 202. For example, the camera position may be represented by the position of the camera coordinate system 204 relative to the three-dimensional coordinate system 203, and the camera posture may be represented by the angle of the camera coordinate system 204 relative to the three-dimensional coordinate system 203.
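The projection step can be sketched as follows. This is an illustration only: it assumes that M is a 3x4 matrix [R | t] and that the camera follows a pinhole model with a known focal length, neither of which is stated in the description above. The function name project_points and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def project_points(M: Sequence[Sequence[float]], map_points: List[Vec3],
                   focal: float = 1.0) -> List[Vec2]:
    """Project 3-D map points to 2-D projection points.

    M is assumed to be a 3x4 matrix [R | t] that maps coordinates in the
    three-dimensional coordinate system into the camera coordinate system.
    The pinhole model and the focal length are assumptions of this sketch.
    """
    projected = []
    for x, y, z in map_points:
        # Camera coordinates: Xc = R * Xw + t, one matrix row per component.
        xc, yc, zc = (row[0] * x + row[1] * y + row[2] * z + row[3]
                      for row in M)
        if zc <= 0:
            raise ValueError("map point lies behind the camera")
        # Perspective division yields the projection point (u', v').
        projected.append((focal * xc / zc, focal * yc / zc))
    return projected


# Identity rotation; the translation places the world origin 2 units ahead
# of the camera along its optical axis.
M = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
]
map_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 2.0)]
projection_points = project_points(M, map_points)
```

With these sample values the three map points project to (0, 0), (0.5, 0), and (0, 0.25), respectively.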
The following formulae may express $S_p$, the three-dimensional coordinates of a map point p; $x_p'$, the two-dimensional coordinates of a projection point that corresponds to the map point p; and $x_p$, the two-dimensional coordinates of a feature point that corresponds to the map point p.

$$S_p = (x, y, z) \qquad (1)$$

$$x_p' = (u', v') \qquad (2)$$

$$x_p = (u, v) \qquad (3)$$
In this case, the following formula expresses E, the sum of squares of the distance between a projection point and a feature point on the image.
$$E = \sum_{p} \left| x_p' - x_p \right|^2 \qquad (4)$$
The camera position and posture are determined by determining a transformation matrix M that minimizes the sum of squares E in formula (4).
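The minimization can be illustrated with a toy example. The sketch below evaluates the sum of squares E of formula (4) for a set of candidate poses and keeps the candidate with the smallest E. To stay short, it assumes a camera with identity rotation, unit focal length, and a single unknown parameter tx (translation along the x axis). A practical system would more likely optimize all six pose parameters with an iterative nonlinear least-squares method. All names and values are hypothetical.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def sum_squared_error(projections: List[Vec2], features: List[Vec2]) -> float:
    """Formula (4): E = sum over p of |x_p' - x_p|^2."""
    return sum((up - u) ** 2 + (vp - v) ** 2
               for (up, vp), (u, v) in zip(projections, features))


def project(map_points: List[Vec3], tx: float) -> List[Vec2]:
    """Toy camera: identity rotation, translation (tx, 0, 2), unit focal length."""
    return [((x + tx) / (z + 2.0), y / (z + 2.0)) for x, y, z in map_points]


map_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 2.0)]

# Hypothetical observations: feature points generated with a "true" tx of 0.5.
features = project(map_points, 0.5)

# Coarse search over candidate values of tx from -1.0 to 1.0 in steps of 0.1.
candidates = [i / 10.0 for i in range(-10, 11)]
best_tx = min(
    candidates,
    key=lambda tx: sum_squared_error(project(map_points, tx), features),
)
best_error = sum_squared_error(project(map_points, best_tx), features)
```

Because the observations here are noise-free and the true value lies on the search grid, the search recovers tx = 0.5 with E = 0. With real feature points E would not reach zero, and the minimizing pose would be taken as the estimate.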
FIG. 3 illustrates an exemplary generation method for a three-dimensional map 201. This generation method relies on stereographic photographing and stereographic analysis. Images 301 and 302 that have been respectively shot at image-shooting positions 311 and 312 are used as key frames, and a feature point 313 within the image 301 and a feature point 314 within the image 302 are correlated to each other so as to recover a map point 315 within a three-dimensional space. Correlating a plurality of feature points within one of the two images to those within the other image recovers a plurality of map points, and a three-dimensional map 201 is generated that represents a set of the map points.
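Recovering a map point from a pair of correlated feature points can be sketched as follows. The description above does not specify a triangulation method, so this example assumes the midpoint method: each feature point, together with the known camera pose at its image-shooting position, defines a viewing ray, and the map point is taken as the midpoint of the shortest segment joining the two rays. The function names and sample values are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangulate_midpoint(c1: Vec3, d1: Vec3, c2: Vec3, d2: Vec3) -> Vec3:
    """Recover a 3-D point from two viewing rays c1 + s*d1 and c2 + t*d2.

    c1 and c2 are the camera centers at the two image-shooting positions;
    d1 and d2 are the ray directions through the correlated feature points.
    Deriving the rays from pixel coordinates is outside this sketch.
    """
    w = (c1[0] - c2[0], c1[1] - c2[1], c1[2] - c2[2])
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be recovered")
    # Parameters of the closest points on each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(c1[i] + s * d1[i] for i in range(3))
    p2 = tuple(c2[i] + t * d2[i] for i in range(3))
    # Midpoint of the shortest segment between the two rays.
    return tuple((p1[i] + p2[i]) / 2.0 for i in range(3))


# Two cameras 2 units apart, both observing a point at (0, 0, 4).
map_point = triangulate_midpoint(
    (-1.0, 0.0, 0.0), (1.0, 0.0, 4.0),
    (1.0, 0.0, 0.0), (-1.0, 0.0, 4.0),
)
```

Repeating this for every correlated pair of feature points would yield the set of map points that forms the three-dimensional map. The parallel-ray check reflects the fact that the two image-shooting positions need to differ for depth to be recoverable.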
As an application of the AR technique, a technique is known for superimposition-displaying, on an image, Computer-Aided Design (CAD) data representing a three-dimensional shape of an object (see, for example, patent documents 3 and 4). CAD data may be superimposition-displayed in an image of a product produced according to the CAD data in a manner such that the CAD data overlaps the product, thereby allowing the formation accuracy of the product or defects in the product to be checked. In this case, a three-dimensional map does not need to be generated because the CAD data includes information on the three-dimensional map.
Polyhedron representation for computer vision is also known (see, for example, non-patent document 2).
Patent document 1: Japanese Laid-open Patent Publication No. 2015-118641
Patent document 2: International Publication Pamphlet No. WO2014/179349
Patent document 3: International Publication Pamphlet No. WO2012/173141
Patent document 4: Japanese Laid-open Patent Publication No. 2008-286756
Non-patent document 1: Ishii et al., “Proposal and Evaluation of Decommissioning Support Method of Nuclear Power Plants using Augmented Reality”, Transactions of the Virtual Reality Society of Japan, 13(2), pp. 289-300, June 2008
Non-patent document 2: Bruce G. Baumgart, “A polyhedron representation for computer vision”, Proceedings of the May 19-22, 1975, national computer conference and exposition, pp. 589-596, 1975