In recent years, systems that display images by using augmented reality (AR) technology have become popular (see, for example, Patent Document 1). In one example of AR technology, an object is photographed by using a camera mounted on a personal computer (PC), a portable terminal device, or the like, and the position and posture of the camera in a three-dimensional space are estimated from an image of the object. Content information is then superimposed and displayed at an arbitrary position within the image by using the estimated position and posture of the camera as a reference.
FIG. 1 illustrates an example of a method for obtaining the position and posture of a camera by using feature points included in a captured image. In this method, a three-dimensional map 201 indicating a set of three-dimensional coordinates of map points 211 on an object is generated in advance.
When an image 202 is captured, the map points 211 are projected onto the image 202 by using a transformation matrix M for transforming a three-dimensional coordinate system 203 into a camera coordinate system 204 such that projection points 212 are obtained. The position and posture of the camera in the three-dimensional coordinate system 203 are estimated by associating the projection points 212 with feature points 213 detected from the image 202. As an example, the position of the camera is indicated by a relative position of the camera coordinate system 204 with respect to the three-dimensional coordinate system 203, and the posture of the camera is indicated by a relative angle of the camera coordinate system 204 with respect to the three-dimensional coordinate system 203.
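The projection step described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the source does not specify the form of the transformation matrix M or the camera model, so the code assumes M is a 3x4 matrix [R | t] and assumes a simple pinhole camera with hypothetical intrinsic parameters f, cx, and cy. The function name project_map_points is likewise hypothetical.

```python
def project_map_points(map_points, M, f, cx, cy):
    """Project 3D map points onto the image plane.

    M is assumed to be a 3x4 matrix [R | t] (a list of three rows) that maps
    coordinates in the three-dimensional coordinate system to the camera
    coordinate system. f, cx, cy are assumed pinhole intrinsics (focal length
    and principal point, in pixels); they do not appear in the source.
    """
    projections = []
    for (x, y, z) in map_points:
        # Transform the map point into the camera coordinate system.
        xc, yc, zc = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in M)
        if zc <= 0:
            raise ValueError("map point is not in front of the camera")
        # Perspective division, then shift by the principal point.
        projections.append((f * xc / zc + cx, f * yc / zc + cy))
    return projections


# Usage: with an identity pose, a point on the optical axis lands on the
# principal point.
M_identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
points = project_map_points(
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)], M_identity, f=100.0, cx=50.0, cy=50.0
)
```

In an actual system the intrinsic parameters would come from camera calibration, and lens distortion would typically also be modeled; both are omitted here for brevity.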
Three-dimensional coordinate S_p of a map point p, two-dimensional coordinate x_p' of a projection point that corresponds to the map point p, and two-dimensional coordinate x_p of a feature point that corresponds to the map point p are respectively expressed according to the expressions below.

S_p = (x, y, z) \qquad (1)

x_p' = (u', v') \qquad (2)

x_p = (u, v) \qquad (3)
In this case, the sum of squares E of the distances between the projection points and the corresponding feature points on the image is expressed by the expression below.
E = \sum_{p} \left| x_p' - x_p \right|^2 \qquad (4)
The position and posture of the camera are determined by obtaining a transformation matrix M that minimizes the sum of squares E in expression (4).
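As a rough illustration of expression (4), the sketch below evaluates E for several candidate transformation matrices and keeps the one with the smallest value. It is not the estimation method of the source, which does not say how the minimization is carried out; practical systems typically use an iterative nonlinear least-squares solver rather than the exhaustive candidate search assumed here. The pinhole intrinsics (f, cx, cy), the 3x4 form of M, and all function names are hypothetical.

```python
def project(point, M, f=100.0, cx=50.0, cy=50.0):
    """Project one map point with an assumed 3x4 matrix M and pinhole model."""
    x, y, z = point
    xc, yc, zc = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in M)
    return (f * xc / zc + cx, f * yc / zc + cy)


def sum_of_squares(map_points, feature_points, M):
    """Evaluate E in expression (4) for a candidate matrix M."""
    E = 0.0
    for S_p, (u, v) in zip(map_points, feature_points):
        u_proj, v_proj = project(S_p, M)
        # Squared image distance between projection point and feature point.
        E += (u_proj - u) ** 2 + (v_proj - v) ** 2
    return E


def translation(tx):
    """Hypothetical helper: identity rotation with a shift tx along x."""
    return [[1, 0, 0, tx], [0, 1, 0, 0], [0, 0, 1, 0]]


# Synthetic data: feature points are generated from a known pose, so the
# search should recover that pose with E equal to zero.
map_points = [(0.0, 0.0, 2.0), (1.0, 0.0, 4.0), (0.0, 1.0, 5.0)]
true_M = translation(0.5)
feature_points = [project(p, true_M) for p in map_points]

candidates = [translation(tx) for tx in (0.0, 0.25, 0.5, 0.75)]
best_M = min(candidates, key=lambda M: sum_of_squares(map_points, feature_points, M))
```

The candidate search only varies one translation component to keep the example short; a full pose has six degrees of freedom (three for position, three for posture).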
FIG. 2 illustrates an example of a method for generating the three-dimensional map 201. In this generation method, stereoscopic photographing and stereoscopic measurement are used. An image 301 and an image 302 that are respectively captured from a photographing position 311 and a photographing position 312 are used as key frames, and a feature point 313 in the image 301 and a feature point 314 in the image 302 are associated with each other such that a map point 315 in a three-dimensional space is restored. A plurality of map points are restored by associating a plurality of feature points in an image with a plurality of feature points in another image, and a three-dimensional map 201 indicating a set of the plurality of map points is generated.
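The restoration of a map point from a pair of associated feature points can be sketched as below. The source states only that stereoscopic measurement is used; the midpoint method shown here is one common way to triangulate and is an assumption, not the method of the source. The sketch also assumes that each feature point has already been converted into a viewing ray, given by a camera center and a direction in the three-dimensional coordinate system, which in turn requires known photographing positions and camera parameters. The function name triangulate_midpoint is hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Restore a 3D point from two viewing rays (midpoint method).

    c1, c2: camera centers of the two key frames.
    d1, d2: directions of the rays through the associated feature points.
    Returns the midpoint of the shortest segment connecting the two rays.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be restored")
    # Ray parameters of the mutually closest points on the two rays.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return tuple((x + y) / 2 for x, y in zip(p1, p2))


# Usage: two rays that both pass through (1, 2, 4).
map_point = triangulate_midpoint(
    (0.0, 0.0, 0.0), (1.0, 2.0, 4.0),
    (1.0, 0.0, 0.0), (0.0, 2.0, 4.0),
)
```

With noisy feature points the two rays generally do not intersect, which is why the midpoint of the closest approach is returned rather than an exact intersection. Repeating this for every associated pair of feature points yields the set of map points that forms the three-dimensional map.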
Technologies, such as a polyhedron representation for computer vision, visual tracking of structures, and machine perception of three-dimensional solids, are also known (see, for example, Non-Patent Document 1 to Non-Patent Document 3).

Patent Document 1: Japanese Laid-open Patent Publication No. 2015-118641

Non-Patent Document 1: Bruce G. Baumgart, “A polyhedron representation for computer vision”, Proceedings of the May 19-22, 1975, national computer conference and exposition, pp. 589-596, 1975

Non-Patent Document 2: Tom Drummond and Roberto Cipolla, “Real-Time Visual Tracking of Complex Structures”, IEEE Trans. Pattern Analysis and Machine Intelligence, pp. 932-946, 2002

Non-Patent Document 3: L. G. Robert, “Machine perception of three-dimensional solids”, MIT Lincoln Lab. Rep., TR3315, pp. 1-82, May 1963