In recent years, attention has focused on augmented reality (AR), a technology that presents additional information to the user by overlaying that information onto a real space. The information presented to the user by AR technology is also referred to as annotations and may be visualized using virtual objects in a variety of forms, such as text, icons, and animations. Annotations are normally laid out in an AR space based on recognition of the three-dimensional structure of the real space appearing in an image (hereinafter referred to as “environment recognition”). Known methods of environment recognition include, for example, SLAM (Simultaneous Localization And Mapping) and SfM (Structure from Motion). The fundamental principles of SLAM are described in NPL 1 indicated below. According to SLAM, a set of feature points that is dynamically updated in keeping with changes in the input images is used to simultaneously recognize the positions of the feature points and the position and posture of the camera in the environment. With SfM, parallax is calculated from the positions of feature points appearing in a plurality of images picked up while the viewpoint changes, and the environment is recognized based on the calculated parallax. PTL 1 discloses a method in which the three-dimensional position of a feature point selected during initialization of SLAM is recognized using SfM. PTL 2 discloses an example of an AR application that may be realized by applying SLAM.
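
As a rough illustration of the parallax principle mentioned above for SfM, the following minimal sketch recovers the three-dimensional position of a single feature point from two images. It is not the method of NPL 1 or PTL 1. It assumes the simplest possible case: an ideal pinhole camera with no lens distortion, moved purely sideways between the two images with no rotation, and a known distance between the two viewpoints. The function name `triangulate_from_parallax` and all parameter names are hypothetical and chosen only for this example.

```python
def triangulate_from_parallax(u_first, v_first, u_second,
                              focal_px, baseline_m, cx=0.0, cy=0.0):
    """Estimate the 3D position of one feature point from its parallax.

    Assumptions (illustrative only, not taken from the cited documents):
      - ideal pinhole camera, no lens distortion
      - the camera moves by baseline_m along its x axis between the two
        images, without rotating
      - (u_first, v_first) is the pixel position of the feature point in
        the first image, u_second its horizontal position in the second
      - focal_px is the focal length in pixels, (cx, cy) the principal point

    Returns (x, y, z) in metres, in the coordinate frame of the first view.
    """
    # Parallax: horizontal shift of the feature point between the two images.
    parallax = u_first - u_second
    if parallax <= 0:
        # Zero parallax means the point is at infinity; negative parallax
        # contradicts the assumed direction of camera motion.
        raise ValueError("parallax must be positive under these assumptions")

    # Similar triangles: depth is inversely proportional to parallax.
    z = focal_px * baseline_m / parallax

    # Back-project the pixel of the first image to the recovered depth.
    x = (u_first - cx) * z / focal_px
    y = (v_first - cy) * z / focal_px
    return (x, y, z)
```

For example, with a focal length of 500 pixels, a viewpoint shift of 0.1 m, and a feature point that moves from u = 420 to u = 400 (a parallax of 20 pixels), the call `triangulate_from_parallax(420.0, 240.0, 400.0, 500.0, 0.1, cx=320.0, cy=240.0)` places the point at a depth of about 2.5 m. A general SfM system must additionally estimate the camera motion itself and handle rotation, which this sketch deliberately leaves out.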