In recent years, as digital cameras and camera-equipped cell phones have become increasingly widespread and sophisticated, users can easily create and utilize image data. Accordingly, research using image data has been actively pursued. One such line of research concerns identifying three-dimensional objects included in images.
One method of identifying a three-dimensional object by computer processing uses the geometric shape of the object (e.g., see Non-Patent Literatures 1 and 2). Another method uses a two-dimensional image obtained by capturing the object (e.g., see Non-Patent Literatures 3 and 4). Of these methods, the present invention focuses on the latter, i.e., the method of identifying a three-dimensional object by utilizing a two-dimensional image.
One form of this method uses local features. A local feature describes a local region in an image as a vector. In general, a few hundred to a few thousand local features are acquired from various regions in one image. Therefore, even when only a part of an object is shown in a query image, or even when a part of the object is occluded, the corresponding three-dimensional object can be identified from the part shown in the query image. The method using local features is also robust to differences in shooting conditions and to transformations such as similarity transformation and rotation.
The simplest method of identifying a three-dimensional object by using local features is to shoot various objects beforehand and store the local features extracted from these images in a database. An object is then identified by comparing each local feature extracted from the query image with the local features stored in the database. To realize highly precise recognition, the three-dimensional object has to be identifiable from a query image shot from any angle. Accordingly, it is desirable to shoot the object from various angles, extract local features from these images, and store them in the database. However, storing all of these local features requires a large amount of memory, which is a problem.
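The matching step described above, and the memory problem it leads to, can be illustrated with a minimal sketch. It assumes one particular realization that the text does not specify: each query feature votes for the object that owns its nearest database feature (brute-force Euclidean search), and the object with the most votes is returned. The names `identify_object` and `estimate_database_bytes`, the two-dimensional toy vectors, and all numeric values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist


def identify_object(query_features, database):
    """Identify an object by nearest-neighbour voting (illustrative assumption).

    database: list of (object_id, feature_vector) pairs.
    Returns (object_id, votes), or None if there are no query features.
    """
    votes = Counter()
    for q in query_features:
        # Brute-force search for the stored feature closest to q.
        nearest_id, _ = min(database, key=lambda entry: dist(q, entry[1]))
        votes[nearest_id] += 1
    return votes.most_common(1)[0] if votes else None


def estimate_database_bytes(num_objects, views_per_object,
                            features_per_view, dims, bytes_per_dim):
    """Raw storage needed if every local feature is kept."""
    return (num_objects * views_per_object * features_per_view
            * dims * bytes_per_dim)


# Toy database: real local features have far more dimensions than two.
database = [
    ("cup", (0.0, 0.0)), ("cup", (0.0, 1.0)), ("cup", (1.0, 0.0)),
    ("box", (5.0, 5.0)), ("box", (5.0, 6.0)), ("box", (6.0, 5.0)),
]

# Only two features of the "cup" are visible, e.g. due to occlusion.
query = [(0.1, 0.9), (0.9, 0.1)]
result = identify_object(query, database)

# Hypothetical sizes: 100 objects, 50 views each, 1000 features per view,
# 128 dimensions of 1 byte each.
total_bytes = estimate_database_bytes(100, 50, 1000, 128, 1)
```

With these assumed values, the partially visible object is still identified from two matching features, while the raw feature storage reaches 640,000,000 bytes, which shows how quickly memory grows when every view of every object is kept.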
Various methods have been proposed so far for solving this problem. One of them selects only the necessary local features, thereby reducing the number of local features to be stored in the database and hence the memory required (for example, see Non-Patent Literature 5).