1. Field of the Invention
The present invention relates to an information processing apparatus, a control method therefor, and a program, and in particular, relates to a technique for extracting a local feature from an image.
2. Description of the Related Art
A known configuration for retrieving a similar image by focusing attention on an object in an image performs retrieval using a local feature amount, which is obtained by converting a local feature of the image into a numerical value (Japanese Patent Laid-Open No. 2006-65399). With this configuration, firstly, various types of filters (such as Gaussian, Sobel, and Prewitt) are applied to a two-dimensional luminance distribution of an image so as to extract a feature point in the image. Next, a feature amount (local feature amount) regarding the feature point is calculated, using the pixel value of the feature point and the pixel values of its neighboring pixels. Image retrieval is performed by matching the local feature amounts of a query image against those of an image targeted for retrieval. Through such processing, image retrieval is realized with stable precision, even if the image has been rotated or reduced, or is partially cropped or partially hidden.
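The three steps described above (filtering, feature point extraction, and local feature amount calculation followed by matching) can be sketched as follows. This is a minimal illustration only, and is not the method of Japanese Patent Laid-Open No. 2006-65399: the function names, the use of a thresholded Sobel gradient as the feature point criterion, and the 3x3 neighborhood descriptor are all assumptions chosen for brevity.

```python
# Hypothetical illustration; names and criteria are assumptions, not the cited method.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_filter(image, kernel, y, x):
    """Correlate a 3x3 kernel with the luminance values around (y, x)."""
    return sum(kernel[j][i] * image[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def extract_feature_points(image, threshold):
    """Return interior pixels whose squared Sobel gradient magnitude exceeds threshold.

    The filters are evaluated at every interior pixel of the image,
    whether or not a feature is likely to be found there.
    """
    height, width = len(image), len(image[0])
    points = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = apply_filter(image, SOBEL_X, y, x)
            gy = apply_filter(image, SOBEL_Y, y, x)
            if gx * gx + gy * gy > threshold:
                points.append((y, x))
    return points


def local_feature_amount(image, point):
    """Assumed descriptor: the 3x3 neighborhood relative to the center luminance."""
    y, x = point
    center = image[y][x]
    return [image[y + j][x + i] - center for j in (-1, 0, 1) for i in (-1, 0, 1)]


def match_distance(amount_a, amount_b):
    """Squared Euclidean distance between two local feature amounts."""
    return sum((a - b) ** 2 for a, b in zip(amount_a, amount_b))


if __name__ == "__main__":
    # A 5x5 luminance grid with a vertical edge between columns 2 and 3.
    query = [[0, 0, 0, 10, 10] for _ in range(5)]
    for p in extract_feature_points(query, threshold=1000):
        print(p, local_feature_amount(query, p))
```

Note that the sketch visits every interior pixel with both filters, which is the uniform whole-image processing that this kind of configuration entails.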
A technique that uses a background subtraction method to select the region in which a feature point and a local feature amount are calculated is also known (Shunichirou Furuhata, Itaru Kitahara, Yoshinari Kameda, and Yuichi Ohta, “SIFT Feature Extraction in Selected Regions,” Proceedings of the 70th National Convention of Information Processing Society of Japan (2008); hereinafter referred to as “Furuhata et al.”). This technique assumes that a fixed camera is used to capture a physical object, and specifies a foreground region from the differences between a captured image and a pre-captured background image. The technique reduces calculation cost by using the specified foreground region as a mask region and calculating a feature point and a local feature amount only within the mask region.
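The mask-based approach described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of Furuhata et al.: the per-pixel absolute difference threshold, the function names, and the `local_contrast` score standing in for a real feature detector are all hypothetical.

```python
# Hypothetical illustration; thresholds, names, and the score are assumptions.

def foreground_mask(captured, background, diff_threshold):
    """Mark pixels whose luminance differs from the pre-captured background."""
    return [[abs(c - b) > diff_threshold for c, b in zip(c_row, b_row)]
            for c_row, b_row in zip(captured, background)]


def local_contrast(image, y, x):
    """Assumed feature score: luminance range within the 3x3 neighborhood."""
    values = [image[j][i]
              for j in range(max(0, y - 1), min(len(image), y + 2))
              for i in range(max(0, x - 1), min(len(image[0]), x + 2))]
    return max(values) - min(values)


def feature_points_in_mask(image, mask, score_threshold):
    """Evaluate the score only at masked pixels.

    Returns the detected points and the number of score evaluations,
    so the saving relative to whole-image processing is visible.
    """
    points, evaluations = [], 0
    for y, row in enumerate(mask):
        for x, is_foreground in enumerate(row):
            if not is_foreground:
                continue  # background pixel: skipped entirely
            evaluations += 1
            if local_contrast(image, y, x) > score_threshold:
                points.append((y, x))
    return points, evaluations


if __name__ == "__main__":
    background = [[10] * 4 for _ in range(4)]
    captured = [row[:] for row in background]
    captured[1][1] = captured[1][2] = 50  # a small foreground object
    mask = foreground_mask(captured, background, diff_threshold=20)
    print(feature_points_in_mask(captured, mask, score_threshold=15))
```

In this sketch the mask is built in one pass over the whole image and the feature score is computed in a second, separate pass, so the two stages share no work.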
With the configuration described in Japanese Patent Laid-Open No. 2006-65399, a local feature candidate is extracted from the entire image, irrespective of the presence or absence of an object in the image or the object type. In other words, processing that entails high calculation cost, such as the convolution processing performed when applying filters, is performed uniformly even on a region from which a local feature is unlikely to be extracted. Such wasted processing may cause a decrease in processing speed.
Meanwhile, although the technique of Furuhata et al. is capable of reducing the calculation cost, it requires a background image to be prepared in advance. For this reason, the technique cannot be applied to general image retrieval, where no background image is prepared. In addition, the processing for setting a mask region and the processing for calculating a feature point and a local feature amount are performed independently of each other. Thus, there is a risk that the calculation cost increases rather than decreases, for example in a case where the foreground region covers a wide range of the image.