The present invention relates to a local feature descriptor extracting apparatus, a local feature descriptor extracting method, and a program.
To enable robust identification of an object in an image despite variations in photographed size and angle and despite occlusion, systems have been proposed which detect a large number of interest points (feature points) in the image and extract a feature descriptor of a local region (a local feature descriptor) around each feature point. As representative examples of such systems, Patent Document 1 and Non-Patent Document 1 disclose local feature descriptor extracting apparatuses that use a SIFT (Scale Invariant Feature Transform) feature descriptor.
FIG. 23 is a diagram showing an example of a general configuration of a local feature descriptor extracting apparatus that uses a SIFT feature descriptor. In addition, FIG. 24 is a diagram showing a conceptual image of SIFT feature descriptor extraction by the local feature descriptor extracting apparatus shown in FIG. 23.
As shown in FIG. 23, the local feature descriptor extracting apparatus includes a feature point detecting unit 200, a local region acquiring unit 210, a subregion dividing unit 220, and a subregion feature vector generating unit 230. The feature point detecting unit 200 detects a large number of interest points (feature points) from an image and outputs a coordinate position, a scale (size), and an orientation of each feature point. The local region acquiring unit 210 acquires a local region to be subjected to feature descriptor extraction from the coordinate position, the scale, and the orientation of each detected feature point. The subregion dividing unit 220 divides the local region into subregions. In the example shown in FIG. 24, the subregion dividing unit 220 divides the local region into 16 blocks (4×4 blocks). The subregion feature vector generating unit 230 generates a gradient direction histogram for each subregion of the local region. Specifically, the subregion feature vector generating unit 230 calculates a gradient direction for each pixel in each subregion and quantizes the gradient direction into eight directions. Note that the gradient direction calculated here is a direction relative to the orientation of the feature point outputted by the feature point detecting unit 200. In other words, the gradient direction is normalized with respect to the orientation outputted by the feature point detecting unit 200. In addition, the subregion feature vector generating unit 230 counts the frequency of each of the eight quantized directions for each subregion and generates a gradient direction histogram. In this manner, the gradient direction histograms of 16 blocks×8 directions that are generated for each feature point are outputted as a 128-dimension local feature descriptor.
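As a rough illustration of the processing described above for the subregion dividing unit 220 and the subregion feature vector generating unit 230, the following Python sketch builds a 16 blocks×8 directions histogram vector from a square grayscale local region. It is a minimal sketch under stated assumptions and is not the implementation disclosed in Patent Document 1 or Non-Patent Document 1. The function name local_descriptor and its parameters are hypothetical. The local region is assumed to be already cropped and resampled as a square array, gradients are taken by central differences, pixels with zero gradient are skipped, each direction is quantized to the nearest of eight equally spaced directions, and plain counts are accumulated without magnitude or Gaussian weighting.

```python
import math


def local_descriptor(patch, orientation, blocks=4, bins=8):
    """Sketch of a SIFT-like descriptor (hypothetical helper, not from the source).

    patch       -- square 2D list of grayscale values (the local region)
    orientation -- feature point orientation in radians
    Returns a flat list of blocks * blocks * bins frequency counts.
    """
    size = len(patch)
    cell = size // blocks                 # pixels per subregion side
    bin_width = 2 * math.pi / bins        # angular width of one quantized direction
    hist = [0] * (blocks * blocks * bins)

    # Assumption: border pixels are skipped so central differences stay in range.
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            if gx == 0 and gy == 0:
                continue                  # assumption: flat pixels carry no direction

            # Direction relative to the feature point orientation (normalization).
            rel = (math.atan2(gy, gx) - orientation) % (2 * math.pi)

            # Quantize to the nearest of the equally spaced directions.
            b = int(round(rel / bin_width)) % bins

            # Subregion (block) that contains this pixel.
            by = min(y // cell, blocks - 1)
            bx = min(x // cell, blocks - 1)

            hist[(by * blocks + bx) * bins + b] += 1
    return hist


if __name__ == "__main__":
    # Horizontal intensity ramp: every interior gradient points along +x.
    ramp = [[x for x in range(16)] for _ in range(16)]
    descriptor = local_descriptor(ramp, orientation=0.0)
    print(len(descriptor))
```

With the default arguments, the returned vector has 4×4×8 = 128 elements, matching the 128-dimension local feature descriptor described above. Changing the orientation argument shifts which quantized direction receives the counts, which illustrates the normalization with respect to the orientation of the feature point.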
Furthermore, in order to improve search accuracy and recognition accuracy when using a local feature descriptor, Patent Document 2 discloses a method of narrowing down the feature points for which local feature descriptors are calculated to those feature points that are extracted with high reproducibility even when an image is subjected to rotation, enlargement, reduction, or the like.
Patent Document 1: U.S. Pat. No. 6,711,293
Patent Document 2: Patent Publication JP-A-2010-79545
Non-Patent Document 1: David G. Lowe, “Distinctive image features from scale-invariant keypoints”, USA, International Journal of Computer Vision, 60 (2), 2004, pages 91-110