The present invention relates to a technique to extract feature quantities from an image to perform location estimation.
In recent years, there has been increasing demand for techniques to estimate indoor locations, for purposes such as analyzing customer traffic lines in shopping centers and public facilities and for inventory control.
It is difficult to accurately estimate a location indoors because wireless devices such as WiFi and RFID devices are subject to much noise when used indoors. GPS provides relatively accurate location estimation but cannot be used indoors. Therefore, a technique has been developed in which a camera is attached to a moving object to capture an image, and the captured image is analyzed to estimate the location.
Existing techniques related to this include the technique disclosed in WO2008/087974. WO2008/087974 relates to a technique for calculating the positional relationship between multiple cameras and a technique for generating a user interface based on the calculated positional relationship, and discloses calculation of the positions of multiple cameras based on a captured image without using GPS.
Hironobu Fujiyoshi, “Gradient-Based Feature Extraction: SIFT and HOG”, Information Processing Society of Japan Technical Report, CVIM [Computer Vision and Image Media], 2007, (87), 211-224, 2007-09-03 provides a summary of the SIFT algorithm and describes the detection of extrema in a DoG (Difference of Gaussians) image as well as HOG.
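The DoG extrema detection summarized in the report above can be sketched roughly as follows. This is a simplified illustration only, not the full SIFT detector (it omits octaves, sub-pixel refinement, and edge-response rejection); the function name, parameter values, and the contrast threshold are assumptions for the example, and NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_extrema(image, sigma=1.6, k=2 ** 0.5, n_scales=4, thresh=0.5):
    """Build a small difference-of-Gaussians (DoG) stack and return
    (x, y, scale) coordinates of local extrema in scale space.
    Simplified sketch; not the full SIFT detector."""
    blurred = [gaussian_filter(image.astype(float), sigma * k ** i)
               for i in range(n_scales)]
    dog = [blurred[i + 1] - blurred[i] for i in range(n_scales - 1)]
    extrema = []
    # Compare each interior pixel with its 26 neighbours in the
    # 3x3x3 cube spanning space and the two adjacent scales.
    for s in range(1, len(dog) - 1):
        prev_s, cur, next_s = dog[s - 1], dog[s], dog[s + 1]
        for y in range(1, cur.shape[0] - 1):
            for x in range(1, cur.shape[1] - 1):
                v = cur[y, x]
                if abs(v) < thresh:  # discard weak, noise-like responses
                    continue
                cube = np.stack([prev_s[y - 1:y + 2, x - 1:x + 2],
                                 cur[y - 1:y + 2, x - 1:x + 2],
                                 next_s[y - 1:y + 2, x - 1:x + 2]])
                if v == cube.max() or v == cube.min():
                    extrema.append((x, y, s))
    return extrema
```

Because each candidate pixel is tested against neighbors in adjacent scales as well as adjacent positions, the detected points are extrema of the scale space, which is what gives SIFT its robustness to scaling.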
JP2011-53181A discloses orientation estimation using SIFT (Scale-Invariant Feature Transform) feature quantities, in order to provide a reliable information terminal apparatus capable of estimating the orientation of a user and controlling display information on a display according to the result of the estimation, without requiring a built-in geomagnetic sensor.
SIFT is an image feature extraction technique that detects feature points and calculates feature vectors from the pixels around each feature point. Aside from this, Structure from Motion (SfM), a technique for reconstructing the three-dimensional positional relationship between feature points and camera positions from images, is known. SfM is described in JP10-40385A, JP2009-237845A and JP2009-237847A, for example.
SfM is a technique in which feature points, such as corners, are extracted from an image, a feature vector is generated for each feature point, for example simply by arranging the intensities of the 8×8 pixels around the point into a 64-dimensional vector, or otherwise by calculating a feature vector from the pixels around the point, and matching is performed between a plurality of images. In this way, the three-dimensional positions of the feature points and the position of the camera can be reconstructed at the same time.
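The simple patch descriptor and matching step described above can be sketched as follows. This is an illustrative sketch only, assuming NumPy; the function names and the mean-subtraction/normalization step (a common robustness measure against brightness changes, not stated in the source) are assumptions, and real SfM pipelines add outlier rejection and geometric verification on top of this.

```python
import numpy as np

def patch_descriptor(image, x, y, size=8):
    """Flatten the size x size intensity patch around (x, y) into a
    feature vector (64-dimensional for size=8), normalized so the
    descriptor is insensitive to uniform brightness changes."""
    half = size // 2
    patch = image[y - half:y + half, x - half:x + half].astype(float)
    vec = patch.ravel()
    vec = vec - vec.mean()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def match_features(desc_a, desc_b):
    """Nearest-neighbor matching between two descriptor lists by
    Euclidean distance; returns (index_in_a, index_in_b) pairs."""
    matches = []
    for i, da in enumerate(desc_a):
        dists = [np.linalg.norm(da - db) for db in desc_b]
        matches.append((i, int(np.argmin(dists))))
    return matches
```

Given such correspondences across two or more images, the three-dimensional positions of the feature points and the camera poses can then be estimated jointly, which is the reconstruction step of SfM.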
To obtain SfM results stably, a feature quantity calculation method is required that is robust to noise, that is, to fluctuations in the image, as well as to rotation and scaling of the image. However, conventional methods have the problem of high calculation cost. Obtaining SfM generally requires an amount of calculation proportional to the number of pixels, and therefore consideration of the calculation cost is essential.