The present application is directed to a computer-operable system and method that incorporates a software program and algorithm for identifying stable keypoints in an image. The identified stable keypoints may be used in a variety of applications, including finding a target image in a large collection of images based on a query image, motion detection, image matching, object recognition, tracking, image mosaicking, panorama stitching, 3D modeling, and surface mapping, among others.
A wide variety of image keypoint detectors have been proposed in the literature. One in particular is described in an article by C. Harris and M. Stephens, entitled “A Combined Corner and Edge Detector”, Proceedings of the Alvey Vision Conference, pp. 147-151, 1988, and is known in the art as the Harris corner detector. This type of detector is based on a scale-adapted second moment matrix, otherwise known as the autocorrelation matrix. The matrix describes the gradient distribution in a local neighborhood of a point. The local derivatives are computed with Gaussian kernels of varying size, and then smoothed by another Gaussian window, i.e., the integration scale. The eigenvalues of the autocorrelation matrix represent the two principal curvatures in the neighborhood of a point. Keypoints are selected where both curvatures are high, which indicates corners or junctions in two dimensions (2D). However, it is well known that the Harris corner detector is not scale invariant.
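The eigenvalue behavior described above can be illustrated with a minimal sketch. It is not the Harris and Stephens implementation: it assumes central-difference derivatives and a plain square summation window in place of the Gaussian derivative and integration kernels, and the function names (`make_corner_image`, `gradients`, `second_moment_eigenvalues`) are hypothetical.

```python
import math

def make_corner_image(size=16, start=8):
    """Synthetic test image: a bright square occupying x >= start, y >= start."""
    return [[1.0 if (x >= start and y >= start) else 0.0 for x in range(size)]
            for y in range(size)]

def gradients(img):
    """Central-difference derivatives (Ix, Iy); border pixels are left at 0."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return ix, iy

def second_moment_eigenvalues(img, cx, cy, radius=2):
    """Eigenvalues (larger, smaller) of the second moment matrix
    [[sum Ix*Ix, sum Ix*Iy], [sum Ix*Iy, sum Iy*Iy]] summed over a square
    window centred at (cx, cy). The window must lie inside the image.
    Assumption: a box window stands in for the Gaussian integration window."""
    ix, iy = gradients(img)
    a = b = c = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            a += ix[y][x] * ix[y][x]
            b += ix[y][x] * iy[y][x]
            c += iy[y][x] * iy[y][x]
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]].
    mean = (a + c) / 2.0
    diff = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + diff, mean - diff

if __name__ == "__main__":
    img = make_corner_image()
    print("corner:", second_moment_eigenvalues(img, 8, 8))
    print("edge:  ", second_moment_eigenvalues(img, 8, 12))
    print("flat:  ", second_moment_eigenvalues(img, 3, 3))
```

On this synthetic image, the corner of the square yields two sizeable eigenvalues, a point on the straight edge yields one sizeable eigenvalue and one near zero, and a flat region yields two near zero, which is the distinction the detector relies on.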
There are a few approaches that are generally invariant to scale changes. Lindeberg, in “Feature Detection with Automatic Scale Selection,” International Journal of Computer Vision, Vol. 30, No. 2, pp. 79-116, 1998, proposes to search a three-dimensional (3D) scale-space representation, built with a pyramid of Gaussian-derived filters such as the Laplacian of Gaussian and other derivatives, and to detect a feature point when the absolute value of a local 3D peak exceeds a certain threshold. The scale-space representation is built by successive smoothing of the high-resolution image with Gaussian kernels of different sizes. The Laplacian of Gaussian is circularly symmetric and detects blob-like structures.
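The scale-space search described above can be written compactly. The notation here is an assumption introduced for illustration and does not appear in the cited article in this exact form: $I$ is the input image, $G$ the Gaussian kernel with standard deviation $\sigma$, $L$ the smoothed image, $*$ denotes convolution, and $T$ is the detection threshold.

```latex
\begin{aligned}
G(x, y; \sigma) &= \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right), \\
L(x, y; \sigma) &= G(x, y; \sigma) * I(x, y), \\
\nabla^{2}_{\mathrm{norm}} L(x, y; \sigma) &= \sigma^{2}\left(L_{xx}(x, y; \sigma) + L_{yy}(x, y; \sigma)\right), \\
(\hat{x}, \hat{y}, \hat{\sigma}) &= \text{a local extremum of } \nabla^{2}_{\mathrm{norm}} L \text{ over } (x, y, \sigma) \text{ with } \left|\nabla^{2}_{\mathrm{norm}} L(\hat{x}, \hat{y}; \hat{\sigma})\right| > T .
\end{aligned}
```

The factor $\sigma^{2}$ is the usual scale normalization of the Laplacian; without it the response amplitude decays as smoothing increases, and peaks at different scales would not be comparable.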
K. Mikolajczyk and C. Schmid, in the articles “An Affine Invariant Interest Point Detector,” in European Conference on Computer Vision, Vol. 1, pp. 128-142, 2002, and “A Performance Evaluation of Local Descriptors,” in Conference on Computer Vision and Pattern Recognition, pp. 257-263, June 2003, proposed robust scale-invariant detectors based on the Harris-Laplace and Hessian-Laplace operators. They used the determinant of the Hessian matrix to select the location and the Laplacian to select the scale.
One method, called the Scale-Invariant Feature Transform (SIFT), is described by D. G. Lowe in “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110, 2004, and by Matthew Brown and D. G. Lowe in “Invariant Features from Interest Point Groups,” in British Machine Vision Conference, BMVC 2002, Cardiff, Wales, pp. 656-665, September 2002. To improve speed, SIFT approximates the Laplacian of Gaussian (LoG) by a Difference of Gaussian (DoG) image pyramid. The input image is successively smoothed with a Gaussian kernel and down-sampled. Each Difference of Gaussian level is obtained by subtracting two successive smoothed images. Thus all the DoG levels are constructed by combined smoothing and sub-sampling. The local 3D maxima in the image pyramid representation of spatial dimensions and scale determine the location and scale of a keypoint.
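The DoG construction and the 3D extremum search can be sketched as follows. This is a simplified illustration, not Lowe's implementation: it assumes a single octave with no down-sampling, separable Gaussian kernels truncated at about three standard deviations, and clamped image borders. All function names and the `threshold` parameter are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian weights, truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def clamp(v, size):
    """Clamp an index into [0, size - 1] (border replication)."""
    return min(max(v, 0), size - 1)

def blur(img, sigma):
    """Separable Gaussian smoothing of a 2D list of floats."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    rows = [[sum(k[j + r] * img[y][clamp(x + j, w)] for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(k[j + r] * rows[clamp(y + j, h)][x] for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def dog_stack(img, sigmas):
    """Difference of successive smoothed images: one level per adjacent sigma pair."""
    h, w = len(img), len(img[0])
    blurred = [blur(img, s) for s in sigmas]
    return [[[b2[y][x] - b1[y][x] for x in range(w)] for y in range(h)]
            for b1, b2 in zip(blurred, blurred[1:])]

def local_extrema(stack, threshold):
    """(x, y, level) triples that are strict extrema among their 26
    space-and-scale neighbours and whose magnitude is at least threshold."""
    found = []
    h, w = len(stack[0]), len(stack[0][0])
    for s in range(1, len(stack) - 1):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = stack[s][y][x]
                if abs(v) < threshold:
                    continue
                neighbours = [stack[s + ds][y + dy][x + dx]
                              for ds in (-1, 0, 1)
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if (ds, dy, dx) != (0, 0, 0)]
                if v > max(neighbours) or v < min(neighbours):
                    found.append((x, y, s))
    return found

def make_blob_image(size, sigma0):
    """Synthetic test image: one Gaussian blob centred in a size x size image."""
    c = size // 2
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma0 ** 2))
             for x in range(size)] for y in range(size)]

if __name__ == "__main__":
    image = make_blob_image(31, 2.4)
    levels = dog_stack(image, [1.0, 1.4, 2.0, 2.8, 4.0])
    print(local_extrema(levels, 0.05))
```

Because both minima and maxima are reported, a bright blob appears as a negative extremum of the DoG. Adding the down-sampling step between octaves would turn the stack into the pyramid described above.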
Ke and Sukthankar, in “PCA-SIFT: A More Distinctive Representation for Local Image Descriptors,” in Conference on Computer Vision and Pattern Recognition, pp. 111-119, 2000, proposed Principal Components Analysis SIFT (PCA-SIFT), which uses the SIFT keypoint detector in conjunction with Principal Component Analysis (PCA) to reduce feature dimensionality. Ledwich and Williams, in “Reduced SIFT Features for Image Retrieval and Indoor Localization,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 5, May 1997, proposed a reduced SIFT feature set for mobile robot applications where rotation invariance is not necessary.
A common problem of SIFT and the other DoG- and/or LoG-based approaches is that local maxima are often detected in the neighborhood of contours or straight edges, where the signal changes in only one direction while the other direction is relatively flat. Keypoints from such neighborhoods are not stable because their localization is more sensitive to noise and/or small changes in nearby texture. A more sophisticated approach to this problem is to search for a simultaneous maximum of both the trace and the determinant of the Hessian matrix. The addition of the Hessian determinant penalizes points for which the second derivative changes in one direction only. However, the need to calculate both the determinant and the trace of the Hessian matrix greatly increases the computational complexity of this solution.
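The role of the Hessian determinant can be illustrated with a short sketch. It assumes finite-difference second derivatives on a small response image, and the function names are hypothetical. The `is_edge_like` check is the principal-curvature ratio test from Lowe's SIFT article, shown here only as a cheaper related criterion; it is not the simultaneous-maximum search described above.

```python
import math

def hessian_trace_det(d, x, y):
    """Trace and determinant of the 2x2 Hessian of the response image d at
    (x, y), using finite differences. (x, y) must not lie on the border."""
    dxx = d[y][x + 1] - 2.0 * d[y][x] + d[y][x - 1]
    dyy = d[y + 1][x] - 2.0 * d[y][x] + d[y - 1][x]
    dxy = (d[y + 1][x + 1] - d[y + 1][x - 1]
           - d[y - 1][x + 1] + d[y - 1][x - 1]) / 4.0
    return dxx + dyy, dxx * dyy - dxy * dxy

def is_edge_like(trace, det, r=10.0):
    """Ratio test: reject when the principal curvatures differ by more than
    a factor r, or when the determinant is not positive. r = 10 is an
    illustrative default, not a value taken from this application."""
    return det <= 0.0 or (trace * trace) / det >= (r + 1.0) ** 2 / r

def make_ridge(size=9):
    """Response that varies in x only, like a straight vertical edge or ridge."""
    c = size // 2
    return [[math.exp(-((x - c) ** 2) / 2.0) for x in range(size)]
            for _ in range(size)]

def make_blob(size=9):
    """Response that varies in both directions, like an isolated blob."""
    c = size // 2
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / 2.0) for x in range(size)]
            for y in range(size)]

if __name__ == "__main__":
    for name, img in (("ridge", make_ridge()), ("blob", make_blob())):
        trace, det = hessian_trace_det(img, 4, 4)
        print(name, trace, det, is_edge_like(trace, det))
```

At the ridge centre the trace is large in magnitude but the determinant vanishes, because the second derivative is nonzero in one direction only. At the blob centre both quantities are large, which is why adding the determinant suppresses edge responses.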
A particular issue with regard to keypoint detectors is the degree of robustness to noise and variations in rotation, scale and other common image degradations at which the keypoint detector is required to operate.
The existing methods described above, such as SIFT and PCA-SIFT, require a considerable amount of computation, which limits overall performance. The present application discloses a system and method that offer a performance improvement over the existing techniques while being just as effective in identifying the best keypoints in an image.