Field of the Invention
The present invention relates to the field of image analysis.
Description of the Prior Art
In the field of image analysis, before an image formed by a plurality of points (pixels), each characterized by a respective value of a physical parameter representative of the image, such as the luminance, is submitted to certain types of processing, such as comparison with another image, it is advantageous to identify the position and size of the salient details represented in the image. In this field, a “salient detail” of an image is a portion of an object included in the image that is easily detectable even in the presence of changes in the point of view of the object, in the lighting, and in the type of camera.
Until a few years ago, it was possible to identify the position of the salient details of an image, but not their size. In more detail, the location of a salient detail of an image is identified through an associated salient point of the image (in the jargon, a keypoint), which substantially corresponds to the center of the salient detail. In the case of a detail having a circular shape, the keypoint coincides with the center of the detail, while in the case of details having different shapes, the position of the keypoint may diverge from the actual center of the detail.
Recently, in addition to image keypoint identification, procedures have been developed that also make it possible to determine the size of the salient detail associated with each keypoint.
Currently, the methods used to identify the position and size of the salient details are based on the concept of “scale-space”, which provides for the application of a series of gradually more intense filterings to the image. The filterings applied to the image are typically filterings that perform differential operations on the values of the physical parameter (e.g., luminance) of the image points. Typically, such filterings are based on the Gaussian function, whose filtering intensity is governed by a filtering parameter σ (the standard deviation of the Gaussian function): the higher the filtering parameter σ, the flatter and wider the Gaussian, and the more intense its smoothing effect. The scale-space of an image formed by a matrix of pixels of coordinates (x, y) is the space formed by the set of filtered images (in terms of luminance) obtained from the starting image by applying gradually more intense filters, i.e., filters with gradually larger values of σ, and is therefore a three-dimensional space (x, y, σ).
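By way of illustration only, the construction of a scale-space may be sketched as follows. The sketch assumes plain Gaussian smoothing (the differential operation mentioned above is omitted), a kernel truncated at 3σ, and border handling by index clamping. The function names, the sample image, and the chosen values of σ are hypothetical and are not taken from any cited method.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3*sigma (assumption) and normalised to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the image borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable Gaussian smoothing: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    rows_done = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(rows_done), kernel))

def build_scale_space(image, sigmas):
    """Return one filtered image per sigma, i.e. samples of the (x, y, sigma) space."""
    return [gaussian_blur(image, s) for s in sigmas]

# Hypothetical 7x7 luminance image with a single bright pixel in the centre.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
sigmas = [1.0, 1.6, 2.56]  # illustrative, gradually larger values of sigma
scale_space = build_scale_space(image, sigmas)
center_values = [level[3][3] for level in scale_space]
```

In this sketch the value at the centre decreases as σ grows, which reflects the more intense smoothing obtained with a flatter and wider Gaussian.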
The theory (see, for example, T. Lindeberg (1992), “Scale-space behavior of local extrema and blobs”, J. of Mathematical Imaging and Vision, 1 (1), pages 65-99) states that if the filtered image has an extreme value with respect to σ at a point (xp, yp, σp) belonging to the space (x, y, σ), i.e., a maximum or a minimum with respect to σ in a portion of the space (x, y, σ) surrounding the point (xp, yp, σp), then that point is associated with a salient detail whose center coordinates are (xp, yp) and whose size is proportional to σp. The size (diameter) of the detail (in units or pixels) is equal to 2*sqrt(2)*σp.
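As a worked example of the relation between σp and the size of the detail, the following minimal sketch evaluates 2*sqrt(2)*σp for an illustrative scale. The function name and the value σp = 2.0 are hypothetical.

```python
import math

def detail_diameter(sigma_p):
    """Diameter of the salient detail for an extremum found at scale sigma_p."""
    return 2.0 * math.sqrt(2.0) * sigma_p

# For sigma_p = 2.0 the diameter is 4 * sqrt(2), i.e. about 5.66.
diameter = detail_diameter(2.0)
```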
By identifying all the extreme points in the scale-space, the position and size of the salient details in the image are therefore obtained.
To find the extreme points in the scale-space, known methods (such as the method that uses the “Scale-Invariant Feature Transform” (SIFT) descriptor, described in 1999 in the article “Object recognition from local scale-invariant features” by Lowe, David G., Proceedings of the International Conference on Computer Vision 2, pages 1150 to 1157, and the subject of U.S. Pat. No. 6,711,293) consider a sequence of filtered images with increasing values of σ and, for each point of an image filtered with a given σ, compare its value with the values of the eight adjacent points of the same image and with the values of the 18 (9+9) adjacent points in the filtered images corresponding to the previous and next values of σ in the sequence. If the value of the point is less than or greater than those of all the adjacent points, then the point is an extreme of the space (x, y, σ) and is a candidate to be a keypoint. The point is only a candidate because it is known (see, for example, Lowe, D. G., “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60, 2, pages 91-110, 2004) to eliminate the points corresponding to portions of the image having low contrast and the points that lie on structures similar to edges, since the location of a detail along an edge can easily vary in different images that depict the same scene. Such points are not reliable and are therefore discarded.
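The comparison with the 26 (8+9+9) adjacent points described above may be sketched as follows. This is a minimal illustration, not the implementation of the cited methods: it assumes the filtered images are stored as a list indexed as stack[level][y][x], it requires a strict inequality, it skips border points, and it omits the subsequent elimination of low-contrast and edge-like points. All names are hypothetical.

```python
def is_scale_space_extremum(stack, s, y, x):
    """True if stack[s][y][x] is strictly greater or strictly smaller than all 26 neighbours."""
    value = stack[s][y][x]
    is_max = True
    is_min = True
    for ds in (-1, 0, 1):          # previous, same and next value of sigma
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if ds == 0 and dy == 0 and dx == 0:
                    continue       # skip the point itself
                neighbour = stack[s + ds][y + dy][x + dx]
                if neighbour >= value:
                    is_max = False
                if neighbour <= value:
                    is_min = False
    return is_max or is_min

def find_candidate_keypoints(stack):
    """Scan interior points and return (x, y, level) for each extremum found."""
    candidates = []
    for s in range(1, len(stack) - 1):
        height = len(stack[s])
        width = len(stack[s][0])
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                if is_scale_space_extremum(stack, s, y, x):
                    candidates.append((x, y, s))
    return candidates

# Toy stack: three 5x5 levels, all zero except one peak in the middle level.
stack = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
stack[1][2][2] = 1.0
candidates = find_candidate_keypoints(stack)
```

Each returned triple identifies a candidate only; in the known methods the candidates are then filtered for contrast and edge response before being accepted as keypoints.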