This invention relates to object recognition and more particularly to identifying scale invariant features in an image and use of same for locating an object in an image.
With the advent of robotics and industrial automation, there has been an increasing need to incorporate computer vision systems into industrial systems. Current computer vision techniques generally involve producing a plurality of reference images which act as templates and comparing the reference images against an image under consideration to determine whether or not the image under consideration matches one of the reference images. Thus, comparisons are performed on a full-image basis. Existing systems, however, are generally accurate in only two dimensions and generally require that a camera acquiring an image of an object be above the object or in a predetermined orientation to view the object in two dimensions. Similarly, the image under consideration must be taken from the same angle. These constraints restrict how computer vision systems can be implemented, rendering such systems difficult to use in certain applications. What would be desirable, therefore, is a computer vision system which is operable to determine the presence or absence of an object in an image taken from virtually any direction and under varying lighting conditions.
The present invention addresses the above need by providing a method and apparatus for identifying scale invariant features in an image and a further method and apparatus for using such scale invariant features to locate an object in an image. In particular, the method and apparatus for identifying scale invariant features may involve a processor circuit for producing a plurality of component subregion descriptors for each subregion of a pixel region about pixel amplitude extrema in a plurality of difference images produced from the image. This may involve producing a plurality of difference images by blurring an initial image to produce a blurred image and by subtracting the blurred image from the initial image to produce a difference image. Successive blurring and subtracting may be used to produce successive difference images, where the initial image used in a successive blurring function includes a blurred image produced in a predecessor blurring function.
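The successive blurring and subtracting described above can be illustrated with a minimal sketch. It assumes the image is a two-dimensional list of floating-point pixel amplitudes and uses a simple separable [1, 2, 1] / 4 smoothing kernel with clamped edges; the kernel, the function names `blur` and `difference_images`, and the `levels` parameter are illustrative assumptions and are not specified in the text above.

```python
def blur(image):
    """Blur a 2-D list of floats with a separable [1, 2, 1] / 4 kernel.

    The kernel is an assumption made for this sketch; edges are clamped.
    """
    h, w = len(image), len(image[0])

    def clamp(i, n):
        return max(0, min(n - 1, i))

    # Horizontal pass.
    horiz = [[(row[clamp(x - 1, w)] + 2 * row[x] + row[clamp(x + 1, w)]) / 4.0
              for x in range(w)] for row in image]
    # Vertical pass.
    return [[(horiz[clamp(y - 1, h)][x] + 2 * horiz[y][x]
              + horiz[clamp(y + 1, h)][x]) / 4.0
             for x in range(w)] for y in range(h)]


def difference_images(image, levels):
    """Return `levels` difference images produced by repeated blur-and-subtract."""
    diffs = []
    current = image
    for _ in range(levels):
        blurred = blur(current)
        # Difference image = initial image minus its blurred version.
        diffs.append([[a - b for a, b in zip(row_a, row_b)]
                      for row_a, row_b in zip(current, blurred)])
        # The blurred image becomes the initial image of the next step.
        current = blurred
    return diffs
```

In this sketch every difference image has the same dimensions as the input, which keeps the later steps simple to index; a practical system might instead resample between levels.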
Having produced difference images, the method and apparatus may further involve locating pixel amplitude extrema in the difference images. This may be done by a processor circuit which compares the amplitude of each pixel in an image under consideration with the amplitudes of pixels in an area about each pixel in the image under consideration to identify local maximal and minimal amplitude pixels. The area about the pixel under consideration may involve an area of pixels in the same image and an area of pixels in at least one adjacent image such as a predecessor image or a successor image, or both.
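The comparison described above may be sketched as follows. The sketch assumes the difference images are equally sized two-dimensional lists, takes the area about a pixel to be its 3 x 3 neighbourhood in the same image and in whichever adjacent images exist, and skips border pixels; the function name `local_extrema` and the neighbourhood size are assumptions made for illustration.

```python
def local_extrema(diffs):
    """Find strict local maxima and minima across a stack of difference images.

    Returns a list of (image_index, row, column, kind) tuples, where kind is
    "max" or "min". Border pixels are skipped for simplicity.
    """
    found = []
    for k, img in enumerate(diffs):
        h, w = len(img), len(img[0])
        # Same image plus predecessor and successor images, where present.
        layers = [j for j in (k - 1, k, k + 1) if 0 <= j < len(diffs)]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                value = img[y][x]
                neighbours = [diffs[j][y + dy][x + dx]
                              for j in layers
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if not (j == k and dy == 0 and dx == 0)]
                if value > max(neighbours):
                    found.append((k, y, x, "max"))
                elif value < min(neighbours):
                    found.append((k, y, x, "min"))
    return found
```

Strict inequalities are used so that flat areas of equal amplitude produce no extrema.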
The method and apparatus may further involve use of a processor circuit to produce a pixel gradient vector for each pixel in each difference image and using the pixel gradient vectors of pixels near an extremum to produce an image change tendency vector having an orientation, the orientation being associated with respective maximal and minimal amplitude pixels in each difference image.
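A hedged sketch of the gradient step follows. It assumes a pixel gradient vector is obtained from central differences of neighbouring pixel amplitudes, and it assumes that the gradients near an extremum are combined by accumulating their magnitudes into an orientation histogram whose peak gives the orientation; the text above does not state how the vectors are combined, so the histogram, the names `pixel_gradient` and `dominant_orientation`, and the `radius` and `bins` parameters are illustrative only.

```python
import math


def pixel_gradient(img, y, x):
    """Return (magnitude, orientation in radians) from central differences."""
    dx = img[y][x + 1] - img[y][x - 1]
    dy = img[y + 1][x] - img[y - 1][x]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def dominant_orientation(img, y, x, radius=1, bins=36):
    """Estimate an orientation for the extremum at (y, x).

    Gradient magnitudes of pixels within `radius` are accumulated into an
    orientation histogram; the centre angle of the peak bin is returned.
    The caller must keep (y, x) at least radius + 1 pixels from the border.
    """
    hist = [0.0] * bins
    for yy in range(y - radius, y + radius + 1):
        for xx in range(x - radius, x + radius + 1):
            mag, ang = pixel_gradient(img, yy, xx)
            b = int((ang % (2 * math.pi)) / (2 * math.pi) * bins) % bins
            hist[b] += mag
    peak = max(range(bins), key=hist.__getitem__)
    return (peak + 0.5) * 2 * math.pi / bins
```

The returned angle is quantised to the bin width, so with 36 bins the estimate is accurate to within five degrees of the histogram peak.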
The plurality of component subregion descriptors may be produced by the processor circuit by defining regions about corresponding maximal and minimal amplitude pixels in each difference image and defining subregions in each of such regions.
By using the pixel gradient vectors of pixels within each subregion, the magnitudes of vectors at orientations within predefined ranges of orientations can be accumulated for each subregion. These numbers represent subregion descriptors, describing scale invariant features of the reference image. By taking images of objects from different angles and under different lighting conditions, and using the above process, a library of scale invariant features of reference objects can be produced.
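The accumulation of gradient magnitudes per subregion can be sketched as below. The sketch assumes a square region centred on an extremum, divided into a `grid` x `grid` arrangement of square subregions, with each subregion described by a histogram of `bins` orientation ranges; these sizes, the central-difference gradient, and the name `subregion_descriptors` are assumptions chosen for illustration rather than values taken from the text.

```python
import math


def subregion_descriptors(img, cy, cx, sub_size=2, grid=2, bins=8):
    """Return one orientation histogram per subregion around (cy, cx).

    Each histogram accumulates gradient magnitudes of the subregion's pixels
    into `bins` equal ranges of orientation. The region must lie at least one
    pixel inside the image border so that central differences are defined.
    """
    half = sub_size * grid // 2
    top, left = cy - half, cx - half
    descriptors = []
    for gy in range(grid):
        for gx in range(grid):
            hist = [0.0] * bins
            for y in range(top + gy * sub_size, top + (gy + 1) * sub_size):
                for x in range(left + gx * sub_size, left + (gx + 1) * sub_size):
                    dx = img[y][x + 1] - img[y][x - 1]
                    dy = img[y + 1][x] - img[y - 1][x]
                    ang = math.atan2(dy, dx) % (2 * math.pi)
                    b = int(ang / (2 * math.pi) * bins) % bins
                    # Accumulate the magnitude in the matching orientation range.
                    hist[b] += math.hypot(dx, dy)
            descriptors.append(hist)
    return descriptors
```

With the default parameters, each extremum yields four histograms of eight numbers each, which together form its descriptor in this sketch.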
In accordance with another aspect of the invention, there is provided a method and apparatus for locating an object in an image. A processor is used to subject an image under consideration to the same process as described above as applied to the reference image, to produce a plurality of scale invariant features or subregion descriptors associated with the image under consideration. Then, scale invariant features of the image under consideration are correlated with scale invariant features of reference images depicting known objects, and detection of an object is indicated when a sufficient number of scale invariant features of the image under consideration define an aggregate correlation exceeding a threshold correlation with scale invariant features associated with the object.
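The detection decision described above can be illustrated with a simplified sketch. As a stand-in for the aggregate correlation, it matches each feature of the image under consideration to its nearest reference feature by Euclidean distance and indicates detection when enough matches fall within a distance limit; the distance measure, the names `match_features` and `object_detected`, and the `max_distance` and `min_matches` thresholds are assumptions and are not taken from the text.

```python
import math


def match_features(image_feats, reference_feats, max_distance=0.5):
    """Pair each image feature with its nearest reference feature.

    Features are flat lists of numbers of equal length. A pair (i, j) is kept
    only when the Euclidean distance is at most `max_distance`.
    """
    matches = []
    for i, feat in enumerate(image_feats):
        dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(feat, ref)))
                 for ref in reference_feats]
        j = min(range(len(dists)), key=dists.__getitem__)
        if dists[j] <= max_distance:
            matches.append((i, j))
    return matches


def object_detected(image_feats, reference_feats, min_matches=3, max_distance=0.5):
    """Indicate detection when a sufficient number of features match."""
    matches = match_features(image_feats, reference_feats, max_distance)
    return len(matches) >= min_matches
```

Counting matches is the simplest possible aggregate; a fuller treatment would also check that the matched features agree on the object's pose.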
Consequently, in effect, correlating involves the use of a processor circuit to determine correlations between component subregion descriptors for a plurality of subregions of pixels about pixel amplitude extrema in a plurality of difference images produced from the image, and reference component descriptors for a plurality of subregions of pixels about pixel amplitude extrema in a plurality of difference images produced from an image of at least one reference object in a reference image.
Correlating may be performed by the processor circuit by applying the component subregion descriptors and the reference component descriptors to a Hough transform. The Hough transform may produce a list of reference component descriptors of objects within the image under consideration and a list of matching reference component descriptors from the library of scale invariant features. These lists may be applied to a least squares fit algorithm, which attempts to identify a plurality of best fitting reference component descriptors identifying one of the likely objects. Having found the best fitting reference component descriptors, the image from which they were produced may be readily identified, and consequently the identity, scale and orientation of the object associated with such reference component descriptors may be determined, so as to precisely identify the object, its orientation, its scale and its location in the image under consideration.
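One way to picture the Hough transform and least squares steps is sketched below. It assumes each match records an object identifier together with the position, scale and orientation of the matched feature in the image under consideration and in the reference image; matches vote for coarse bins of relative scale and rotation, and a similarity transform (rotation, uniform scale and translation) is then fitted to the matches in the fullest bin. The dictionary keys, the bin sizes, the choice of a similarity transform, and the names `hough_votes` and `fit_similarity` are all illustrative assumptions.

```python
import math
from collections import defaultdict


def hough_votes(matches, angle_bins=12):
    """Group matches by (object id, scale-ratio octave, rotation bin).

    Each match is a dict with hypothetical keys "object", "img" and "ref",
    where "img" and "ref" are (x, y, scale, orientation_radians) tuples.
    """
    bins = defaultdict(list)
    for m in matches:
        _, _, s_img, o_img = m["img"]
        _, _, s_ref, o_ref = m["ref"]
        octave = round(math.log2(s_img / s_ref))
        rot = (o_img - o_ref) % (2 * math.pi)
        rot_bin = int(rot / (2 * math.pi) * angle_bins) % angle_bins
        bins[(m["object"], octave, rot_bin)].append(m)
    return bins


def fit_similarity(matches):
    """Least squares fit of x' = a*x - b*y + tx and y' = b*x + a*y + ty.

    Maps reference positions (x, y) to image positions (x', y') and returns
    (a, b, tx, ty). Needs at least two matches at distinct positions.
    """
    n = len(matches)
    mx = sum(m["ref"][0] for m in matches) / n
    my = sum(m["ref"][1] for m in matches) / n
    ux = sum(m["img"][0] for m in matches) / n
    uy = sum(m["img"][1] for m in matches) / n
    num_a = num_b = den = 0.0
    for m in matches:
        x, y = m["ref"][0] - mx, m["ref"][1] - my
        u, v = m["img"][0] - ux, m["img"][1] - uy
        num_a += x * u + y * v
        num_b += x * v - y * u
        den += x * x + y * y
    a, b = num_a / den, num_b / den
    # Translation carries the reference centroid onto the image centroid.
    tx = ux - (a * mx - b * my)
    ty = uy - (b * mx + a * my)
    return a, b, tx, ty
```

Under these assumptions, the object's scale in the image under consideration is `math.hypot(a, b)`, its rotation is `math.atan2(b, a)`, and its location follows from `(tx, ty)`.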