A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates to image detection, image recognition and other types of computer vision. More particularly, the present invention provides a system with improved ability to classify objects in an image.
Image detection, image recognition and computer vision in general refer to abilities of electronic systems to detect the presence of predefined objects within an image and, if appropriate, to then take actions based upon that detection. Present day applications of these systems include fingerprint, hand or retina identification, identification of faces out of crowd photos or of types and numbers of enemy aircraft from a satellite photo, and similar applications. Some of these systems are “image detection” systems, while others can be “image recognition” systems. An image detection system generally does not need to recognize a specific match, but simply the fact that there is some sort of match; for example, an infrared fingerprint scanner used to open a door lock might simply determine whether a scanned fingerprint is present in a fingerprint directory, without determining exactly which fingerprint in the directory was matched. An image recognition system might go further, e.g., matching a fingerprint with a fingerprint of a specific criminal. These examples are illustrative only, and there are many applications of image detection, image recognition and computer vision.
The aforementioned systems usually require one or more “target” images, which can be acquired from a camera, infrared scanner, stored image, satellite or other source. In addition, these systems also generally require a directory (or database) of known images that a computer is to search for within a target image. The computer can use various methods to perform this searching, including intensity or feature based methods, among others. The computer can first pre-process the target image, to identify “candidate” portions which might contain a match with a directory image. If desired, the computer can then focus on the candidate portions and limit further processing to such portions only.
The processing required of the computer can become quite complex if the directory is large; for example, a typical directory might have several thousand images. Since a target image can feature distortions such as light variation, shading effects, perspective orientation and other circumstances that render detection and recognition difficult, the processes and time required of the computer can be enormous and perhaps too demanding for real-time processing. A common design goal, therefore, is to produce systems which perform very efficiently and are less computationally demanding.
Systems typically use vectors to describe directory images; the vectors can represent individual features, or the vectors can describe other aspects of directory images. Each directory image is represented by a different combination of the vectors, and these systems generally use mathematical shortcuts whereby they can process the target image all at once, for all directory images (instead of searching the target image again and again, once for each directory image).
Some systems which use vectors in this manner use “eigenvectors.” These vectors are derived from pre-processing of all the directory images, and they share the property that each eigenvector is independent from every other eigenvector. In this context, each eigenvector is simply a set of numbers, perhaps as many as four hundred numbers or more, one for each pixel of the largest directory image. For example, if the directory had images that were each twenty pixels by twenty pixels, each directory image can be said to be represented by a vector having four hundred numbers. Eigenvectors would be derived in this instance by a computer subroutine which determines commonalities and differences between different images in the directory, with an eigenvector for each different dimension of image commonalities and differences; the eigenvectors would in this instance also have four hundred numbers and could also be thought of as a sort of building block image. A value related to each eigenvector is an eigenvalue, that is, a single value indicating how strong or significant the associated eigenvector is across all images of the directory. In a different vector system (where, for example, each vector could represent a feature of an image, e.g., a nose, an eye, etc.), an analogous value might be how important the associated feature is in describing the images in the directory.
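By way of illustration only, the derivation of eigenvectors and eigenvalues from a directory of twenty pixel by twenty pixel images might be sketched as follows. The function name `directory_eigenvectors` is hypothetical, and the sketch assumes that the eigenvectors are taken from the uncentered second-moment matrix of the flattened directory images; other derivations (for example, subtracting a mean image first) are equally possible and nothing here is asserted to be the subroutine actually used.

```python
import numpy as np

def directory_eigenvectors(images):
    """Return (eigenvalues, eigenvectors), strongest first.

    images: array of shape (n_images, height, width).
    Assumption: eigenvectors come from the uncentered second-moment
    matrix of the flattened images (no mean image is subtracted).
    """
    n = images.shape[0]
    # Each row is one directory image written as a vector (400 numbers for 20x20).
    X = images.reshape(n, -1).astype(float)
    second_moment = X.T @ X / n
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(second_moment)
    order = np.argsort(eigenvalues)[::-1]
    # Return one eigenvector per row, sorted from strongest to weakest.
    return eigenvalues[order], eigenvectors[:, order].T

# Synthetic stand-in for a directory; real directory images would be used instead.
rng = np.random.default_rng(0)
directory = rng.random((50, 20, 20))
values, vectors = directory_eigenvectors(directory)
building_block = vectors[0].reshape(20, 20)  # an eigenvector viewed as an image
```

Each row of `vectors` has four hundred numbers and can be reshaped into a twenty by twenty "building block" image, while the matching entry of `values` indicates how strong that eigenvector is across the directory.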
A system might then use vectors (such as eigenvectors or other vectors) to perform image recognition and detection as appropriate. For example, the vectors might first be applied to a target image to select candidate portions, that is, objects in the image which are believed to be sufficiently “close” to the directory images so that a match is likely. To perform this processing, typically the strongest or most significant vector is multiplied against different groups of pixels in the target image; if the result is sufficiently large, a match is likely. Image portions screened in this manner can then have the second strongest vector applied, and so on, until it is very likely that screened portions are also found in the directory. If it is desired to perform image recognition (as opposed to just detection), a further pre-processing step exists where the contribution of each vector to a particular directory image is ascertained and stored, as a vector signature for the image. When processing a target image, the target is processed using the vectors and then it is ascertained whether results of processing match specific vector signatures. Many image recognition systems will use both of these processes; that is, a system might first process a target image to screen candidate portions of the target image, and then perform image recognition only for those candidate portions.
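The conventional screening just described, in which the strongest vector is applied first and each further vector narrows the surviving candidate portions, might be sketched as follows. The function name `screen_candidates`, the use of one fixed threshold per vector, and the use of the absolute value of the response are assumptions made purely for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def screen_candidates(target, vectors, thresholds, patch_shape):
    """Return a boolean map of window positions that survive every vector.

    vectors are applied in the order given (strongest first); a window is
    kept only while its response stays at or above that vector's threshold.
    """
    # Every patch_shape-sized group of pixels in the target image.
    windows = sliding_window_view(target.astype(float), patch_shape)
    flat = windows.reshape(windows.shape[0], windows.shape[1], -1)
    keep = np.ones(flat.shape[:2], dtype=bool)
    for vector, threshold in zip(vectors, thresholds):
        response = np.abs(flat @ vector)  # vector multiplied against each window
        keep &= response >= threshold     # a large result means a match is likely
    return keep

# Toy target: a bright 3x3 square on a dark background.
target = np.zeros((8, 8))
target[2:5, 3:6] = 1.0
candidates = screen_candidates(target, [np.ones(9) / 3.0], [2.9], (3, 3))
```

In this toy case only the window whose upper left corner lies at row 2, column 3 survives. The sketch also shows why threshold selection matters in such systems: the outcome depends entirely on the values supplied in `thresholds`.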
One problem with some vector based systems is their tendency to detect false matches. That is to say, some systems will detect matches in a target image where the human eye, by contrast, can readily observe that there is no match. One reason this result occurs is that basis vectors are typically selected only for detecting a match, and usually not for their ability to reject false matches or to screen out non-matches. A related problem concerns vector multiplication, mentioned above, and the application of thresholds to the results to detect matches; strong or significant vectors multiplied against portions of the target can produce high values even though there truly is no “match,” and a suitable threshold is needed to distinguish these results. However, the target image may be produced under different conditions, and thresholds can vary between target images and across portions of each target image. Thus, with some traditional systems, proper threshold selection is important, and errors in threshold selection can result in false matches or failure to detect a match.
What is needed is an image classification system, usable in image recognition and computer vision, which has an improved ability to quickly reject false matches. Ideally, such an image classification system should operate using a quick process (such as vector multiplication), but a process that usually produces large results, except for directory matches which produce zero or near zero results. Further still, such a system should ideally use a relatively small number of vectors, thereby requiring less processing. The present invention satisfies these needs and provides further, related advantages.
The present invention solves the aforementioned needs by providing a system usable in image detection, image recognition and other types of computer vision. Unlike conventional systems where “basis” vectors used for detection are strongly correlated with images sought in a target, and large varying thresholds are used to detect matches, the present invention chooses vectors which are “anti-correlated” with directory images; that is to say, the preferred vectors are ideally strongly correlated with everything appearing in a target image except directory images, and are anti-correlated with directory images. In this manner, a system using the present invention has improved ability to reject false matches and to detect true matches using near zero values, which reduces problems in selecting a proper threshold for each target image. The present invention generally requires fewer vectors for reliable image detection and recognition which, with improved false match rejection, provides for more efficient, faster image detection, image recognition and other computer vision.
A first form of the invention provides a method of classifying a target image or portion thereof using a directory of image examples. This method processes images in the directory to derive a set of component vectors. A set of “basis” vectors is then chosen using these component vectors and applied to the target image to classify at least a portion of that image. This classification can be screening of the target image to identify candidate portions where a match is likely, or image detection or image recognition. To select this set, the method performs an operation upon at least some vectors to determine their strength in the directory. The method also performs an operation upon at least some vectors to determine relative smoothness. The basis vectors are then chosen to include only “weak, smooth” vectors, which are then used to classify images.
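The selection of “weak, smooth” vectors might be sketched as follows, purely as an illustration under stated assumptions. Strength is taken here to be the eigenvalue associated with each vector, and smoothness is estimated with a hypothetical roughness measure (the sum of squared differences between neighboring pixels, where a lower value means a smoother vector). Neither the measure, the names `roughness` and `choose_weak_smooth`, nor the cutoff parameters are taken from this summary.

```python
import numpy as np

def roughness(vector, shape):
    """Hypothetical smoothness measure: lower values mean a smoother vector."""
    image = vector.reshape(shape)
    vertical = np.diff(image, axis=0)    # differences between vertical neighbors
    horizontal = np.diff(image, axis=1)  # differences between horizontal neighbors
    return float(np.sum(vertical ** 2) + np.sum(horizontal ** 2))

def choose_weak_smooth(eigenvalues, eigenvectors, shape, max_strength, count):
    """Return indices of the `count` smoothest vectors among the weak ones.

    A vector counts as weak when its eigenvalue is at most max_strength.
    eigenvectors holds one vector per row.
    """
    weak = [i for i, value in enumerate(eigenvalues) if value <= max_strength]
    weak.sort(key=lambda i: roughness(eigenvectors[i], shape))
    return weak[:count]

# Toy 2x2 example: vector 0 is smooth but strong, vectors 1 and 2 are weak.
toy_values = np.array([10.0, 0.1, 0.2])
toy_vectors = np.array([
    [0.5, 0.5, 0.5, 0.5],    # flat
    [0.5, -0.5, -0.5, 0.5],  # checkerboard, roughest
    [0.5, 0.5, -0.5, -0.5],  # single edge
])
chosen = choose_weak_smooth(toy_values, toy_vectors, (2, 2), max_strength=1.0, count=1)
```

Here the flat vector is excluded for being strong, and of the two weak vectors the one with a single edge is preferred over the checkerboard because it is smoother.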
In more particular aspects of this first form of the invention, the vectors can be eigenvectors which are specifically chosen to produce zero or near zero results when convolved with a match in the target image; in this manner, there is reduced difficulty in computing thresholds. Rather, the system simply identifies convolution results which are very close to zero, through several iterations if desired. The invention can be applied both to image detection systems and image recognition systems, or to other forms of computer vision.
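The near zero test described above might be sketched as follows. The sketch assumes that the chosen vectors are the weakest eigenvectors of the uncentered second-moment matrix of a small synthetic directory, and that a single fixed tolerance close to zero is applied to every vector; the name `near_zero_matches` and the tolerance value are hypothetical.

```python
import numpy as np

def near_zero_matches(patches, basis_vectors, tolerance):
    """Flag patches whose response to every basis vector is close to zero.

    patches: (n_patches, n_pixels); basis_vectors: (n_vectors, n_pixels).
    """
    responses = np.abs(patches @ basis_vectors.T)
    return np.all(responses <= tolerance, axis=1)

# Synthetic directory of five 4x4 images, each flattened to 16 numbers.
rng = np.random.default_rng(0)
directory = rng.random((5, 16))
# eigh returns eigenvalues in ascending order, so the first columns are weakest.
_, eigenvectors = np.linalg.eigh(directory.T @ directory)
weak_vectors = eigenvectors[:, :3].T

clutter = rng.random((5, 16))  # patches that are not in the directory
flags = near_zero_matches(np.vstack([directory, clutter]), weak_vectors, 1e-6)
```

In this toy setting the five directory patches are flagged as matches and the five clutter patches are rejected, without any threshold having to be tuned to the brightness or contrast of the target.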
A second form of the invention provides an improvement in template matching systems, wherein relatively “smooth, weak” vectors are used to classify images. A third form of the invention provides a method for classifying images where vectors are used which produce substantially zero results for matches only, and results which are substantially large (or negative) for other image portions.
The invention may be better understood by referring to the following detailed description, which should be read in conjunction with the accompanying drawings. The detailed description of a particular preferred embodiment, set out below to enable one to build and use one particular implementation of the invention, is not intended to limit the enumerated claims, but to serve as a particular example thereof.