The invention relates generally to the field of digital image processing. More specifically, the invention relates to a method and system for representing the content of an image so that it can be matched with another image containing the same content.
Image matching is a fundamental technique used in computer vision, object recognition, motion tracking, 3D modeling, and the like. Image matching is performed to check whether two images contain the same content. The two images to be compared need not be identical. For example, one image may be rotated or taken from a different viewpoint relative to the other, or it may be a zoomed version of the other. Further, the two images may be taken under different lighting conditions. Despite such variations, the two images contain the same content, scene or object. Therefore, image matching techniques are needed to match images effectively.
Typical image matching algorithms take advantage of the fact that an image of an object or scene contains a number of feature points. Feature points are specific points in an image that are robust to changes in image rotation, scale, viewpoint or lighting conditions. This means that these feature points will often be present in both images, even if the two images differ in the manner described earlier. These feature points are also known as ‘points of interest’. The first stage of an image matching algorithm is therefore to find these feature points in the image. Typically, an image pyramid is constructed to determine the feature points of an image. The image pyramid is a scale-space representation of the image, i.e., it contains various pyramid images, each of which is a representation of the image at a particular scale. The scale-space representation enables the image matching algorithm to match images that differ in overall scale.
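The image pyramid described above can be sketched as repeated smoothing and downsampling. The following is a minimal illustration, assuming numpy; the function names, the number of levels, and the Gaussian parameters are illustrative choices, not the method of the invention.

```python
import numpy as np

def gaussian_blur(img, sigma=1.0):
    # Separable 1-D Gaussian kernel, truncated at roughly 3 sigma.
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    # Convolve rows, then columns; edges are handled by reflection padding.
    blur_1d = lambda v: np.convolve(np.pad(v, radius, mode="reflect"), k, mode="valid")
    img = np.apply_along_axis(blur_1d, 1, img)
    img = np.apply_along_axis(blur_1d, 0, img)
    return img

def build_pyramid(img, num_levels=4):
    """Return a list of progressively blurred, half-sized pyramid images."""
    levels = [img.astype(float)]
    for _ in range(num_levels - 1):
        smoothed = gaussian_blur(levels[-1])
        levels.append(smoothed[::2, ::2])  # downsample by a factor of 2
    return levels
```

Each list entry represents the image at one scale; a feature detector would then be run on every level so that features are found regardless of the overall scale of the input.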
After determining the feature points of all pyramid images in the image pyramid, typical image matching algorithms determine the orientation of each feature point. The orientation of a feature point is computed from the local image gradient at the feature point and is used to obtain invariance to rotation. After the feature points and their orientations are determined, a patch is extracted around each feature point in such a way that the orientation vector forms one axis of the patch's reference frame. The local image gradients on this patch are calculated and transformed into a feature vector representation. The feature vector representation is designed to be invariant to significant local distortion and to changes in illumination, i.e., the feature vector is robust to distortion and to differing lighting conditions.
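A common way to assign an orientation from local gradients is a magnitude-weighted orientation histogram whose peak gives the dominant direction. The sketch below assumes numpy; the window radius and the 36-bin histogram are illustrative parameters and not specified by the text above.

```python
import numpy as np

def dominant_orientation(img, x, y, radius=4, num_bins=36):
    """Return the dominant gradient orientation (radians) around (x, y)."""
    patch = img[y - radius:y + radius + 1, x - radius:x + radius + 1].astype(float)
    gy, gx = np.gradient(patch)          # row and column derivatives
    mag = np.hypot(gx, gy)               # gradient magnitude
    ang = np.arctan2(gy, gx)             # gradient angle in (-pi, pi]
    # Magnitude-weighted histogram over orientation bins.
    bins = ((ang + np.pi) / (2 * np.pi) * num_bins).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=num_bins)
    peak = np.argmax(hist)
    # Return the centre angle of the winning bin.
    return (peak + 0.5) * 2 * np.pi / num_bins - np.pi
```

In practice the histogram peak is often refined by interpolation, and secondary peaks close to the maximum may yield additional orientations for the same feature point, which is one source of the multiple-orientation issue discussed later.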
The feature points, their orientations and their feature vectors over all pyramid images form a complete representation of the image. These representations can be compared across images to find a matching image.
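Comparing such representations typically reduces to matching feature vectors between the two images. A minimal sketch of nearest-neighbour matching with a distance-ratio test follows, assuming numpy; the 0.75 ratio threshold is an illustrative value, and this is a common approach rather than the specific comparison method of the invention.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b,
    keeping only matches whose best distance is clearly smaller than the
    second-best (the distance-ratio test). Returns (index_a, index_b) pairs."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```

Two images are then declared a match when a sufficient number of feature vectors survive this test, often after a geometric consistency check on the matched point locations.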
There are various limitations associated with the existing methods for representation of images. First, the image may contain a large number of feature points. Some of these feature points are less significant than others in the representation of images and unnecessarily increase the complexity of the image matching algorithm. Second, different methods exist for determining the orientation of a feature point, and these methods produce different results. Therefore, no single method can be relied upon to determine the orientation. Further, if two or more orientations are produced for a feature point, they increase the complexity of the image matching algorithm. Third, sampling algorithms used to extract a patch around a feature point are not sensitive to the actual scale. Therefore, these methods do not take into account the fact that patch size should increase as scale increases, even if the image size stays constant. Fourth, patches around the boundary of the image are not extracted because they may extend beyond the image boundary. This is undesirable, since boundary patches often make a significant contribution to the overall image matching algorithm, especially as the image size decreases. Finally, some components of a feature point's feature vector may be large due to an edge passing through the patch, and such a feature vector is not robust to changes in illumination. Existing methods improve robustness by normalizing the feature vector, but normalization alone does not guarantee this robustness numerically.
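To illustrate the last limitation, one widely used mitigation in SIFT-style descriptors is to normalize the vector, clamp unusually large components, and renormalize, so that no single edge response dominates. This is a sketch of that common technique, assuming numpy and nonnegative descriptor entries; the 0.2 clamp threshold is illustrative.

```python
import numpy as np

def normalize_descriptor(vec, clamp=0.2):
    """Unit-normalize, clamp large components, then renormalize.
    Assumes nonnegative entries (e.g., gradient magnitudes)."""
    v = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(v)
    if norm > 0:
        v = v / norm
    v = np.minimum(v, clamp)     # limit the influence of any one component
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

After clamping and renormalization, a component that dominated the raw vector contributes far less, which improves robustness to strong edges and illumination changes, though it still does not provide a numerical guarantee.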
There exists a need for an improved image-representation method that overcomes the limitations described above.