Any image that is not a single uniform color contains at least one edge, and usually many. An edge is an abrupt change in the brightness of the image. For example, if an image contains a red airplane in a gray, cloudy sky, there are edges where the outline of the airplane meets the gray sky. Because the brightness difference between the airplane and the sky is likely to be large, these edges are considered strong. There is also an edge at the border of each gray-white cloud against the gray sky, but such edges will likely be considered weak because the two regions are not very distinct in brightness. Because edges can be represented mathematically, they can be used for image comparison as well as for other image-based purposes.
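The distinction between strong and weak edges can be illustrated with a minimal sketch that works on a single scan line of brightness values. The function names, the pixel values, and the two thresholds (`weak=10`, `strong=60`) are hypothetical and chosen only for illustration; they do not come from any of the cited methods.

```python
def edge_strengths(row):
    """Absolute brightness difference between each pair of neighboring pixels."""
    return [abs(b - a) for a, b in zip(row, row[1:])]


def classify_edges(row, weak=10, strong=60):
    """Label each neighbor-to-neighbor transition as 'none', 'weak', or 'strong'.

    The thresholds are illustrative assumptions, not standard values.
    """
    labels = []
    for s in edge_strengths(row):
        if s >= strong:
            labels.append("strong")
        elif s >= weak:
            labels.append("weak")
        else:
            labels.append("none")
    return labels


# Hypothetical scan line: gray sky (120), a slightly brighter cloud (140),
# gray sky again, then a much darker airplane region (30), then sky.
row = [120, 120, 140, 140, 120, 120, 30, 30, 120]
labels = classify_edges(row)
```

In this sketch the cloud borders produce a brightness step of 20 and are labeled weak, while the airplane borders produce a step of 90 and are labeled strong.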
Mallat and Zhong teach a method of edge-wavelet multiresolution analysis for image representation and compression in their paper, “Characterization of signals from multiscale edges,” published in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 7, pp. 710-732, 1992. Image compression is achieved in their work by finding and keeping only the local extrema of the edge-wavelet coefficients. Detecting edges by finding the local extrema of image gradients is commonly known as Canny's edge detector, proposed by J. Canny in his paper, “A computational approach to edge detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 6, pp. 679-698, November 1986. Mallat and Zhong also teach a method of image reconstruction from edge-wavelet extrema by iterative projection onto feasible solution sets. The iterative projection method is too computationally intensive, and hence too time-consuming, to be of practical use on commonly available image processing computers such as personal computers. It currently takes several hours to process a single image using the iterative projection method.
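The idea of keeping only the local extrema of a gradient can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions, not Canny's full detector (which also includes smoothing and hysteresis thresholding) and not the edge-wavelet transform of Mallat and Zhong. The function names and the `min_magnitude` parameter are hypothetical.

```python
def gradient(signal):
    """Central-difference derivative; the two endpoints are set to 0."""
    g = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        g[i] = (signal[i + 1] - signal[i - 1]) / 2.0
    return g


def local_extrema(grad, min_magnitude=1.0):
    """Indices where |grad| is a local maximum and at least min_magnitude."""
    mag = [abs(v) for v in grad]
    keep = []
    for i in range(1, len(mag) - 1):
        is_peak = mag[i] > mag[i - 1] and mag[i] >= mag[i + 1]
        if is_peak and mag[i] >= min_magnitude:
            keep.append(i)
    return keep


# Hypothetical brightness ramp from a dark region (10) to a bright one (92).
signal = [10, 10, 12, 30, 70, 90, 92, 92]
grad = gradient(signal)
edges = local_extrema(grad)
```

Although the brightness changes over several samples, only the single position where the gradient magnitude peaks is kept, which is what makes the extrema a compact description of the edge.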
Previous attempts at image comparison and matching have many drawbacks. Many prior attempts did not take a transform of the image; they simply took the signs of coefficients from each image and compared the signs of the two images, adding to a running total for every match to produce a score. Depending on the score, the images either were or were not considered a match. In addition to being computationally expensive, prior methods typically focus on one aspect of the images and ignore the rest. For example, the method of Mallat and Zhong described above focuses only on the edges of the images. Thus, if two images have similar edges, they are viewed as matching. However, there are clear problems with using only edges for matching. For instance, a person's head and a volleyball are both round and similar in size, and thus will have similar edges. When a person is searching for an image of a volleyball, the search would be ineffective if images of people were returned as matches.
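The sign-comparison scoring described above can be sketched as follows. This is only an illustration of the general idea; the function names, the treatment of zero as its own sign, and the threshold value are assumptions and are not taken from any specific prior method.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_match_score(coeffs_a, coeffs_b):
    """Count positions where the two coefficient lists have the same sign."""
    if len(coeffs_a) != len(coeffs_b):
        raise ValueError("coefficient lists must have the same length")
    return sum(1 for a, b in zip(coeffs_a, coeffs_b) if sign(a) == sign(b))


def is_match(coeffs_a, coeffs_b, threshold):
    """Declare a match when the running total reaches the threshold."""
    return sign_match_score(coeffs_a, coeffs_b) >= threshold


# Hypothetical coefficient values for two images.
image_a = [3, -1, 0, 5, -2]
image_b = [1, -4, 2, 6, 3]
score = sign_match_score(image_a, image_b)
```

Because the score depends only on signs, two images whose coefficients differ greatly in magnitude can still score as a match, which is one way such a comparison ignores much of the image content.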