Minimum edit distance is a well-known and powerful concept widely used in comparing strings of data, i.e., data that is capable of being represented as a one-dimensional string of symbols, such as text, DNA sequences or protein sequences.
The minimum edit distance between two strings, also known as the Levenshtein distance, is defined as the minimum weighted sum of the deletions, insertions and substitutions required to transform one string into the other. The minimum edit distance provides a measure of how similar the two strings are to each other: the smaller the distance, the more similar the strings.
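The definition above can be illustrated with a minimal sketch of the standard dynamic-programming calculation of a string edit distance. The function name edit_distance and the parameters del_cost, ins_cost and sub_cost are assumptions introduced here for illustration only and do not appear in the cited references; with all three weights set to one, the result is the classic Levenshtein distance.

```python
def edit_distance(source, target, del_cost=1, ins_cost=1, sub_cost=1):
    """Minimum weighted sum of deletions, insertions and substitutions
    needed to transform `source` into `target`.

    The weight parameters are illustrative assumptions; unit weights
    give the classic Levenshtein distance.
    """
    # prev[j] is the cost of turning the empty prefix of `source`
    # into the first j symbols of `target` (j insertions).
    prev = [j * ins_cost for j in range(len(target) + 1)]

    for i, s in enumerate(source, start=1):
        # Turning the first i symbols of `source` into the empty
        # string takes i deletions.
        curr = [i * del_cost]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + del_cost,                        # delete s
                curr[j - 1] + ins_cost,                    # insert t
                prev[j - 1] + (0 if s == t else sub_cost)  # match or substitute
            ))
        prev = curr

    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(edit_distance("kitten", "sitting", sub_cost=2))
```

Only two rows of the cost table are kept at any time, so the sketch uses memory proportional to the length of the target string while still examining every pair of symbol positions.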
Comparing strings of data by finding the minimum edit distance between them has been found to be useful in a wide variety of problems including, but not limited to, searching in texts, correcting spelling, matching DNA sequences, speech recognition and handwriting analysis. As a result, there is a rich array of related techniques and algorithms, including methods for rapidly calculating minimum edit distances.
Use of the minimum edit distance technique in searching and matching DNA and protein sequences is described in detail in, for instance, U.S. Pat. No. 5,701,256 issued to Marr et al. on Dec. 23, 1997 entitled “Method and apparatus for biological sequence comparison”, the contents of which are hereby incorporated by reference. The application of minimum edit distance techniques to text searching and matching, particularly spelling correction, is described in, for instance, U.S. Pat. No. 6,616,704 issued to Birman, et al. on Sep. 9, 2003 entitled “Two step method for correcting spelling of a word or phrase in a document” the contents of which are hereby incorporated by reference.
Attempts have been made to extend the minimum edit distance concept to images to solve problems in comparing images. One approach to obtaining an image edit distance, for instance, has been to first create a histogram of hues from each image. An edit distance may then be determined between the histograms of the images as described in, for instance, a technical report by S. Cha et al. entitled “Algorithm for the edit distance between angular type histograms” in the proceedings of the International Society for Optical Engineering (SPIE) conference on Storage and Retrieval for Media Databases 2003 (Santa Clara, Calif., 22-23 Jan. 2003), the contents of which are hereby incorporated by reference. Another approach to creating an image edit distance has been to first create containment graphs representing features within an image and their spatial relationship to each other, and to then obtain edit distances between the graphs, as described in, for instance, a paper by Kailing et al. entitled “Content-Based Image Retrieval Using Multiple Representations” published in the Proc. 8th Int. Conf. on Knowledge-Based Intelligent Information and Engineering Systems (KES'04), Wellington, New Zealand, LNAI 3214, pp. 982-988, 2004, the contents of which are hereby incorporated by reference.
The histogram-based attempts at creating image edit distances do not adequately reflect the spatial relations of features within an image, while the graph-based efforts tend to be difficult to generalize and automate. In effect, these attempts at extending the edit distance to images begin by transforming the image into a one-dimensional representation and then apply the edit distance to that one-dimensional representation.
What is needed is a system and method for determining an edit distance between arbitrary images that adequately reflects the spatial relations of features within the images, that can be calculated automatically and easily, and that is preferably not dependent on first transforming the image into a one-dimensional representation. Such an image edit distance is likely to be of great value in comparing images and useful in applications such as, but not limited to, pattern or object recognition, including face recognition, image classification and video tagging.