This patent relates to the fields of visual pattern recognition, object recognition, image registration, and the matching of spatial patterns in two and higher dimensions.
S. E. Palmer, “Vision Science, Photons to Phenomenology”, MIT Press, Cambridge Mass., 1999.
S. Ullman, “High-level Vision: Object recognition and visual cognition”, MIT Press, Cambridge Mass., 1996.
H. Bunke and B. T. Messmer, “Recent advances in graph matching”, Int. J. Patt. Recog. Art. Intell., Vol. 11, No. 1, pp. 169-203, February 1997.
J. R. Ullmann, “An Algorithm for Subgraph Isomorphism”, J. Assoc. Comput. Mach., Vol. 23, No. 1, pp. 31-42, 1976.
L. Shapiro and R. M. Haralick, “A metric for comparing relational descriptions”, IEEE Patt. Anal. Mach. Int., Vol. 7, pp. 90-94, 1985.
J. E. Hummel and I. Biederman, “Dynamic binding in a neural network for shape recognition”, Psych. Rev., Vol. 99, pp. 480-517, 1992.
N. Ahuja, “Dot pattern processing using Voronoi Neighbourhoods”, IEEE Patt. Anal. Mach. Int., Vol. 4, pp. 336-343, 1982.
Y.-W. Chiang and R.-C. Wang, “Seal identification using the Delaunay tessellation”, Proc. Nat. Sci. Council, Rep. of China, Part A: Physical Sci. and Eng., Vol. 22, No. 6, pp. 751-757, November 1998.
G. Weber, L. Knipping, and H. Alt, “An application of point pattern matching in astronautics”, J. Symb. Comp., Vol. 17, No. 4, pp. 321-340, 1994.
M. Sambridge, J. Braun, and H. McQueen, “Geophysical parameterization and interpolation of irregular data using natural neighbors”, Geophysical Journal International, Vol. 122, pp. 837-857, 1995.
Much work has been done in object recognition. S. E. Palmer's “Vision Science, Photons to Phenomenology”, MIT Press, Cambridge Mass., 1999, and S. Ullman's “High-level Vision: Object recognition and visual cognition”, MIT Press, Cambridge Mass., 1996, contain more specific references on the basic pattern recognition topics mentioned in this section.
Some useful pattern recognition algorithms are based on template matching. A basic description of template matching is as follows. A ‘square’ template is a binary array: the edges of the square are given a value of ‘1’ and the background and central portion of the square are given a value of ‘0’. The edges of an input object are found and compared to the ‘square’ template. If the position, shape, orientation, and scale of the input object are very near to those of the ‘square’ template, then a correlation of the input object with the ‘square’ template yields a high value, indicating a match. If the input object is a square but is at a slightly different position, orientation, or scale, template matching fails. How to match identical patterns that are transformed in position, rotation, and scale is a currently unsolved problem in the field of pattern recognition.
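The sensitivity of template matching described above can be illustrated with a minimal sketch. The grid size, the helper names `square_outline` and `correlation`, and the choice of score (the fraction of template edge pixels also set in the input) are assumptions made for illustration only; they are not taken from any cited work.

```python
def square_outline(size, top, left, side):
    """Return a size x size binary grid with the outline of a square set to 1."""
    grid = [[0] * size for _ in range(size)]
    for k in range(side):
        grid[top][left + k] = 1               # top edge
        grid[top + side - 1][left + k] = 1    # bottom edge
        grid[top + k][left] = 1               # left edge
        grid[top + k][left + side - 1] = 1    # right edge
    return grid


def correlation(image, template):
    """Fraction of the template's edge pixels that are also set in the image."""
    overlap = sum(i * t
                  for row_i, row_t in zip(image, template)
                  for i, t in zip(row_i, row_t))
    total = sum(sum(row) for row in template)
    return overlap / total


template = square_outline(16, 4, 4, 8)
same = square_outline(16, 4, 4, 8)       # identical position and scale
shifted = square_outline(16, 6, 6, 8)    # same square moved by two pixels
scaled = square_outline(16, 3, 3, 10)    # same centre, slightly larger side

score_same = correlation(same, template)
score_shifted = correlation(shifted, template)
score_scaled = correlation(scaled, template)
print(score_same, score_shifted, score_scaled)
```

In this sketch the identical square scores 1.0, while the shifted and scaled squares overlap the template outline at only a few pixels or none, so their scores fall close to zero even though the input is still a square.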
Other object recognition algorithms have been based on analysis of the Fourier spectra of objects. These algorithms are rotation invariant, but are not scale invariant. A specific problem with this approach is that often the entire image is used and objects are not segmented prior to analysis. Thus information about multiple objects is sometimes included in the Fourier spectra. Furthermore, if half of the object is occluded, this method of shape matching fails because large scale features, or low frequency components, will be different from those stored in the matching template.
One can also approximate the shape of an object using Fourier components. Fourier components are able to represent single closed contours efficiently using the coefficients of periodic functions. A measure for the distance between the two vectors containing those coefficients is used to determine a match. This approach is rotation invariant, but not scale invariant. Furthermore, it does not represent complex two dimensional objects readily.
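As a hedged illustration of the contour-based approach, the sketch below computes one common form of Fourier descriptor: the magnitudes of the low-order discrete Fourier coefficients of a closed contour treated as a sequence of complex numbers. The function names, the test contour, and the Euclidean distance measure are assumptions chosen for the example, not the formulation of any particular reference.

```python
import cmath
import math


def fourier_magnitudes(points, count=8):
    """Magnitudes of the first `count` non-DC Fourier coefficients of a closed contour."""
    n = len(points)
    z = [complex(x, y) for x, y in points]
    mags = []
    for k in range(1, count + 1):            # skipping k = 0 removes the dependence on position
        coeff = sum(z[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
        mags.append(abs(coeff))
    return mags


def distance(a, b):
    """Euclidean distance between two coefficient vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def transform(points, angle=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Rotate, scale, and translate a list of (x, y) points."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in points]


# Hypothetical test contour: a three-lobed closed curve sampled at 64 points.
contour = []
for m in range(64):
    t = 2 * math.pi * m / 64
    r = 1.0 + 0.3 * math.cos(3 * t)
    contour.append((r * math.cos(t), r * math.sin(t)))

base = fourier_magnitudes(contour)
rotated = fourier_magnitudes(transform(contour, angle=0.7, shift=(5.0, -2.0)))
scaled = fourier_magnitudes(transform(contour, scale=2.0))

d_rotated = distance(base, rotated)   # close to zero: rotation only changes coefficient phase
d_scaled = distance(base, scaled)     # clearly non-zero: scaling multiplies every magnitude
print(d_rotated, d_scaled)
```

Rotation multiplies every coefficient by the same unit-magnitude factor, so the magnitude vector is unchanged, whereas a change of scale multiplies every magnitude and therefore changes the distance. This is consistent with the statement above that the approach is rotation invariant but not scale invariant.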
Another approach uses structural descriptions. Structural descriptions are interesting because they have the potential to perform position, rotation, and scale invariant pattern recognition. They consist of graphs in which the nodes indicate some feature of the object and the connections between nodes indicate spatial relationships such as up, down, right, left, top, bottom, and middle. For example, S. E. Palmer, in his book “Vision Science, Photons to Phenomenology”, MIT Press, Cambridge Mass., 1999, p. 394, presents a structural graph for the letter ‘A’. The graph for this object contains 12 nodes and 14 edges. This is a surprisingly complex graph for such a simple object. The relationship between the complexity of the object and the size of the resulting graph is unknown. Furthermore, one can construct different graphs for the same object. To date, there is no consistent and automatic method of generating these structural graphs.
Another complication of structural graphs is that matching graphs is itself a complex problem. For examples of various approaches to the graph matching problem in the computer vision domain, see H. Bunke and B. T. Messmer, “Recent advances in graph matching”, Int. J. Patt. Recog. Art. Intell., Vol. 11, No. 1, pp. 169-203, February 1997; J. R. Ullmann, “An Algorithm for Subgraph Isomorphism”, J. Assoc. Comput. Mach., Vol. 23, No. 1, pp. 31-42, 1976; and L. Shapiro and R. M. Haralick, “A metric for comparing relational descriptions”, IEEE Patt. Anal. Mach. Int., Vol. 7, pp. 90-94, 1985. All of these works contain algorithms that are complex and time consuming for objects with more than a few edges and nodes.
Other researchers have tried to encode spatial relationships with neural networks. J. E. Hummel and I. Biederman, “Dynamic binding in a neural network for shape recognition”, Psych. Rev., Vol. 99, pp. 480-517, 1992, use neural networks to encode spatial relationships between geons in the RBC (recognition by components) theory of object recognition. These neural networks encode features such as location, orientation, and scale. While this system works for simple objects, it is unclear how it performs on multiple objects that contain many features, because the system has limited capability in representing different locations, orientations, and scales. The system may therefore fail when making fine distinctions between objects with large numbers of features.
Several other researchers use a rigid mathematical procedure (e.g. Voronoi tessellation, Delaunay triangulation, Gabriel graphs, and minimal spanning trees) to create graphs from feature points and then perform graph matching. N. Ahuja, in his paper “Dot pattern processing using Voronoi Neighbourhoods”, IEEE Patt. Anal. Mach. Int., Vol. 4, pp. 336-343, 1982, suggests a relaxation labeling method to perform matching of dot patterns using Voronoi tessellation, a method both complex and time consuming for large objects. In Y.-W. Chiang and R.-C. Wang, “Seal identification using the Delaunay tessellation”, Proc. Nat. Sci. Council, Rep. of China, Part A: Physical Sci. and Eng., Vol. 22, No. 6, pp. 751-757, November 1998, the authors match histograms of the areas of the resulting Delaunay triangles to recognize Chinese seals. In G. Weber, L. Knipping, and H. Alt, “An application of point pattern matching in astronautics”, J. Symb. Comp., Vol. 17, No. 4, pp. 321-340, 1994, the authors use a Delaunay triangulation between stars and then look for matches in the slope and length of edges between pairs of stars; their algorithm is therefore not rotation or scale invariant. A similar matching method is proposed in N. P. Chotiros' U.S. Pat. No. 4,891,762 entitled “Method and apparatus for tracking, mapping and recognition of spatial patterns”. Chotiros' algorithm matches only the lengths of edges in a Delaunay triangulation and thus achieves invariant matching with respect to rotation, but not with respect to scale.
In U.S. Pat. No. 6,181,806 issued to Kado et al., entitled “Apparatus for identifying a person using facial features”, a method is described for recognizing faces based on matching the brightness of patches of an object with those stored in memory. The authors use triangles as the patches, but they do not use angle information or adjacency relationships to perform spatial matching of a triangular network. Their algorithm is not scale or rotation invariant.
In Akira et al., U.S. Pat. No. 4,783,829 entitled “Pattern Recognition Apparatus”, the authors propose a method which matches polygons that are similar in area, thus creating an algorithm that is not scale invariant.
My method is based on matching of polyhedra and their appropriate adjacent and neighboring polyhedra in a tessellation of feature points. For concreteness, let the tessellation be the Delaunay triangulation in two dimensions and let us consider image data. Feature points consist of the position of the feature and labels describing that feature. The labels can be ‘corner’, ‘line termination’, or any other description of an image feature. The method then performs a search for matching triangles between the Delaunay triangulation of input feature points and the Delaunay triangulation of template feature points stored in memory. Triangles match if there is sufficient similarity in the angles and labels associated with each node, or feature point, in the triangle. Matching a single triangle represents the significant event of matching three node labels and their spatial relationships. A positive match between the angles and nodes of one triangle and its three adjacent triangles results in a total of twelve matching angles (four triangles with three angles each) and six node labels (the three nodes of the central triangle plus one outer node from each adjacent triangle). This is a highly discriminating and therefore useful method for performing spatial pattern recognition. Additional neighboring triangles can be examined for increased confidence of match. This method is capable of performing spatial pattern matching independent of transformations in position, rotation, and scale. Most other pattern recognition techniques do not have these properties. Lastly, a high probability of match can be achieved using a small set of adjacent and neighboring triangles, and the method is capable of yielding a positive match even if the object is significantly occluded.
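The single-triangle comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the three-degree angle tolerance, the use of a counter-clockwise node ordering, and the example coordinates are all hypothetical, and the Delaunay triangulation and the search over adjacent triangles are omitted.

```python
import math


def interior_angles(p0, p1, p2):
    """Interior angles in radians at p0, p1, and p2 of a triangle."""
    def angle_at(a, b, c):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return abs(math.atan2(cross, dot))
    return (angle_at(p0, p1, p2), angle_at(p1, p2, p0), angle_at(p2, p0, p1))


def describe(triangle):
    """Turn [(x, y, label), ...] into a counter-clockwise list of (label, angle) pairs."""
    (x0, y0, _), (x1, y1, _), (x2, y2, _) = triangle
    signed_area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if signed_area < 0:                       # enforce a consistent winding order
        triangle = [triangle[0], triangle[2], triangle[1]]
    angles = interior_angles(*[(x, y) for x, y, _ in triangle])
    return [(label, ang) for (_, _, label), ang in zip(triangle, angles)]


def triangles_match(tri_a, tri_b, tol=math.radians(3.0)):
    """True if labels agree and angles agree within `tol` for some cyclic alignment of nodes."""
    a, b = describe(tri_a), describe(tri_b)
    for shift in range(3):
        aligned = b[shift:] + b[:shift]
        if all(la == lb and abs(aa - ab) <= tol
               for (la, aa), (lb, ab) in zip(a, aligned)):
            return True
    return False


def similarity(triangle, angle, scale, shift):
    """Apply a rotation, uniform scale, and translation to a labeled triangle."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1], label) for x, y, label in triangle]


template = [(0.0, 0.0, "corner"), (4.0, 0.0, "line termination"), (1.0, 3.0, "corner")]
moved = similarity(template, math.radians(40.0), 2.5, (10.0, -3.0))
moved = [moved[1], moved[0], moved[2]]        # node order in the input is arbitrary
relabeled = [(x, y, "corner") for x, y, _ in template]
reshaped = [(0.0, 0.0, "corner"), (4.0, 0.0, "line termination"), (1.0, 1.0, "corner")]

match_moved = triangles_match(template, moved)
match_relabeled = triangles_match(template, relabeled)
match_reshaped = triangles_match(template, reshaped)
print(match_moved, match_relabeled, match_reshaped)
```

Because interior angles are unchanged by translation, rotation, and uniform scaling, the transformed triangle still matches the template, while a change in a node label or in the triangle's shape breaks the match. Extending the same comparison to the three adjacent triangles would give the twelve-angle, six-label test described above.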
There are many ways to create tessellations. Several common forms are the Voronoi tessellation, the Delaunay triangulation, and the Delaunay triangulation with constraints, such as minimum weight triangulation, angular balanced triangulation, and area balanced triangulation. Many other forms of tessellation are grouped under the generic labels of unstructured grid generation, mesh generation, triangulations based on quadtrees, polyhedralization based on octrees, quadrangulation, hierarchical adaptive tessellation, adaptive polygonal tessellation, and Steiner tetrahedralization. Many of these algorithms have equivalents in three and higher dimensions.