The present invention relates to automatic object recognition and, more particularly, but not exclusively, to a method and apparatus for determining similarity between surfaces, e.g., for matching purposes.
Matching of surfaces has recently become an important task of computer vision and is needed in a variety of applications, such as biometric, security and medical applications. In security systems, for example, surface matching can be used for face recognition, so as to grant or deny access to select individuals, to alert when an individual is recognized, or to track an individual as the individual travels amongst a plurality of people. In like manner, home automation systems can employ surface matching to distinguish among residents of a home, so that the features of the system can be customized for each resident. In medical applications, surface matching can be employed for registration of an image scan so as to provide a basis for localizing or tracking a medical instrument with respect to anatomical features or other elements in the image.
Humans have a remarkable ability to identify objects, such as faces of individuals, in a rapid and seemingly effortless fashion. This ability develops over several years of childhood and results in the capacity to recognize thousands of faces and other objects throughout a lifetime. This skill is quite robust, and allows humans to correctly identify others despite changes in appearance, such as aging, hairstyle, facial hair and expression.
For decades, building automatic systems to duplicate human face identification capability has been an attractive goal for many academic researchers and commercial companies around the world. Various attempts in the past were hampered by a lack of appropriate image acquisition means, efficient identification algorithms with required accuracy, and computation power to implement such algorithms.
In general, modern object recognition approaches can be divided into two wide categories: 2D approaches, using only image information (which can be either grayscale or color), and 3D approaches, which incorporate three-dimensional information as well.
While simpler in data acquisition (which permits real-time surveillance applications, such as face recognition from a video-taped crowd in public places), the 2D approach suffers from sensitivity to illumination conditions and object orientation. Since the image represents the light reflected from the object's surface at a single observation angle, different illumination conditions can result in different images, which are likely to be recognized as different objects.
The 3D approach provides face geometry information, which is typically independent of viewpoint and lighting conditions, and as such is complementary to the two-dimensional image. Three-dimensional information carries the actual geometry of the surface of the object, including depth information which allows easy segmentation of the surface from the background. The fundamental question in the 3D approach is how to efficiently yet accurately quantify the similarity between a given reference surface (a model), and some other surface (a probe), which is potentially a deformed version of the model.
Typically, such quantification is achieved by calculating the so-called “distance” between the model and the probe. It is desired, typically, to capture the distinction of the intrinsic properties of the model and probe (which are associated with their metric structure), while ignoring the extrinsic properties that describe the way the surfaces deform. A deformation that preserves the intrinsic structure of the surface is referred to as “an isometry”. In Euclidean spaces, for example, translation, rotation and reflection of a body are isometries, because they are not associated with the structure of the body. In non-Euclidean spaces, isometries may also include bending of surfaces.
U.S. Pat. No. 6,947,579, the contents of which are hereby incorporated by reference, discloses a three-dimensional face recognition technique in which three-dimensional topographical data of a geometric body in a form of a triangulated manifold is converted into a series of geodesic distances between pairs of points of the manifold. A bending invariant representation of the geometric body is then provided by forming a low dimensional Euclidean representation of the geodesic distances. The bending invariant representation is suitable for matching the geometric body with other geometric bodies. The matching is performed by calculating the distance between the respective geometric bodies, based on the bending invariant representations thereof. The presence or absence of a match is determined by thresholding the calculated distance.
The above approach is based on known algorithms such as multidimensional scaling (MDS), dimensionality reduction and texture mapping (see, e.g., A. Elad and R. Kimmel, “On bending invariant signatures for surfaces,” IEEE Trans. PAMI, 2003, 25(10):1285-1295; S. T. Roweis and L. K. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, 2000, 290(5500):2323-2326; and G. Zigelman, R. Kimmel and N. Kiryati, “Texture mapping using surface flattening via multi-dimensional scaling,” IEEE Trans. Visualization and Computer Graphics, 2002, 9(2):198-207).
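By way of a non-limiting illustration, the bending invariance on which the above approach relies can be sketched in a few lines of Python. The sketch below is a simplification and is not the method of the cited patent: it uses a sampled planar curve in place of a triangulated manifold, approximates geodesic distances by summing the lengths of consecutive segments, omits the MDS step, and compares the geodesic distance matrices directly. All names (flat_curve, bent_curve, is_match) and the threshold value are hypothetical and chosen only for the example.

```python
import math

def flat_curve(n, h):
    """n samples with spacing h along a straight segment."""
    return [(i * h, 0.0) for i in range(n)]

def bent_curve(n, h, radius):
    """The same samples after bending the segment onto a circle,
    keeping the arc-length spacing h (an isometric deformation)."""
    return [(radius * math.sin(i * h / radius),
             radius * (1.0 - math.cos(i * h / radius))) for i in range(n)]

def euclidean_matrix(points):
    """Extrinsic (straight-line) distances between all pairs of samples."""
    return [[math.dist(p, q) for q in points] for p in points]

def geodesic_matrix(points):
    """Approximate intrinsic distances: length of the polyline joining
    two samples along the curve (prefix sums of segment lengths)."""
    prefix = [0.0]
    for p, q in zip(points, points[1:]):
        prefix.append(prefix[-1] + math.dist(p, q))
    n = len(points)
    return [[abs(prefix[i] - prefix[j]) for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    """Largest entry-wise discrepancy between two distance matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def is_match(dist_a, dist_b, threshold):
    """Declare a match when the discrepancy falls below the threshold."""
    return max_abs_diff(dist_a, dist_b) < threshold

model = flat_curve(50, 0.1)
probe = bent_curve(50, 0.1, radius=1.0)

geo_gap = max_abs_diff(geodesic_matrix(model), geodesic_matrix(probe))
euc_gap = max_abs_diff(euclidean_matrix(model), euclidean_matrix(probe))

THRESHOLD = 0.05  # assumed value, for illustration only
print("geodesic discrepancy:", geo_gap)
print("Euclidean discrepancy:", euc_gap)
print("match on geodesic distances:",
      is_match(geodesic_matrix(model), geodesic_matrix(probe), THRESHOLD))
print("match on Euclidean distances:",
      is_match(euclidean_matrix(model), euclidean_matrix(probe), THRESHOLD))
```

In this sketch the geodesic distances change only by the small sampling error introduced by replacing arcs with chords, whereas the Euclidean distances change substantially, so that only the intrinsic comparison treats the bent probe as a match for the model.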
Also known are methods in which graph theory is employed in the context of representation of discrete metric spaces [N. Linial, E. London and Y. Rabinovich, “The geometry of graphs and some of its algorithmic applications,” Combinatorica, 1995, 15(2):333-344]. A graph is embedded in Euclidean space such that the distances between nodes in the graph are close to the Euclidean distances between the vectors representing those nodes.
However, when dealing with abstract metric spaces like graphs, or even point clouds with local distances, smoothness of the underlying geometry cannot be assumed, and the embedding is a much harder problem. One of the caveats of the Euclidean embedding approach is the fact that such an approach introduces a metric distortion, because the geodesic distances of a Riemannian surface generally cannot be perfectly represented in a finite-dimensional Euclidean space. One of the simplest examples is the K1,3 graph (a central node joined to three leaf nodes), which cannot be embedded in Euclidean space without distortion.
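The K1,3 example can be checked numerically. The following Python sketch is offered only as an illustration and assumes unit edge lengths and shortest-path distances on the graph. It relies on the standard criterion that a finite metric space admits a distortion-free Euclidean embedding only if the Gram matrix derived from its squared distances is positive semidefinite. The function names are hypothetical.

```python
def star_graph_distances():
    """Shortest-path distances of K1,3 with unit edges:
    node 0 is the centre, nodes 1..3 are the leaves."""
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # centre-leaf pairs are 1 apart, leaf-leaf pairs are 2 apart
                d[i][j] = 1.0 if 0 in (i, j) else 2.0
    return d

def gram_from_distances(d, origin=0):
    """Gram matrix of the remaining points, taking `origin` as the origin:
    G[i][j] = (d(o,i)^2 + d(o,j)^2 - d(i,j)^2) / 2."""
    others = [i for i in range(len(d)) if i != origin]
    return [[(d[origin][i] ** 2 + d[origin][j] ** 2 - d[i][j] ** 2) / 2.0
             for j in others] for i in others]

def quadratic_form(g, x):
    """Evaluate x^T G x."""
    n = len(g)
    return sum(x[i] * g[i][j] * x[j] for i in range(n) for j in range(n))

D = star_graph_distances()
G = gram_from_distances(D)

# A positive semidefinite matrix satisfies x^T G x >= 0 for every x.
# The all-ones vector is a witness that G violates this condition.
witness = [1.0, 1.0, 1.0]
value = quadratic_form(G, witness)

print("Gram matrix:", G)
print("x^T G x for x = (1, 1, 1):", value)
```

Here the Gram matrix has ones on the diagonal and minus ones elsewhere, and the quadratic form evaluates to -3 for the all-ones vector. Since the value is negative, the Gram matrix is not positive semidefinite, and no set of four points in a Euclidean space of any dimension reproduces these distances exactly.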
There is thus a widely recognized need for, and it would be highly advantageous to have, a method and apparatus for determining similarity between surfaces devoid of the above limitations.