1. Technical Field
The invention is related to generating a representation of an object, and more particularly to a system and process for generating representations of objects that are substantially invariant in regard to translation, scale and optionally rotation.
2. Background Art
In recent years, computer vision and graphics research has witnessed an increasing need for the ability to compare three dimensional objects. Most of the early object recognition techniques focused on comparing the 2D images of unknown objects with stored views of known objects. Progress in 3D object model acquisition techniques such as laser range finders and real-time stereo machines led to the problem of comparing 3D object models created using range images or 3D view-independent geometry. Object comparison is a key technique in applications such as shape similarity based 3D object retrieval, matching, recognition and categorization [2, 4, 10], which will be increasingly required as 3D modeling becomes more and more popular.
Comparison between 3D objects is usually based on object representations. Typically, a 3D object is represented by a geometric model, appearance attributes, and optionally annotations. The geometric model represents the shape of a 3D object, which is the central part of the object representation. These models are usually obtained via 3D sensors, Computer Aided Design (CAD), stereo or shape-from-X techniques. There are many specific representations for geometric models, such as boundary representations, voxel representations, Constructive Solid Geometry (CSG) trees, point clouds, range images and implicit functions. The aforementioned appearance attributes include color, texture and Bidirectional Reflectance Distribution Functions (BRDFs), which are of interest for image synthesis in computer graphics and rendering based vision applications. As for annotations, these include other attributes describing an object at a semantic level and provide an efficient and effective way to retrieve objects from a 3D database. For example, a car model can be easily retrieved using the keyword “car”, if such an annotation is provided a priori. However, it is not reasonable to assume that all objects in the database have such annotations, since some objects in a 3D database may not have been annotated when they were created, and it is extremely difficult to automatically annotate 3D objects. In addition, manual labeling is very laborious if the database is large.
The appearance of a 3D object provides a large amount of information for human perception, but it is very difficult to incorporate appearance in object comparison techniques. The current research on comparing 3D objects is focused on comparing their shapes. However, the geometric model for the shape in current 3D object representation schemes is usually developed for specific tasks such as modeling, editing and rendering, and is not well suited for comparison purposes. First, there are many types of geometric representations, and it is difficult to compare the geometric models created with different representations without some form of conversion. Second, the geometric representation for the 3D shape is usually not invariant to scaling or rigid transformations. For example, the same shape may be represented differently in two coordinate systems. Therefore, a shape descriptor is usually extracted from the geometric model, and used for object comparison. Ideally, these descriptors should be invariant to scale and rigid transformations, capable of good discriminability, robust to noise, and independent of specific geometric representations. Current descriptors do not completely achieve these goals.
Previous work related to shape similarity can be found mainly in three research areas: (1) object recognition and classification, (2) surface matching and alignment, and (3) 3D shape comparison and shape similarity based object retrieval. The task of object recognition and classification is to determine whether a shape is a known object and to find k representative objects in an object data set. Existing object recognition approaches are typically based on analyzing 2D images of an object captured at different viewpoints. The task of surface matching and alignment is to find overlapping regions between two 3D objects. The representative work in this area includes range image based approaches, ICP (Iterative Closest Point) based approaches, spin images, geometric hashing and structural indexing. The aforementioned 3D shape comparison approach is related to surface matching, but it focuses on comparing the object's global shape, while surface matching compares only part of the object's shape. By building a map from the 3D shape onto a sphere, some approaches generate spherical representations for the shapes, and then compare them to a database of spherical representations. Since the map from the shape to the sphere is independent of translation and scaling, comparison between two 3D objects can be accomplished by finding the rotation that minimizes the difference between their spherical representations. However, there are issues with occlusion, and these representations require explicit orientation alignment.
View-based approaches [3] use 2D views of a 3D object for object recognition. Given an unknown object, views at random angles are generated, and matched against the prototypical views in a database. The best match gives the identity of the unknown object and optionally its pose. However, such techniques tend to require large databases and memory footprints, and recognition tends to be slow.
There is a movement towards placing more emphasis on fast recognition, due to the potential of a 3D search engine. This requires shape representations that are not only fast to extract, but also efficient to compare against similar representations of other objects. Examples include the multiresolutional Reeb graph (MRG) [7], shape distribution [11], shape histogram [1], ray-based descriptors [15, 14], groups of features [12], aspect graph [3], parameterized statistics [10], and 3D FFT based descriptors [13].
The MRG representation [7] provides a fully automatic similarity estimation of 3D shapes by matching their topology. The topology information is analyzed based on the integrated geodesic distance, so the topology matching approach is pose invariant. However, the topology matching is difficult to accelerate, which is a problem when retrieving objects from a large database.
Shape distribution techniques give a very simple description of a 3D shape, which has advantages in 3D object retrieval since it is easy to compute and efficient to compare. Osada et al. [11] proposed the use of the D2 shape distribution, which is a histogram of the distances between pairs of points on the shape surface.
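By way of illustration only, a D2-style distance histogram of the kind described above might be sketched as follows. This is a minimal sketch and not the implementation of reference [11]: the function names, the triangle-list input format, the number of sampled pairs, the bin count, the division by the mean distance (added here so that the histogram does not depend on scale), and the cap of three mean distances on the histogram range are all assumptions made for this example.

```python
import math
import random


def triangle_area(a, b, c):
    """Area of the 3D triangle (a, b, c) from the cross product of two edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def sample_on_triangle(a, b, c, rng):
    """Draw one point uniformly from the interior of triangle (a, b, c)."""
    r1, r2 = rng.random(), rng.random()
    s = math.sqrt(r1)
    # Barycentric weights that yield a uniform distribution over the triangle.
    w0, w1, w2 = 1.0 - s, s * (1.0 - r2), s * r2
    return tuple(w0 * a[i] + w1 * b[i] + w2 * c[i] for i in range(3))


def d2_histogram(triangles, n_pairs=2000, n_bins=16, seed=0):
    """Histogram of distances between random pairs of surface points.

    `triangles` is a list of (a, b, c) vertex triples (assumed input format).
    Distances are divided by their mean, an assumption of this sketch, so
    that uniformly scaling the shape leaves the histogram unchanged.
    """
    rng = random.Random(seed)
    areas = [triangle_area(*t) for t in triangles]
    dists = []
    for _ in range(n_pairs):
        # Pick two triangles with probability proportional to their area.
        t1, t2 = rng.choices(triangles, weights=areas, k=2)
        p = sample_on_triangle(*t1, rng)
        q = sample_on_triangle(*t2, rng)
        dists.append(math.dist(p, q))
    mean = sum(dists) / len(dists)
    max_ratio = 3.0  # hypothetical cap: larger ratios fall in the last bin
    hist = [0] * n_bins
    for d in dists:
        idx = min(int(d / mean / max_ratio * n_bins), n_bins - 1)
        hist[idx] += 1
    return [h / n_pairs for h in hist]
```

Because only distances between point pairs enter the histogram, translating or rotating the shape leaves the result essentially unchanged, and two such histograms can be compared with a simple bin-wise difference.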
Ankerst et al. [1] and Vranic et al. [15] proposed the use of feature vectors based on spherical harmonic analysis. However, their spherical functions are sensitive to the location of the shape centroid, which may change as a result of shape outliers or noise.
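The sensitivity of the centroid to outliers can be seen in a small worked example. The following is a minimal sketch with made-up data: the cube of points, the single outlier and the helper name `centroid` are assumptions chosen for illustration and do not come from references [1] or [15].

```python
def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# Eight corners of a cube centered on the origin (illustrative data).
cube = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]

# One spurious sample far from the shape, e.g. a sensor artifact (assumed).
outlier = (90.0, 0.0, 0.0)

clean_center = centroid(cube)              # (0.0, 0.0, 0.0)
noisy_center = centroid(cube + [outlier])  # (10.0, 0.0, 0.0)
```

In this example a single stray point moves the centroid by ten units, although the cube itself extends only one unit from its center, so any spherical function sampled about the centroid would be evaluated from a substantially different origin.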
It is noted that in the preceding paragraphs, as well as in the remainder of this specification, the description refers to various individual publications identified by a numeric designator contained within a pair of brackets. For example, such a reference may be identified by reciting, “reference [1]” or simply “[1]”. Multiple references will be identified by a pair of brackets containing more than one designator, for example, [2, 3]. A listing of references including the publications corresponding to each designator can be found at the end of the Detailed Description section.