In many situations it is desirable to be able to identify a three-dimensional (3D) multifeatured object automatically from a set of candidate objects, particularly when only a partial representation of the target object is available. In a typical situation, only one or more two-dimensional (2D) source images of the 3D object may be available, perhaps photographs taken from different viewpoints. Conventional methods of identifying a 3D object using 2D images as input are inherently vulnerable to changes in lighting conditions and varying orientations of the object. For example, in the case where the multifeatured object is a face, existing methods generally use 2D facial photographs as source input. Such photographs are greatly affected by variations in lighting conditions and viewpoint, yet traditional methods have no way of taking changing lighting or viewpoints into consideration; they simply analyze the 2D image as is. If the source object is not oriented head-on, the efficacy of most methods decreases; the farther out of plane the object is rotated, the less reliable the identification becomes.
Accordingly, identification of a 3D multifeatured object from a 2D image can give good results in controlled conditions in which one or more reference images of the object can be taken in advance from the same viewpoints and under the same lighting conditions that prevail when the source image(s) to be used for identification are taken. This situation rarely occurs in practice, however, since the object to be identified may not be available or cooperative, and it is often impossible to predict the orientation and lighting conditions under which the source image(s) will be captured. For example, in the case of face recognition, the source image is often taken by a surveillance camera, which may capture a side view or a view from above. Typically the reference image will be a head-on view, which may be difficult to match to such a source image.
To cope with varying viewpoints, some identification methods capture and store images of the object taken from multiple viewing angles. However, this process is slow and costly, and it would be impractical to capture images corresponding to the full range of possible angles and lighting conditions. Another approach is to capture a 3D image of the object by using a 3D imaging device or scanner, and then to electronically generate a number of reference 2D images corresponding to different viewpoints and lighting conditions. This technique is also computationally burdensome and still does not enable the source image(s) to be matched to the continuum of possible rotations and translations of the source 3D object. In another variation, a 3D model of the target object may be created from a generalized model of the type of 3D object that is to be identified. The model may be parameterized, with parameters chosen to make the model correspond to the source 2D imagery. This 3D model may then be used to generate multiple reference 2D images corresponding to different viewpoints and lighting conditions. Since such 3D models typically have only a few degrees of freedom, however, the 3D model will usually not correspond closely to the 3D geometry of the target object, which inherently limits the accuracy of this approach.
Another challenge faced by object identification systems is to locate the object to be recognized within a large, cluttered field in an efficient manner. Traditional methods are not hierarchical in their approach, but instead apply computationally intensive matching methods that attempt to match source images with reference images. Such methods are not suitable for rapid object detection and identification.
Accordingly, there exists a need for an automated approach that efficiently locates and identifies a 3D object from source 2D imagery in a manner that is robust under varying lighting conditions and source viewpoints.