1. Field of the Invention
The present invention relates generally to the field of computer vision and, in particular, to recognizing instances of visual classes.
2. Description of the Prior Art
Class recognition is concerned with recognizing class instances in a scene. As used in this context, a “class” is a collection of objects that share common visual characteristics and that differ in their visual characteristics from objects in other classes.
The first step in class recognition is to build a database of known classes. The second step in class recognition is to match new instances observed in images with their classes represented in the database.
Class recognition presents many problems. First, it presents the problems of recognizing a specific object: an object may appear very different when viewed from a different perspective, in a different context, or under different lighting conditions. Beyond the problems of object recognition, class recognition presents further problems related to within-class variation. The instances of a class may vary in certain aspects of their shape or their visual appearance. A class recognizer must be able to deal with this additional variability and detect class membership based on the common characteristics of the class.
Previously, there has been no entirely satisfactory solution to these problems. Substantial research has been devoted to class recognition, but no existing system can recognize instances of a wide variety of classes from a wide variety of viewpoints and distances.
Prior Academic Research
Substantial research has been devoted to the simpler problem of object recognition, but there are no object recognition systems that can recognize a wide variety of objects from a wide variety of viewpoints and distances. Class recognition is a significantly more difficult problem, of which object recognition is a subset. An object recognition system need only identify the specific objects for which it has been designed. In contrast, a class recognition system must be able to identify previously unseen objects as class instances on the basis of similarity to the common characteristics of a class.
One line of research in class recognition represents a class as an unordered set of parts. Each part is represented by a model for the local appearance of that part, generalized over all instances of the class. The spatial relationship of the parts is ignored; only appearance information is used. One paper taking this approach is Dorko and Schmid, “Selection of Scale-Invariant Parts for Object Class Recognition”, International Conference on Computer Vision, 2003, pp. 634-640. A later paper by the same authors, expanding on this approach, is “Object Class Recognition Using Discriminative Local Features,” IEEE Transactions on Pattern Analysis and Machine Intelligence, also available as Technical Report RR-5497, INRIA Rhone-Alpes, February 2005. There are several difficulties with this general approach. The most important is that, since the geometric relationship of the parts is not represented, important information is lost. A collection of parts jumbled into random locations can be confused with an object in which these parts are in appropriate locations.
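The confusion described above can be illustrated with a minimal sketch. It is a toy example and not the method of the cited papers: the part labels, the coordinates, and the function name `bag_of_parts` are all hypothetical, and real systems use learned appearance models rather than symbolic labels. The sketch shows only that an appearance-only signature is identical for a plausible arrangement of parts and for the same parts scattered at random.

```python
from collections import Counter

def bag_of_parts(detections):
    """Appearance-only signature: count part labels and discard their locations.

    `detections` is a list of (label, (x, y)) pairs; this format is an
    assumption made for illustration only.
    """
    return Counter(label for label, _xy in detections)

# Hypothetical detections of a face-like object with parts in sensible places.
face = [("eye", (30, 40)), ("eye", (70, 40)), ("nose", (50, 60)), ("mouth", (50, 80))]

# The same parts, jumbled into arbitrary image locations.
jumbled = [("eye", (50, 80)), ("eye", (10, 10)), ("nose", (90, 20)), ("mouth", (30, 40))]

# Because locations are discarded, the two signatures cannot be told apart.
same_signature = bag_of_parts(face) == bag_of_parts(jumbled)
print(same_signature)
```

Any classifier that sees only such a signature necessarily assigns the two arrangements the same class.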
Another line of research in class recognition represents a class as a constellation of parts with 2D structure. Two papers applying this approach are Burl et al., “A probabilistic approach to object recognition using local photometry and global geometry”, Proc. European Conference on Computer Vision (ECCV) 1998, pp. 628-641, and Fergus et al., “Object Class Recognition by Unsupervised Scale-Invariant Learning”, Computer Vision and Pattern Recognition, 2003, pp. 264-271. Another paper along these lines is Helmer and Lowe, “Object Class Recognition with Many Local Features”, IEEE Computer Vision and Pattern Recognition Workshops, 2004 (CVPRW'04), pp. 187 ff. There are two difficulties with using two-dimensional models of this kind. First, the local appearance of parts is not invariant to changes in object pose relative to the camera. Second, the relationship of the parts is acquired and modeled only as the parts occur in 2D images; the underlying 3D spatial relationship is not observed, computed, or modeled.
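The second difficulty can be made concrete with a small numerical sketch. It assumes an idealized pinhole camera and two hypothetical parts of an object; the function names, the focal length, and the camera distance are illustrative choices, not taken from the cited papers. When the object rotates, the distance between the parts in 3D is unchanged, but the distance between their 2D projections changes, so a model learned only from 2D images captures a pose-dependent relationship.

```python
import math

def rotate_y(p, angle):
    """Rotate a 3D point about the vertical (y) axis by `angle` radians."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(p, focal=1.0, depth=5.0):
    """Pinhole projection; the object center sits `depth` units from the camera (assumed)."""
    x, y, z = p
    return (focal * x / (z + depth), focal * y / (z + depth))

# Two hypothetical parts of an object, in object coordinates.
part_a, part_b = (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)

# Rotate the object by 60 degrees relative to the camera.
angle = math.pi / 3
rot_a, rot_b = rotate_y(part_a, angle), rotate_y(part_b, angle)

# The 3D distance between the parts is invariant to the change in pose.
d3_front = math.dist(part_a, part_b)
d3_rot = math.dist(rot_a, rot_b)

# The 2D distance between the projected parts is not.
d2_front = math.dist(project(part_a), project(part_b))
d2_rot = math.dist(project(rot_a), project(rot_b))

print(d3_front, d3_rot)
print(d2_front, d2_rot)
```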
Other Prior Work
U.S. patent application Ser. No. 11/159,660, filed Jun. 22, 2005, by the present inventors, entitled “System and Method for 3D Object Recognition Using Range and Intensity,” describes techniques for addressing these problems. It discloses a system and method for recognizing objects and instances of classes when the database of models and the scene are both acquired using combined range and image intensities. That is, both the models and the acquired images are three-dimensional.
Image acquisition using combined range and intensities requires special apparatus, e.g., a stereo system or a combined camera and laser range finder. In most cases, the database of class models can be built in this way, because database construction can be done under controlled conditions. However, there are many circumstances where range information cannot be obtained in the recognition step. That is, the intensity information can be readily obtained, e.g., with a simple camera, but it is difficult to acquire high-resolution range information for the scene.
Hence, there is a need for a system and method able to perform class recognition in 2D images that have only image intensity information using 3D models of geometry and appearance. Additionally, there is a need for an object recognition system and method that overcomes the limitations of prior techniques.