Object recognition describes the task of automatically recognizing objects in digital video images. It can be divided into two subtasks: detection and identification. In detection, objects belonging to a certain class (e.g., the class of cars or faces) have to be located in a given input image. In identification, a specific object has to be recognized in an image (e.g., Jim's face or Helen's blouse).
There are three main problems in object recognition. First, in the case of detection, objects belonging to the same class might vary in their shapes and colors (e.g., different types of chairs). Second, in the case of identification, two different objects might look very similar (e.g., in face identification, the faces of siblings might be hard to distinguish). Third, the appearance of an object in an image changes with its pose, the illumination, and the camera. Recognition systems have to be invariant to those changes.
In addition to these inherent problems, many conventional recognition systems suffer from a number of drawbacks. For instance, such conventional systems typically require a large database of training pictures, which is tedious to build. In addition, they are too slow to be used in real-time applications. Object recognition systems can accordingly be evaluated using two main criteria: accuracy and speed. The recognition accuracy is defined by how well the system can locate objects in an image relative to the number of false alarms. The speed of the system at run-time defines how much time it takes to process a new image.
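The two evaluation criteria can be illustrated with a minimal Python sketch. It is not drawn from any particular recognition system: the box format, the function names (iou, score_detections, seconds_per_image), and the overlap threshold of 0.5 used to decide whether a detection counts as a hit are all assumptions made for illustration.

```python
import time
from typing import Callable, List, Sequence, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union overlap of two axis-aligned boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def score_detections(
    predicted: Sequence[Box], truth: Sequence[Box], min_iou: float = 0.5
) -> Tuple[float, int]:
    """Return (detection rate, number of false alarms) for one image.

    A predicted box is a hit if it overlaps a not-yet-matched ground-truth
    box by at least min_iou (an assumed threshold); otherwise it counts
    as a false alarm.
    """
    unmatched: List[Box] = list(truth)
    false_alarms = 0
    for box in predicted:
        match = next((t for t in unmatched if iou(box, t) >= min_iou), None)
        if match is None:
            false_alarms += 1
        else:
            unmatched.remove(match)
    hits = len(truth) - len(unmatched)
    rate = hits / len(truth) if truth else 1.0
    return rate, false_alarms


def seconds_per_image(detector: Callable[[object], object], images: Sequence[object]) -> float:
    """Average wall-clock time the detector needs to process one image."""
    if not images:
        raise ValueError("at least one image is required")
    start = time.perf_counter()
    for image in images:
        detector(image)
    return (time.perf_counter() - start) / len(images)
```

For example, with two ground-truth objects, one correct detection, and one spurious detection, score_detections reports a detection rate of 0.5 and one false alarm. Comparing the value returned by seconds_per_image against the frame interval of the video source gives a rough indication of whether a system could run in real time.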
What is needed, therefore, are object recognition techniques that provide both accuracy and speed (e.g., techniques that do not require a large database of training images and that enable real-time processing).