1. Field of the Invention
The present invention relates to an object recognition apparatus and an object recognition method.
2. Description of the Related Art
Digital still cameras and camcorders are known to provide functions for detecting a human face in an image being shot and for tracking the subject of shooting (the object). Such face detection and face tracking functions are very useful for automatically adjusting the focus and exposure for the object being shot.
In recent years, object tracking methods based on online learning have been proposed. In online learning, an image of the object being shot is used to adapt a dictionary used in recognition processing to the recognition target. One such method is described in Grabner and Bischof, “On-line Boosting and Vision,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '06) (hereinafter referred to as the “Grabner document”). According to this method, tracked targets are not limited to human faces; objects such as pet animals can also be set as tracked targets. In other words, this method expands the range of trackable targets.
In this method, the appropriate shape of the processing area subjected to recognition depends on the target. For example, a tall rectangle is advantageous for recognizing an entire human body, whereas a wide rectangle is advantageous for recognizing a car. Accordingly, the method proposed in the Grabner document expects a user to specify the area of the tracked target in advance.
In addition, to recognize an object with high accuracy, feature amounts that characterize the recognition target object need to be configured in advance, because learning with feature amounts that inherently provide low recognition performance does not improve the recognition accuracy. Therefore, the method proposed in the Grabner document concurrently employs Haar-like features, orientation histograms, and LBP (Local Binary Patterns) as feature amounts for use in the object recognition, and randomly selects 250 of these feature amounts for use in the learning.
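As an illustration of one of the feature amounts named above, the following is a minimal sketch of a two-rectangle Haar-like feature computed from an integral image (summed-area table). It is not taken from the Grabner document; the function names `integral_image`, `rect_sum`, and `haar_two_rect_horizontal` and the list-of-lists image representation are assumptions made for this example only.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) summed-area table for a 2-D list of pixel values.

    Hypothetical helper for illustration; ii[y][x] holds the sum of all
    pixels above and to the left of (x, y), exclusive.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half.

    Assumes w is even so that both halves have the same width.
    """
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right


# Usage: a 2 x 4 patch that is bright on the left and dark on the right.
patch = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
ii = integral_image(patch)
feature_value = haar_two_rect_horizontal(ii, 0, 0, 4, 2)
```

Because each rectangle sum needs only four table lookups, the cost of evaluating one such feature does not depend on the size of the local area; the cost of the overall approach comes from how many local areas are evaluated.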
However, to recognize various objects with high accuracy according to the method proposed in the Grabner document, a vast number of feature amounts, covering various combinations of positions and sizes of local areas in the processing area subjected to the object recognition, need to be used in the learning. Accordingly, it is difficult with the method in the Grabner document to balance the accuracy and the processing time associated with the object recognition.
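To give a rough sense of how quickly the combinations of positions and sizes grow, the following sketch counts the axis-aligned rectangular local areas that fit in a processing area. The function name `count_local_areas`, the 24 x 24 example size, and the `min_size` and `step` parameters are assumptions chosen for illustration; they do not appear in the Grabner document.

```python
def count_local_areas(width, height, min_size=1, step=1):
    """Count axis-aligned rectangles (local areas) inside a processing area.

    Every rectangle width and height from min_size up to the full area is
    considered, and the top-left corner is moved in increments of `step`
    pixels. All parameters are hypothetical illustration values.
    """
    total = 0
    for w in range(min_size, width + 1):
        positions_x = (width - w) // step + 1
        for h in range(min_size, height + 1):
            positions_y = (height - h) // step + 1
            total += positions_x * positions_y
    return total


# Usage: exhaustive count for a 24 x 24 processing area,
# and a coarser grid that skips small areas and moves in 4-pixel steps.
exhaustive = count_local_areas(24, 24)
coarse = count_local_areas(24, 24, min_size=8, step=4)
```

With every position and size allowed, a 24 x 24 processing area already contains 90,000 candidate local areas, and each kind of feature amount multiplies that figure again. Restricting the sizes or the step reduces the count at the cost of coverage, which is the trade-off between accuracy and processing time described above.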