1. Field of the Invention
The present invention relates to an image recognition apparatus configured to recognize a predetermined pattern, a processing method thereof, and a computer-readable storage medium.
2. Description of the Related Art
Technology (image recognition technology) for recognizing a predetermined pattern (for example, an object) within image data is known. For example, in a digital camera, exposure and focus are set to the region of an object that was recognized using this technology. Also, for example, in a personal computer device, image recognition processing is performed so that images are automatically classified and can be effectively edited and corrected (see Viola and Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2001) (referred to below as Document 1), and Navneet Dalal and Bill Triggs, “Histograms of Oriented Gradients for Human Detection”, IEEE Computer Vision and Pattern Recognition, Vol. 1, pp. 886-893, 2005 (referred to below as Document 2)).
With this sort of technology, a plurality of learning images, namely positive images that are correct patterns and negative images that are incorrect patterns, are prepared, and by performing machine learning based on image features that are useful for discriminating these patterns, a dictionary for recognizing correct patterns is generated.
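As an illustration only, the learning step described above can be sketched as the selection of a single threshold rule (a decision stump) from labeled feature vectors. This is a deliberately simplified stand-in for the boosted classifiers of Document 1, not the method of any cited document. The names train_stump and recognize, and the representation of the dictionary as a Python dict, are assumptions made for this sketch.

```python
from typing import Dict, List, Sequence


def train_stump(positives: List[Sequence[float]],
                negatives: List[Sequence[float]]) -> Dict[str, float]:
    """Learn a one-rule 'dictionary' from positive and negative feature vectors.

    Hypothetical helper: exhaustively tries every feature, every midpoint
    threshold, and both polarities, and keeps the rule with the fewest
    training errors.
    """
    samples = [(x, 1) for x in positives] + [(x, -1) for x in negatives]
    n_features = len(samples[0][0])
    best = None  # (errors, feature index, threshold, polarity)
    for f in range(n_features):
        values = sorted({x[f] for x, _ in samples})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for t in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for polarity in (1, -1):
                errors = sum(
                    1 for x, y in samples
                    if (polarity if x[f] > t else -polarity) != y
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    if best is None:
        raise ValueError("no feature separates the samples")
    _, f, t, polarity = best
    return {"feature": f, "threshold": t, "polarity": polarity}


def recognize(dictionary: Dict[str, float], x: Sequence[float]) -> bool:
    """Return True if x is classified as a correct (positive) pattern."""
    above = x[int(dictionary["feature"])] > dictionary["threshold"]
    return above == (dictionary["polarity"] == 1)


# Toy feature vectors (invented for illustration, not measured data).
positive_features = [[0.9, 0.2], [0.8, 0.7]]
negative_features = [[0.1, 0.3], [0.2, 0.6]]
dictionary = train_stump(positive_features, negative_features)
```

In practice many such rules would be combined, and the feature vectors would be computed from the learning images rather than written by hand.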
Factors affecting recognition accuracy include the image features used for pattern discrimination and the learning images used for machine learning. Useful image features have been studied for each type of recognition target. For example, it is known that Haar-like features are useful image features if the recognition target is a face, and HOG (Histograms of Oriented Gradients) features are useful image features if the recognition target is a human body.
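For reference, a Haar-like feature of the kind mentioned above can be sketched as the difference between the pixel sums of two adjacent rectangles, computed from an integral image (summed-area table). The following is a minimal sketch under assumed conventions: the image is a list of rows of grayscale values, and the function names and the left-minus-right sign convention are choices made here for illustration.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], top: int, left: int,
             height: int, width: int) -> int:
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii: List[List[int]], top: int, left: int,
                             height: int, width: int) -> int:
    """Two-rectangle feature: left-half sum minus right-half sum."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy 2x4 image: dark left half, bright right half.
image = [[1, 1, 5, 5],
         [1, 1, 5, 5]]
table = integral_image(image)
feature_value = haar_two_rect_horizontal(table, 0, 0, 2, 4)
```

Because each rectangle sum needs only four lookups, the cost of evaluating a feature does not depend on the rectangle size.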
As for the learning images, there have been attempts to improve accuracy by increasing the number and variety of positive images and negative images. Also, when it is known in advance that certain patterns are difficult to detect or are prone to being mistakenly detected, accuracy for those specific patterns has been improved by emphasizing the learning of images for those patterns.
On the other hand, as an application of such recognition technology, in Grabner and Bischof, “On-line Boosting and Vision”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2006)(referred to below as Document 3), technology is disclosed whereby images (learning images) for machine learning are automatically collected in a device, and incremental learning of those learning images is performed. Thus, an original dictionary is updated within the device, thereby realizing an improvement in dictionary accuracy.
In this sort of incremental learning, learning images are automatically collected within the device. As described in Document 3, a method is known in which a desired object is tracked across consecutive image frames, and positive images are automatically collected. In this method, because learning images that include variations in the direction, size, and so forth of the desired object can be effectively collected, each time that incremental learning is performed, an image pattern that heretofore could not be detected gradually becomes detectable.
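The collection of positive images by tracking, followed by an incremental update, might be sketched as follows. Note that this sketch substitutes a simple perceptron update for the on-line boosting of Document 3, and it assumes that a tracker has already reported one box per frame. The function names, the box layout (top, left, height, width), and the learning rate are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, height, width), assumed layout


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the tracked region out of one frame."""
    top, left, height, width = box
    return [row[left:left + width] for row in frame[top:top + height]]


def collect_positives(frames: List[Frame], tracked_boxes: List[Box]) -> List[Frame]:
    """One positive patch per frame, taken at the tracker-reported box."""
    return [crop(f, b) for f, b in zip(frames, tracked_boxes)]


def flatten(patch: Frame) -> List[float]:
    """Turn a patch into a flat feature vector of raw pixel values."""
    return [float(v) for row in patch for v in row]


def perceptron_update(weights: Sequence[float], bias: float,
                      x: Sequence[float], label: int,
                      lr: float = 0.1) -> Tuple[List[float], float]:
    """Incremental update: change the model only when x is misclassified."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    if label * score <= 0:
        weights = [w + lr * label * v for w, v in zip(weights, x)]
        bias = bias + lr * label
    return list(weights), bias


# Toy frames in which the same 2x2 object moves between frames.
frames = [
    [[0, 0, 0], [0, 9, 8], [0, 7, 6]],
    [[9, 8, 0], [7, 6, 0], [0, 0, 0]],
]
boxes = [(1, 1, 2, 2), (0, 0, 2, 2)]
positives = collect_positives(frames, boxes)

weights, bias = [0.0, 0.0, 0.0, 0.0], 0.0
for patch in positives:
    weights, bias = perceptron_update(weights, bias, flatten(patch), 1, lr=1.0)
```

Only the first patch changes the model here, because the second patch is already classified as positive after the first update.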
On the other hand, for negative images, a method is conceivable in which images other than positive images are simply collected. However, the negative images that can be collected by this method are merely patterns that are not positive, and in particular, it is not possible to selectively collect patterns that are similar to positive patterns but are not positive. Therefore, even if incremental learning has been performed, the problem may occur that a pattern similar to the desired object but actually not a correct pattern is mistakenly detected.