The present disclosure relates to an information processing apparatus, an information processing method, and a program, and more particularly, to an information processing apparatus, an information processing method, and a program capable of distinguishing, with high performance and with a smaller number of features, whether a predetermined subject is shown in an input image.
Boosting and bagging are examples of ensemble learning, in which a classifier is learned that performs classification by majority decision on weak hypotheses, which are the outputs of a plurality of weak classifiers.
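The majority decision on weak hypotheses can be illustrated with a minimal sketch. It assumes that each weak hypothesis is +1 (subject shown) or -1 (subject not shown) and that an optional weight (reliability) is attached to each weak classifier; the function name and the tie-breaking rule are hypothetical and are not taken from the cited documents.

```python
def majority_vote(weak_hypotheses, weights=None):
    """Combine weak hypotheses (+1 or -1) by a (weighted) majority decision.

    Assumption: a tie (score of exactly 0) is resolved as +1.
    """
    if weights is None:
        # Plain majority decision: every weak classifier counts equally.
        weights = [1.0] * len(weak_hypotheses)
    score = sum(w * h for w, h in zip(weights, weak_hypotheses))
    return 1 if score >= 0 else -1
```

With equal weights the function reduces to a simple vote count; with unequal weights, a single reliable weak classifier can outvote several less reliable ones.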
For example, U.S. Pat. No. 7,099,510, Japanese Patent No. 4517633, and “Joint Haar-like Features for Face Detection” by T. Mita, T. Kaneko, and O. Hori in the proceedings of IEEE International Conference on Computer Vision 2005 (ICCV 2005) each disclose a method (subject distinguishing method) of recognizing an image pattern to distinguish whether a predetermined subject is shown in an image by the use of a classifier obtained through ensemble learning.
In U.S. Pat. No. 7,099,510, sums of the luminances of regions cut from an input image input from the outside are used as the features of the image in a subject distinguishing process of distinguishing whether a predetermined subject is shown in the image. The sums are obtained by the use of what is called an integral image, so that calculation of the features and determination of the subject are performed at high speed.
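The reason an integral image speeds up the calculation of region sums can be illustrated with a minimal sketch. It assumes that the image is given as a list of rows of luminance values; the function names and the half-open region convention are hypothetical and are not taken from U.S. Pat. No. 7,099,510.

```python
def integral_image(img):
    """Return a table ii in which ii[y][x] is the sum of img over rows < y and columns < x."""
    h = len(img)
    w = len(img[0]) if h else 0
    # One extra row and column of zeros keeps the lookups free of bounds checks.
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def region_sum(ii, top, left, bottom, right):
    """Sum of luminances for rows top..bottom-1 and columns left..right-1."""
    # Four table lookups, regardless of the size of the region.
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

Once the table has been built in a single pass over the image, the sum over any rectangular region costs four lookups, which is why a large number of region-sum features can be evaluated quickly.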
In Japanese Patent No. 4517633, a difference between the pixel values of two pixels in an input image, which can be calculated by the very simple process of subtraction, is used as a feature. Sufficient subject distinguishing performance is realized even with this feature calculated only by subtraction.
In Japanese Patent No. 4517633, the positions of the two pixels used for calculating the difference, which is the feature, are set for each of the weak classifiers of a classifier. It is therefore necessary to calculate as many features as there are weak classifiers in the classifier.
However, the difference between the pixel values, which is the feature, can be calculated by the very simple process of subtraction. Therefore, even when as many differences between pixel values as there are weak classifiers in the classifier have to be calculated as features, the differences can be calculated at high speed. As a consequence, the subject distinguishing process can be performed at high speed.
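A classifier of this kind, in which each weak classifier has its own pair of pixel positions, can be illustrated with a minimal sketch. The per-classifier threshold and weight, the dictionary keys, and the function names are assumptions made for illustration and are not taken from Japanese Patent No. 4517633.

```python
def pixel_difference(img, p1, p2):
    """Feature: difference between the pixel values at positions p1 and p2, given as (row, column)."""
    (y1, x1), (y2, x2) = p1, p2
    return img[y1][x1] - img[y2][x2]


def classify(img, weak_classifiers):
    """Weighted majority decision over weak classifiers, each with its own pixel pair.

    Assumption: each weak classifier is a dict with the hypothetical keys
    "p1", "p2", "threshold", and "weight".
    """
    score = 0.0
    for wc in weak_classifiers:
        # One subtraction per weak classifier yields its feature.
        feature = pixel_difference(img, wc["p1"], wc["p2"])
        hypothesis = 1 if feature > wc["threshold"] else -1
        score += wc["weight"] * hypothesis
    return 1 if score >= 0 else -1
```

The loop performs exactly one subtraction per weak classifier, so the number of features calculated equals the number of weak classifiers while the cost of each feature stays minimal.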
In “Joint Haar-like Features for Face Detection” by T. Mita, T. Kaneko, and O. Hori in the proceedings of IEEE International Conference on Computer Vision 2005 (ICCV 2005), each of a plurality of Q features is classified to one of two values. A table (a so-called decision table) assigns one of two classes, indicating whether a subject is shown in an input image, to each of the 2^Q combinations of the two values of the Q features, and the class assigned to the combination of the classification results of the respective Q features is output as one weak hypothesis (the output of one weak classifier).
In “Joint Haar-like Features for Face Detection” by T. Mita, T. Kaneko, and O. Hori in the proceedings of IEEE International Conference on Computer Vision 2005 (ICCV 2005), accordingly, one weak hypothesis is output in regard to the plurality of features in one weak classifier. Therefore, the features can be processed more efficiently compared to a case where one weak hypothesis is output in regard to one feature in one weak classifier.
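The way one weak classifier outputs a single weak hypothesis from Q features by the use of a decision table can be illustrated with a minimal sketch. The thresholding used to classify each feature to one of two values, the bit ordering of the table index, and the function name are assumptions made for illustration and are not taken from the cited paper.

```python
def joint_weak_hypothesis(features, thresholds, table):
    """Output one weak hypothesis from Q features by a decision table lookup.

    Assumption: `table` has 2**Q entries, each +1 (subject shown) or -1
    (subject not shown), indexed by the Q binarized features with the
    first feature as the most significant bit.
    """
    index = 0
    for f, t in zip(features, thresholds):
        bit = 1 if f > t else 0  # classify each feature to one of two values
        index = (index << 1) | bit
    return table[index]
```

For Q = 2, a table such as [-1, -1, -1, 1] outputs +1 only when both features exceed their thresholds, so a single lookup replaces two separate weak hypotheses.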