Field of the Invention
The present invention relates to an object detecting apparatus and method that are particularly suitable for detecting objects of various orientations.
Description of the Related Art
Conventionally, a technique of detecting a physical object such as a face, a human body or the like included in an image has been proposed. In this technique, the physical object included in the image is detected generally by the following process. That is, an area for which it is decided whether or not the physical object exists is first set, and a predetermined feature quantity is extracted from the set area. Then, a discriminator previously created by machine learning decides, by referring to the extracted feature quantity, whether or not the physical object exists in the set area.
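By way of illustration only, the general process described above (set an area, extract a feature quantity, let a discriminator decide) can be sketched as follows. The function names, the mean-intensity feature, and the threshold discriminator are hypothetical stand-ins chosen for brevity; they are not taken from any cited reference, and a practical discriminator would be created by machine learning.

```python
def detect(image, window, step, extract, discriminator):
    """Scan square areas over the image and return the (top, left) of accepted areas."""
    height, width = len(image), len(image[0])
    hits = []
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            # Set the area for which the decision is made.
            area = [row[left:left + window] for row in image[top:top + window]]
            # Extract the feature quantity and let the discriminator decide.
            if discriminator(extract(area)):
                hits.append((top, left))
    return hits


def mean_intensity(area):
    """Hypothetical feature quantity: mean pixel value of the area."""
    return sum(sum(row) for row in area) / (len(area) * len(area[0]))


# Toy image with one bright 2x2 block; the threshold of 9 is an assumption.
image = [
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]
hits = detect(image, window=2, step=1, extract=mean_intensity,
              discriminator=lambda feature: feature >= 9)
print(hits)
```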
Incidentally, physical objects appear in the image in various orientations, and feature quantity patterns change according to the orientation even for the same physical object. Therefore, it is difficult to detect physical objects of all orientations with a high degree of accuracy by using a single discriminator. Under the circumstances, Japanese Patent No. 5025893 proposes a method of providing a plurality of discriminators, each specialized to detect a physical object of a specific orientation. In this method, since the orientation to be detected by each discriminator is limited, it is possible to detect physical objects of various orientations with a higher degree of accuracy than in the case in which the same discriminator is used for all the orientations.
FIG. 12 is a diagram illustrating a processing flow by a boosting discriminator generally used when detecting physical objects (see P. Viola, and M. Jones, “Rapid Object Detection Using a Boosted Cascade of Simple Features”, CVPR2001, Vol. 1, pp. 511-518). Here, discriminating processes 1201 to 120N respectively correspond to first to N-th discriminating processes which are performed in order of 1 to N. In each of the discriminating processes, it is decided whether a physical object exists (TRUE) or does not exist (FALSE) by referring to a feature quantity according to a discrimination parameter previously created by machine learning.
In the example illustrated in FIG. 12, if TRUE is decided, the flow advances to the subsequent discriminating process. Then, if the decided results of all the discriminating processes are TRUE, it is decided that the physical object exists in the relevant area. On the other hand, if any one of the decided results of the discriminating processes is FALSE, it is decided at that point that the physical object does not exist in the relevant area. As just described, many discriminating processes are necessary to detect the physical object with a high degree of accuracy. However, as the number of discriminating processes increases, the total size of the discrimination parameters also increases.
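The flow of first to N-th discriminating processes with early rejection can be sketched, purely as an illustration, as follows. The stage functions here are hypothetical threshold tests on a feature vector; in a practical boosting discriminator, each stage would instead use discrimination parameters created by machine learning.

```python
def cascade_decide(features, stages):
    """Run the stages in order; return (decision, number of stages evaluated)."""
    for index, stage in enumerate(stages, start=1):
        if not stage(features):
            # FALSE at any stage: the object is decided absent at that point.
            return False, index
    # TRUE at every stage: the object is decided present.
    return True, len(stages)


def make_threshold_stage(feature_index, threshold):
    """Hypothetical stage: TRUE when one feature value reaches a threshold."""
    def stage(features):
        return features[feature_index] >= threshold
    return stage


# Three illustrative stages; indices and thresholds are assumptions.
stages = [
    make_threshold_stage(0, 0.5),
    make_threshold_stage(1, 0.2),
    make_threshold_stage(2, 0.8),
]

accepted = cascade_decide([0.9, 0.3, 0.95], stages)  # passes all three stages
rejected = cascade_decide([0.9, 0.1, 0.95], stages)  # fails at the second stage
print(accepted, rejected)
```

In this sketch, each additional stage carries its own parameters (here one index and one threshold), which is the reason the parameter size grows with the number of discriminating processes.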
As described above, in order to detect the physical objects of various orientations in the image with high degree of accuracy, it is desirable to respectively provide the different discriminators according to the orientations. However, if the discriminator is provided for each of detection-target orientations, the number of necessary discriminators and the size of discrimination parameters increase. For example, as illustrated in FIG. 10, if nine-classified face orientations 1001 to 1009 are provided, it is necessary to provide nine kinds of discriminators and prepare discrimination parameters for the respective discriminators.
In this connection, it has been proposed to reduce the number of discriminators and the size of discrimination parameters by commonalizing the discriminators for the orientations which are in relations of rotation and reversal. In such a method, feature quantities which are in the relations of rotation and reversal are generated, and, by using the discriminator for one orientation, the physical objects of the orientations which are in the relations of rotation and reversal with respect to that orientation are detected. For example, in FIG. 10, when the discriminators which are in the relation of rotation are commonalized, only three kinds of discrimination parameters, e.g., for the orientations 1002, 1005 and 1008, need to be provided. Alternatively, when the discriminators which are in the relation of right-and-left reversal are commonalized, only five kinds of discrimination parameters, e.g., for the orientations 1001 to 1005, need to be provided.
As a concrete example, the method of rotating and reversing an image and extracting feature quantities respectively from the rotated and reversed images has been proposed (see Japanese Patent No. 4628882, Japanese Patent No. 4238537, and Japanese Patent Application Laid-Open No. 2009-32022). However, in this method, since it is necessary to repetitively perform a feature quantity extracting process, processing time is prolonged.
Moreover, a method of changing the pixel position used for calculating a feature quantity according to rotation and reversal has been proposed (see Japanese Patent No. 4556891, and International Publication No. WO2009/078155). In this method, it is possible to generate the feature quantities which are in the relations of rotation and reversal by changing pixel reference positions without rotating and reversing an image. However, this method is applicable only to a case where the original pixel value is referred to as it is. In a case where a feature quantity that takes into account the relation among a plurality of pixels, such as LBPs (local binary patterns), is extracted from the image and referred to, the relation among the pixels is changed by rotation and reversal. Therefore, in order to detect a physical object with a high degree of accuracy, it is necessary to convert not only the reference position but also the feature quantity value.
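The need to convert the feature quantity value as well as the reference position can be illustrated with a small sketch. It assumes a basic 8-neighbor LBP whose bits are assigned clockwise starting at the upper-left neighbor; this bit ordering, the helper names, and the toy image are assumptions made for the illustration and are not taken from the cited references.

```python
# Neighbor offsets (dy, dx), clockwise from the upper-left; bit k uses OFFSETS[k].
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
# Under right-and-left reversal, the neighbor at (dy, dx) moves to (dy, -dx).
MIRROR_BIT = [2, 1, 0, 7, 6, 5, 4, 3]


def lbp_image(img):
    """Compute an 8-bit LBP code for every interior pixel of a 2-D list."""
    height, width = len(img), len(img[0])
    out = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            code = 0
            for k, (dy, dx) in enumerate(OFFSETS):
                if img[y + dy][x + dx] >= img[y][x]:
                    code |= 1 << k
            out[y - 1][x - 1] = code
    return out


def mirror(grid):
    """Reverse a 2-D list right-and-left (changes positions only)."""
    return [row[::-1] for row in grid]


def mirror_code(code):
    """Permute the bits of one LBP code to match a right-and-left reversal."""
    out = 0
    for k in range(8):
        if code & (1 << MIRROR_BIT[k]):
            out |= 1 << k
    return out


img = [
    [1, 2, 3, 4, 5],
    [6, 9, 7, 8, 0],
    [3, 1, 4, 1, 5],
    [9, 2, 6, 5, 3],
]
direct = lbp_image(mirror(img))            # reverse the image, then extract
position_only = mirror(lbp_image(img))     # convert reference positions only
position_and_value = [[mirror_code(c) for c in row] for row in position_only]

print(direct == position_only)       # False: positions alone are not enough
print(direct == position_and_value)  # True: positions and values both converted
```

For a feature that refers to raw pixel values, the position change alone would suffice; for the LBP in this sketch, the bit permutation is what restores agreement with the feature extracted from the reversed image.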
Moreover, the method of converting a reference position of a calculated feature quantity and a value of the feature quantity according to rotation and reversal has been proposed (see Japanese Patent Application Laid-Open No. 2012-203691). In this method, the feature quantity is extracted from an image, and the feature quantities which are in the relations of rotation and reversal are generated by converting the reference position of the extracted feature quantity and the value of the feature quantity at the reference position. Here, it is necessary to provide a conversion table for converting the value of the feature quantity. In other words, since this method is premised on use of the conversion table like this, there is a problem that a cost for conversion increases when the number of bits of the feature quantity is large. For example, when the number of bits of the feature quantity is 16, it is necessary to provide the conversion table of 16×2^16=1 Mbit.
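The table size in the example above follows from storing one converted value of b bits for each of the 2^b possible input values, that is, b × 2^b bits per table. A short check of this arithmetic is given below; the function name is hypothetical.

```python
def conversion_table_bits(feature_bits):
    """Bits needed for a full lookup table: one entry per possible feature value."""
    entries = 2 ** feature_bits      # number of distinct feature values
    return entries * feature_bits    # each entry stores one converted value


for bits in (8, 16):
    size = conversion_table_bits(bits)
    print(f"{bits}-bit feature: {size} bits = {size / 2**20} Mbit")
```

Doubling the number of bits from 8 to 16 raises the table size from 2,048 bits to 1,048,576 bits (1 Mbit), which shows how quickly the cost of a table-based conversion grows.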
Moreover, the method of managing a feature quantity which is invariant with respect to rotation and reversal has been proposed (see T. Ojala, M. Pietikainen and T. Maenpaa, “Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns”, IEEE Trans. on PAMI, Vol. 24, No. 7, pp. 971-987, 2002; and Q. Tao and R. Veldhuis, “Illumination Normalization Based on Simplified Local Binary Patterns for A Face Verification System”, Biometrics Symposium, pp. 1-6, 2007). In this method, a process of converting the value of the feature quantity is unnecessary for extracting the feature quantity invariant with respect to rotation and reversal. However, information concerning an image directional characteristic is not included in the feature quantity. Therefore, detection accuracy for this feature quantity deteriorates as compared with the feature quantity which includes the information concerning the directional characteristic.
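The loss of directional information can be illustrated with the rotation-invariant mapping for LBPs, in which a code is replaced by the minimum over all circular rotations of its bits. The sketch below covers rotation only and assumes an 8-bit code whose bits are assigned clockwise starting at the upper-left neighbor; the bit ordering and the two example patterns are assumptions made for the illustration.

```python
def rotate_right(code, shift, bits=8):
    """Circularly rotate a bit pattern to the right by `shift` positions."""
    shift %= bits
    mask = (1 << bits) - 1
    return ((code >> shift) | (code << (bits - shift))) & mask


def rotation_invariant(code, bits=8):
    """Map a code to the minimum value among all of its circular rotations."""
    return min(rotate_right(code, s, bits) for s in range(bits))


# Bits 0-2 set: the three neighbors above the center are brighter.
bright_above = 0b00000111
# Bits 2-4 set: the three neighbors to the right of the center are brighter.
bright_right = 0b00011100

print(bright_above != bright_right)                                          # True
print(rotation_invariant(bright_above) == rotation_invariant(bright_right))  # True
```

The two patterns describe edges of different directions, yet they are mapped to the same invariant value, so a discriminator that refers only to the invariant value cannot distinguish them.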
The present invention aims to enable a feature quantity for detecting an object to be generated easily and accurately.