Field of the Invention
The present invention relates to a recognition apparatus and a recognition method, and, more particularly, to a technique suitable for use in estimating attribute information (for example, a scene, an event, composition, or a main subject) of an input image such as a still image, a moving image, or a distance image on the basis of an object in the input image.
Description of the Related Art
A known method of estimating the scene or event of an image on the basis of an object in the image is disclosed in Li-Jia Li, Hao Su, Yongwhan Lim, Li Fei-Fei, “Objects as Attributes for Scene Classification”, Proc. of the European Conf. on Computer Vision (ECCV 2010) (Non-Patent Document 1). In the method disclosed in Non-Patent Document 1, it is determined whether an image includes objects of a plurality of particular classes, the distribution of the determination results is used as a feature value, and the scene of the image is determined on the basis of the feature value.
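The flow described above (per-class object determination, a feature value built from the distribution of the results, and scene determination from that feature value) can be illustrated by the following minimal sketch. It is not the implementation of Non-Patent Document 1: the class list, the scene prototypes, the function names, and the use of a nearest-prototype rule in place of a trained classifier are all assumptions made for illustration only.

```python
from math import dist

# Hypothetical object classes whose presence serves as a clue to the scene.
OBJECT_CLASSES = ["person", "cake", "dress", "car", "dog"]

# Hypothetical scene prototypes: a typical detection score per object class.
# A trained classifier would normally take the place of this table.
SCENE_PROTOTYPES = {
    "birthday party": [1.0, 1.0, 0.0, 0.0, 0.0],
    "wedding":        [1.0, 1.0, 1.0, 0.0, 0.0],
    "street":         [1.0, 0.0, 0.0, 1.0, 1.0],
}

def presence_feature(detections):
    """Turn per-class detection scores (0..1) into a fixed-length feature value.

    Classes with no detection result are treated as absent (score 0.0).
    """
    return [float(detections.get(name, 0.0)) for name in OBJECT_CLASSES]

def classify_scene(detections):
    """Return the scene whose prototype is closest to the feature value."""
    feature = presence_feature(detections)
    return min(SCENE_PROTOTYPES, key=lambda s: dist(feature, SCENE_PROTOTYPES[s]))

if __name__ == "__main__":
    print(classify_scene({"person": 0.9, "cake": 0.8, "dress": 0.7}))
```

In this sketch every entry of OBJECT_CLASSES corresponds to one detector that must be run on the image, so the length of the feature value and the number of detectors grow together.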
This method requires preparing a plurality of detectors for recognizing subjects that serve as clues to scene determination (for example, detectors using the method disclosed in P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, “Object Detection with Discriminatively Trained Part Based Models”, IEEE Trans. on Pattern Analysis and Machine Intelligence 2010 (Non-Patent Document 2)) and performing detection processing for each particular subject, such as a dog or a car. This gives rise to the following difficulties.
First, in order to accurately determine the types of many scenes, it is necessary to prepare detectors for the many subjects associated with each of these scenes. In the detection processing disclosed in Non-Patent Document 2, each detector scans the image using a technique called a sliding window, so the amount of computation becomes large. The processing time taken for scene determination may therefore increase markedly as the number of scenes, and hence the number of required detectors, increases.
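The growth in computation can be made concrete by counting window evaluations. The following sketch is a rough cost model only: the image size, window size, stride, number of scales, and the halving of the image at each scale are assumed values chosen for illustration and are not taken from Non-Patent Document 2.

```python
def window_count(width, height, win, stride):
    """Number of square window positions visited at a single image scale."""
    if width < win or height < win:
        return 0
    cols = (width - win) // stride + 1
    rows = (height - win) // stride + 1
    return cols * rows

def total_evaluations(width, height, win, stride, num_scales, num_detectors):
    """Window evaluations when every detector scans every scale.

    Assumption: the image is halved in each dimension from one scale to the next.
    """
    per_detector = 0
    w, h = width, height
    for _ in range(num_scales):
        per_detector += window_count(w, h, win, stride)
        w, h = w // 2, h // 2
    return per_detector * num_detectors

if __name__ == "__main__":
    # Hypothetical 640x480 image, 64-pixel window, 8-pixel stride, 3 scales.
    for detectors in (10, 100):
        print(detectors, total_evaluations(640, 480, 64, 8, 3, detectors))
```

Under these assumptions the count is proportional to the number of detectors, so preparing ten times as many subject detectors costs ten times as many window evaluations per image.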
Second, in scene determination, it is not generally known which subjects are important. Therefore, it is difficult to determine in advance which types of subject detectors should be prepared when discriminating between slightly different scenes.
Third, when discriminating between, for example, the scene of a birthday party and the scene of a wedding, the difference between the clothes a person wears (for example, the difference between informal clothes and a dress) can be used for the discrimination. Thus, in some cases, what matters is not the presence or absence of a subject but the difference between variations of the subject. However, with the related-art method disclosed in Non-Patent Document 1, it is difficult to use the difference between variations of a subject for discrimination.
The present invention provides a recognition apparatus capable of determining, with a low processing load, the scene of an image on the basis of various subjects included in the image.