Field of the Invention
The present invention relates to a technique for dividing an image into regions according to predefined classes.
Description of the Related Art
Conventionally, there is known a process in which an image is divided into a plurality of small regions and a class relating to the classification of objects is then identified for each small region, for use in post-processing such as image scene recognition and image quality correction suitable for the objects. In a method discussed in (R. Socher, “Parsing Natural Scenes and Natural Language with Recursive Neural Networks”, International Conference on Machine Learning, 2011.), first, an input image is divided into small regions called superpixels (SPs) based on color information and texture information. Then, a class of each divided small region is identified using a classifier called a recursive neural network (RNN).
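The two-stage flow described above (divide the image into small regions, then identify a class for each region) can be illustrated with a minimal sketch. It is not the method of the cited reference: a fixed grid stands in for superpixels, a mean color stands in for the color and texture feature amounts, and a nearest-prototype rule stands in for the RNN classifier. All function names and the `prototypes` table are hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[float, float, float]   # (R, G, B)
Image = List[List[Pixel]]            # image[row][col]
Region = List[Tuple[int, int]]       # list of (row, col) coordinates


def split_into_blocks(image: Image, block: int) -> List[Region]:
    """Divide the image into square blocks (assumed stand-in for superpixels)."""
    height, width = len(image), len(image[0])
    regions: List[Region] = []
    for r0 in range(0, height, block):
        for c0 in range(0, width, block):
            regions.append([
                (r, c)
                for r in range(r0, min(r0 + block, height))
                for c in range(c0, min(c0 + block, width))
            ])
    return regions


def mean_color(image: Image, region: Region) -> Pixel:
    """Feature amount of a region: here simply its mean color."""
    sums = [0.0, 0.0, 0.0]
    for r, c in region:
        for k in range(3):
            sums[k] += image[r][c][k]
    n = len(region)
    return (sums[0] / n, sums[1] / n, sums[2] / n)


def classify(feature: Pixel, prototypes: Dict[str, Pixel]) -> str:
    """Nearest-prototype rule (assumed stand-in for the RNN classifier)."""
    def sq_dist(a: Pixel, b: Pixel) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda name: sq_dist(feature, prototypes[name]))


def label_regions(image: Image, block: int,
                  prototypes: Dict[str, Pixel]) -> List[str]:
    """Identify a class for every small region of the image."""
    return [classify(mean_color(image, region), prototypes)
            for region in split_into_blocks(image, block)]
```

Because each region is classified from its own feature amounts alone, nothing in this flow uses the context of the rest of the image.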
However, performing the identification based only on feature amounts of the small regions sometimes leads to false detection even when the reliability (identification score or identification likelihood) is high. A technique is also known in which a similar image is selected using global feature amounts of an image and then a class of each region of an identification target image is estimated based on class information about each region in the similar image. The method discussed in (J. Tighe, “SuperParsing: Scalable Nonparametric Image Parsing with Superpixels”, European Conference on Computer Vision, 2010.) selects a similar image based on global feature amounts of an identification target image and then determines a class of each small region of the identification target image by use of the selected similar image.
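The retrieval-based approach described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the cited reference: a single nearest image is selected by squared Euclidean distance between global feature vectors, and each region of the identification target image takes the class of the closest region in that image. The `LabeledImage` type and all function names are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple

Feature = Tuple[float, ...]


class LabeledImage(NamedTuple):
    """A database image with a global feature and per-region (feature, class) pairs."""
    name: str
    global_feature: Feature
    regions: List[Tuple[Feature, str]]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve_similar(query_global: Feature,
                     database: List[LabeledImage]) -> LabeledImage:
    """Select the database image whose global feature is closest to the query."""
    return min(database, key=lambda img: sq_dist(query_global, img.global_feature))


def transfer_labels(query_regions: List[Feature],
                    similar: LabeledImage) -> List[str]:
    """Give each query region the class of its nearest region in the similar image."""
    labels = []
    for feature in query_regions:
        _, label = min(similar.regions, key=lambda pair: sq_dist(feature, pair[0]))
        labels.append(label)
    return labels
```

In this sketch the only classes a query region can receive are those present in the selected similar image, so the outcome depends entirely on which image the global feature amounts retrieve.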
However, when a search for a similar image is performed based only on global feature amounts of an image as in the method discussed in (J. Tighe, “SuperParsing: Scalable Nonparametric Image Parsing with Superpixels”, European Conference on Computer Vision, 2010.), a specific region of an identification target sometimes cannot be extracted accurately. For example, in a case where a skin region of a black person in a beach scene image is to be extracted, if a search for a similar image is performed based only on global feature amounts of the image, an image of a beach is selected as the similar image. In such a case, the specific region (skin region) of the identification target (human body) cannot be extracted as accurately as in a case where an image of a black person has been selected as the similar image.