Field of the Invention
The present invention relates generally to image recognition and, more specifically, to technology for segmenting an input image into regions belonging to predetermined categories.
Description of Related Art
Research on image recognition technology for recognizing a target object in an input image is attracting increasing interest. Examples of image recognition include face recognition, which determines the position of a face present in an image; human-body detection, which detects a human body within an image; and scene recognition, which recognizes the environment or conditions under which an image was captured. File formats typically used for input images in image recognition, such as Joint Photographic Experts Group (JPEG) or bitmap (BMP), are designed with storage space (image size) and viewing detail (amount of information) in mind. Consequently, a red, green, blue (RGB) image read from data in a conventional file format fails to provide information sufficient for highly accurate image recognition. For this reason, it has also been proposed that information obtained when an image is captured be used for image recognition.
Japanese Patent Application Laid-Open No. 2010-220197 describes a technique in which a shadow region is discriminated from a non-shadow region to obtain an adequate white balance (WB) coefficient for each region of an image that is captured in fine weather and in which blue fogging may occur. In this technique, parameters used when the image is captured are used to obtain a photometric value; the brightness values of pixels are corrected by using the photometric value; a photometric-value mapping is generated for the pixels; whether the scene of the image is an indoor scene or an outdoor scene is determined; and, when the scene is an outdoor scene, shadow regions are determined. Examples of the parameters used when an image is captured include exposure time, sensitivity, and aperture value (F-number).
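For illustration only, the following sketch shows one common way a photometric value can be derived from exposure time, sensitivity, and F-number, and then spread over the pixels. It assumes the APEX convention (Bv = Av + Tv - Sv), which the cited application is not stated to use. The function names `apex_brightness_value` and `photometric_map`, and the mid-gray reference of 0.18, are hypothetical choices made for this example and are not taken from the cited application.

```python
import math


def apex_brightness_value(exposure_time_s, f_number, iso):
    """Scene brightness in APEX units, Bv = Av + Tv - Sv (assumed convention)."""
    av = 2.0 * math.log2(f_number)      # aperture value
    tv = -math.log2(exposure_time_s)    # time value
    sv = math.log2(iso / 3.125)         # speed value (ISO 100 gives Sv = 5)
    return av + tv - sv


def photometric_map(luma, scene_bv, mid_gray=0.18, eps=1e-6):
    """Hypothetical per-pixel map: offset the scene value by each pixel's
    linear luminance relative to mid-gray, expressed in stops."""
    return [
        [scene_bv + math.log2(max(y, eps) / mid_gray) for y in row]
        for row in luma
    ]


# Example capture parameters (made up for illustration).
bv = apex_brightness_value(exposure_time_s=1 / 64, f_number=4.0, iso=100)

# A 2x2 image of linear luminance values in the range 0..1.
pixel_bv = photometric_map([[0.18, 0.36], [0.09, 0.18]], bv)
```

In this sketch a pixel at mid-gray receives the scene value itself, and each doubling or halving of luminance shifts the value by one stop. A map of this kind is the sort of input from which bright and shadowed areas could later be separated.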
Japanese Patent Application Laid-Open No. 2008-86021 describes a technique in which each of the frequency, the signal-to-noise (S/N) ratio, the compression scheme, the brightness, and the like of an input image signal is subjected to clustering, and coefficients corresponding to the class classification results are used to perform image processing.
Image recognition processing in which an input image is segmented into regions belonging to predetermined categories is known and is called “semantic segmentation”. Image recognition processing using semantic segmentation may be applied to image correction, scene interpretation, and the like, so as to produce results suited to the objects to be recognized.
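As a minimal sketch of what semantic segmentation produces, the following example assigns each pixel the category with the highest score, yielding a label map from which regions can be read off. The category names, the score values, and the function name `segment` are all hypothetical. How the per-pixel scores are obtained is outside the scope of this sketch.

```python
def segment(scores, categories):
    """Return a label map in which each pixel carries the category
    whose score is highest at that pixel."""
    label_map = []
    for row in scores:
        label_row = []
        for pixel_scores in row:
            best = max(range(len(categories)), key=lambda k: pixel_scores[k])
            label_row.append(categories[best])
        label_map.append(label_row)
    return label_map


# Hypothetical predetermined categories and per-pixel scores for a 2x2 image.
categories = ["sky", "grass", "person"]
scores = [
    [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]],
    [[0.1, 0.6, 0.3], [0.2, 0.3, 0.5]],
]
labels = segment(scores, categories)
```

Adjacent pixels that share a label form one region, so the top row of this example would be a single "sky" region.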
However, for image recognition processing using semantic segmentation, a configuration in which regions belonging to categories are accurately recognized by using subsidiary information obtained when an image is captured has been neither proposed nor effectively implemented.