Conventionally, in the field of image processing, technologies are well known for detecting (extracting), from an image, an image region to which a human is expected to pay attention or a noteworthy image region (hereinafter, each such image region is referred to as a salient region). With such a salient region detecting technology, a saliency measure is calculated for each pixel in the image, and a saliency map indicating the saliency measure of each pixel in the image is also produced.
For example, the salient region detecting technology can be used to detect a main subject from the image.
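As a minimal illustrative sketch of these two ideas (a per-pixel saliency map, and its use to locate a main subject), the following assumes a grayscale image stored as nested lists. The saliency measure used here, the absolute deviation of each pixel from the mean intensity, is a simple stand-in chosen for illustration and is not the method of any publication cited in this section. The names `saliency_map`, `main_subject_box`, and `threshold` are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[float]]  # grayscale image as rows of intensity values


def saliency_map(image: Image) -> Image:
    """Illustrative saliency measure (an assumption, not a cited method):
    absolute deviation of each pixel from the mean intensity, scaled to [0, 1]."""
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    raw = [[abs(v - mean) for v in row] for row in image]
    peak = max(max(row) for row in raw)
    if peak == 0:
        # Uniform image: no pixel stands out, so every saliency measure is 0.
        return [[0.0 for _ in row] for row in raw]
    return [[v / peak for v in row] for row in raw]


def main_subject_box(
    saliency: Image, threshold: float = 0.5
) -> Optional[Tuple[int, int, int, int]]:
    """Inclusive bounding box (top, left, bottom, right) of all pixels whose
    saliency measure is at least `threshold`, or None if there are none."""
    coords = [
        (y, x)
        for y, row in enumerate(saliency)
        for x, v in enumerate(row)
        if v >= threshold
    ]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


if __name__ == "__main__":
    # A bright 2x2 block on a dark background plays the role of the main subject.
    img = [
        [0, 0, 0, 0],
        [0, 10, 10, 0],
        [0, 10, 10, 0],
        [0, 0, 0, 0],
    ]
    print(main_subject_box(saliency_map(img)))
```

Thresholding the map and taking a bounding box is only one possible way to turn a saliency map into a main-subject estimate; the 0.5 threshold is an arbitrary illustrative value.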
A learning-based algorithm is one type of algorithm used to detect the salient region. For example, in Japanese Unexamined Patent Publication No. 2001-236508, the type of feature is learned and decided in advance based on data of a plurality of images used as a learning target, and the feature of each portion of target image data is extracted based on the decided type of feature and on the target image data for which a saliency measure is to be calculated. According to the technology of Japanese Unexamined Patent Publication No. 2001-236508, a saliency measure closer to human perception can be determined by treating the learning effect as a form of human experience or memory.
However, the above learning-based algorithm requires that a plurality of pieces of image data be prepared in advance as the learning target, which serves as previous knowledge for the target image data. Therefore, the saliency measure cannot be evaluated in the case where such previous knowledge does not exist.
On the other hand, Japanese Unexamined Patent Publication No. 2010-258914 discloses a technology in which a salient region is detected using information between frames of a video image, without requiring previous knowledge.
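To illustrate in general terms what using information between frames without previous knowledge can mean, the following minimal sketch computes a per-pixel saliency measure from the absolute difference between two consecutive grayscale frames. Frame differencing is an assumption made here purely for illustration; this section does not describe the actual processing of Japanese Unexamined Patent Publication No. 2010-258914. The function name `motion_saliency` is hypothetical.

```python
from typing import List

Frame = List[List[float]]  # grayscale video frame as rows of intensity values


def motion_saliency(prev_frame: Frame, curr_frame: Frame) -> Frame:
    """Illustrative inter-frame saliency (an assumption, not the cited method):
    absolute per-pixel difference between consecutive frames, scaled to [0, 1]."""
    diff = [
        [abs(c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]
    peak = max(max(row) for row in diff)
    if peak == 0:
        # Identical frames: nothing changed, so no pixel is salient.
        return [[0.0 for _ in row] for row in diff]
    return [[v / peak for v in row] for row in diff]


if __name__ == "__main__":
    previous = [[0, 0], [0, 0]]
    current = [[0, 4], [2, 0]]
    print(motion_saliency(previous, current))
```

No learned data is involved in this sketch, which is the point of the illustration; conversely, it needs at least two frames and therefore cannot be applied to a single still image.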