The present invention relates to an image processing device, an information storage device, an image processing method, and the like.
In recent years, supervised learning has been studied in the field of machine learning. When it is desired to detect the position of an object within an image, the contents of the image may be discriminated (classified) using a discriminator (classifier) that is generated by performing a learning process. This technique may also be applied when it is desired to classify an image that only partially includes an object or the like that is represented by the correct answer label. For example, JP-A-2008-282267 discloses a method that classifies such an image based on the feature quantity of part of the target image.
The discrimination accuracy (classification accuracy) achieved using the generated discriminator (classifier) normally increases as the amount of training data (i.e., the number of pieces of training data) used for the learning process increases, and it becomes possible to automatically assign a correct label to unlabeled data.
However, the correct answer label is normally manually assigned to the training data. Therefore, it may be difficult to provide a large amount of training data, or the cost required to generate the training data may increase.
When implementing a learning process that generates a classifier for detecting the position of an object or the like from an image that partially includes a scene or an object that is represented by the correct answer label, it is necessary to provide the position, the shape, and the like of the object within the image as training data. However, it takes time to manually provide information about the position and the shape of the object as compared with a process that assigns an attribute class label to an image. As a result, the number of pieces of training data to be provided decreases, and the performance of the classifier (i.e., learning results) deteriorates.
Semi-supervised learning has been developed from supervised learning. Semi-supervised learning utilizes unlabeled data as the training data in addition to data to which the correct answer label is assigned. Generative learning has been proposed as one type of semi-supervised learning in which image data is mainly used as the learning-discrimination target, and a new image is generated from an image to which the correct answer label is assigned, and used for the learning process.
However, under the assumption that a new image generated from an image to which the correct answer label is assigned has the same correct answer label as the original image, the original image can be changed only to the extent that the correct answer label does not change, and it is therefore impossible to generate a large number of new images. As a result, it is impossible to sufficiently increase the number of pieces of training data, or to sufficiently improve the discrimination accuracy achieved using the discriminator.
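The limitation of label-preserving generation can be illustrated with a minimal sketch. The nested-list image representation, the function names, and the choice of flips as the permitted changes are assumptions made for illustration only and do not appear in the cited related art; whether a flip actually preserves the correct answer label depends on the classification task.

```python
from typing import List, Tuple

# Assumed representation: a grayscale image as rows of pixel values.
Image = List[List[int]]


def flip_horizontal(image: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image: Image) -> Image:
    """Reverse the order of the rows (top to bottom)."""
    return [row[:] for row in image[::-1]]


def generate_label_preserving(image: Image, label: str) -> List[Tuple[Image, str]]:
    """Generate new images that are assumed to keep the original label.

    Only changes assumed not to alter the label (here, flips) are
    allowed, so the number of distinct new images is small.
    """
    candidates = [
        flip_horizontal(image),
        flip_vertical(image),
        flip_horizontal(flip_vertical(image)),
    ]
    generated: List[Tuple[Image, str]] = []
    for candidate in candidates:
        is_new = candidate != image and all(candidate != g for g, _ in generated)
        if is_new:
            # The new image inherits the label of the original image.
            generated.append((candidate, label))
    return generated


sample = [[1, 2], [3, 4]]
new_data = generate_label_preserving(sample, "cat")
print(len(new_data))  # at most three new images under this assumption
```

In this sketch, one labeled image yields at most three new labeled images, which reflects why this approach alone cannot greatly increase the amount of training data.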
Under the assumption that the correct answer label may change when a new image is generated, a new image may be generated from an image to which the correct answer label is assigned by, for example, segmenting the original image into a plurality of images, manually assigning the correct answer label to each image of the resulting image group, and using the image group for the learning process as new training data. In this case, it is possible to sufficiently increase the number of pieces of training data. However, the cost required to label the training data increases significantly.
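The segmentation approach, and the labeling cost it incurs, can be sketched as follows. The nested-list image representation, the fixed non-overlapping tile grid, and the function name are hypothetical choices made for illustration; the related art does not specify how the original image is segmented.

```python
from typing import List, Optional, Tuple

# Assumed representation: a grayscale image as rows of pixel values.
Image = List[List[int]]


def segment_image(
    image: Image, tile_h: int, tile_w: int
) -> List[Tuple[Image, Optional[str]]]:
    """Split an image into non-overlapping tiles of tile_h x tile_w pixels.

    Each tile is returned with a label of None, because the label of the
    original image cannot be assumed to hold for every tile. Every tile
    must therefore be labeled manually before it is used for learning.
    Partial tiles at the right and bottom edges are discarded.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    tiles: List[Tuple[Image, Optional[str]]] = []
    for top in range(0, height - tile_h + 1, tile_h):
        for left in range(0, width - tile_w + 1, tile_w):
            tile = [row[left:left + tile_w] for row in image[top:top + tile_h]]
            tiles.append((tile, None))
    return tiles


sample = [[r * 4 + c for c in range(4)] for r in range(4)]
unlabeled_tiles = segment_image(sample, 2, 2)
print(len(unlabeled_tiles))  # number of tiles that still require a manual label
```

The number of tiles, and hence the number of manual labeling operations, grows with the number of segments per image, which corresponds to the increase in labeling cost described above.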