Field of the Invention
The present invention relates to image processing for setting image process parameters used in an image recognition process that detects an object from an input image.
Description of the Related Art
Various kinds of research and development have been carried out concerning image recognition, in which a detection target object is detected from an image obtained by capturing the object. Image recognition technology is applied in various fields and used for many practical problems, for example, face recognition in photographs and part recognition in factories.
Such image recognition can be considered from the viewpoint of pattern recognition. In pattern recognition as well, research has been conducted on classifiers, that is, on how to classify input information. Various methods have been proposed, such as neural networks, the Support Vector Machine (SVM), and Randomized Trees (RT).
The performance of these classifiers greatly depends on the method of extracting information from an image, and the extraction method includes various kinds of image processes. Examples are noise removal, which removes unnecessary information from an image; gamma correction, which adjusts luminance values; and edge detection, which obtains edge information as an image feature. In addition, various image feature extraction methods are known, including methods that extract a feature in a predetermined region of an image as a vector. Examples are Histograms of Oriented Gradients (HOG) and Scale-Invariant Feature Transform (SIFT).
The image processes executed when extracting information from an image have various parameters, depending on the method. Examples are the variance of a Gaussian filter used in noise removal, and the cell size and cell count in HOG. The values set for these parameters greatly affect the information obtained from an image.
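As an illustration of how strongly a single parameter can change the extracted information, the following minimal sketch smooths a one-dimensional step edge with a Gaussian filter at two standard deviations and compares the steepest remaining step. The function names, the one-dimensional signal, and the border clamping are assumptions made for this example only and are not taken from any particular method described here.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma, radius=3):
    """Convolve the signal with a Gaussian kernel, clamping at the borders."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to valid range
            acc += w * signal[j]
        out.append(acc)
    return out

def max_step(signal):
    """Largest absolute difference between neighboring samples (edge strength)."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

# A synthetic step edge: dark on the left, bright on the right.
edge = [0.0] * 8 + [1.0] * 8

sharp = max_step(smooth(edge, sigma=0.5))    # small variance keeps the edge steep
blurred = max_step(smooth(edge, sigma=2.0))  # large variance flattens the edge
```

In this sketch the larger standard deviation leaves a much weaker edge, so a subsequent edge detection step would receive noticeably different input depending on the value chosen.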
In the above-described classifiers, the optimum values of the image process parameters needed to obtain optimum performance change depending on the target object, environmental conditions, and the like. In many cases, the parameters are set based on the user's experience or by trial and error. For this reason, setting the image process parameters in image recognition puts a heavy load on the user. To reduce this load, methods of easily setting the parameters have been proposed.
For example, there has been proposed a first method of automatically determining optimum parameters in machine vision tools. According to the first method, an image used for parameter adjustment is captured first. The object is then marked using a bounding box or the like, thereby giving ground truth (accuracy information) for the object. The machine vision tools are executed sequentially while changing the parameter combination. The detection result obtained with the current parameter combination is compared with the ground truth, and the comparison result is then compared with that of the preceding parameter combination. The parameter combination that yields the better comparison result is retained, so that the best parameters remain at the end.
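The flow of the first method can be sketched as follows. This is only an illustrative sketch under stated assumptions: the toy detector `detect`, its parameters `threshold` and `margin`, and the use of intersection over union (IoU) as the comparison measure are all hypothetical choices made for the example; the description above does not specify how the detection result is compared with the ground truth.

```python
from itertools import product

def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def detect(image, threshold, margin):
    """Hypothetical detector: box around pixels above threshold, grown by margin."""
    pts = [(x, y) for y, row in enumerate(image)
           for x, v in enumerate(row) if v > threshold]
    if not pts:
        return None
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + 1 + margin, max(ys) + 1 + margin)

def search(image, truth, thresholds, margins):
    """Try every parameter combination and keep the one closest to the ground truth."""
    best_params, best_score = None, -1.0
    for t, m in product(thresholds, margins):
        box = detect(image, t, m)
        score = iou(box, truth) if box else 0.0
        if score > best_score:  # retain the better of current and preceding
            best_params, best_score = (t, m), score
    return best_params, best_score

# 6x6 test image: a bright 2x2 object plus one dim noise pixel.
image = [[0.0] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        image[y][x] = 0.9
image[0][0] = 0.3

truth = (2, 2, 4, 4)  # bounding box marked by the user
best_params, best_score = search(image, truth, [0.1, 0.5], [0, 1])
```

The sketch also makes the dependence on the ground truth visible: the selected parameters are only as good as the box supplied in `truth`.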
There has also been proposed a second method of optimizing image process parameters used to convert a photographed image when detecting a target object. According to the second method, an image process is performed on a photographed image in accordance with image process parameters, and a detection process is performed on the resulting image. If the result of the detection process indicates a detection error, the image process parameters are changed. The image process parameters are determined in this way.
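The second method can be pictured as a loop that changes a parameter whenever detection fails. In the minimal sketch below, gamma correction stands in for the image process, and a detection error is assumed to mean that too few pixels exceed a fixed level; the functions `gamma_correct`, `detect`, and `tune_gamma`, the candidate list, and this error criterion are all hypothetical and are used only to illustrate the idea.

```python
def gamma_correct(image, gamma):
    """Apply gamma correction to an image with values in [0, 1]."""
    return [[v ** gamma for v in row] for row in image]

def detect(image, level=0.5, min_pixels=4):
    """Hypothetical detector: returns hit pixels, or None on a detection error."""
    hits = [(x, y) for y, row in enumerate(image)
            for x, v in enumerate(row) if v > level]
    return hits if len(hits) >= min_pixels else None

def tune_gamma(photo, candidates):
    """Change the parameter after each detection error; stop at the first success."""
    for gamma in candidates:
        if detect(gamma_correct(photo, gamma)) is not None:
            return gamma
    return None  # no candidate produced a successful detection

# Underexposed 4x4 photograph: a dim 2x2 object on a black background.
photo = [[0.0] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        photo[y][x] = 0.2

chosen = tune_gamma(photo, [1.0, 0.7, 0.4, 0.2])
```

Because the parameter is tuned only against the photographed image, nothing in this loop indicates whether the same value suits images obtained under other conditions.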
In the first method, however, the operation of giving ground truth is complicated for the user. In addition, ground truth subjectively set by the user is not necessarily accurate. If the ground truth given by the user includes a large error, the result of comparison between the detection result and the ground truth cannot have an accurate value, and the reliability of the determined parameters is low.
In an image recognition system that generates a classifier by learning using a training image, an image process must also be applied to the training image. If the training image and the input image from which an object should be detected are obtained under different conditions (for example, if the training image is a computer graphics (CG) image, whereas the input image is a photographed image), the optimum image process parameter for the training image and that for the input image are not always the same. In this case, it is not appropriate to apply, to the training image, an image process parameter obtained by applying the second method to the photographed image.