1. Field of the Invention
The present invention relates to a learning method and device for pattern recognition, and in particular to a learning method and device suitable for training a function for detecting and recognizing faces, people, vehicles and other objects from image data.
2. Related Background Art
In the field of image recognition, methods have been proposed for learning the feature amounts needed to detect a subject to be recognized. One example is disclosed in M. Weber et al., “Viewpoint-Invariant Learning and Detection of Human Heads”, Proceedings of 4th International Conference on Automatic Face and Gesture Recognition, 2000, p. 20-27. The technique disclosed in this document runs a so-called interest operator on an image to extract local feature points such as corners and line intersections, and then applies clustering by vector quantization, such as the k-means method, to extract a small number of useful features.
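The clustering step described above can be illustrated with a minimal sketch. It is not the implementation of the cited document: the two-dimensional `descriptors`, the function names, and the evenly spaced initialization are assumptions made only for illustration, and the interest operator that would produce the local feature vectors is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Cluster feature vectors into k groups; returns (centroids, labels)."""
    # Simplified, deterministic initialization (an assumption of this sketch):
    # take k evenly spaced input vectors as the starting centroids.
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical local feature vectors, e.g. descriptors taken around
# corners and line intersections found by an interest operator.
descriptors = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
centroids, labels = kmeans(descriptors, k=2)
```

Each resulting centroid acts as one representative feature, so a large set of local feature points is reduced to `k` prototypes.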
The techniques disclosed in Sirovich et al., “Low-dimensional procedure for the characterization of human faces”, J. Opt. Soc. Am. [A], 1987, vol. 3, p. 519-524 and Lades et al., “Distortion Invariant Object Recognition in the Dynamic Link Architecture”, IEEE Trans. on Computers, 1993, vol. 42, p. 300-311 are examples of how to recognize an image. The technique according to the former document recognizes an image by calculating a feature amount that expresses similarity to a model. Specifically, an input pattern is mapped onto an eigenimage function space, which is obtained through principal component analysis of model images of a subject, and the distance from the model is calculated in the feature space. The technique according to the latter document represents the results of feature extraction (feature vectors), together with their spatial arrangement in relation to one another, as a graph, and calculates the similarity through elastic graph matching to recognize an image.
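The former approach, mapping an input pattern onto a space derived from the principal components of model images and measuring a distance there, can be sketched as follows. This is an illustrative simplification under stated assumptions rather than the procedure of the cited document: only the leading component is kept, it is estimated by power iteration, and the tiny two-element `model_patterns` and all function names are hypothetical.

```python
import math


def mean_vector(samples):
    """Element-wise mean of a list of equal-length patterns."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def top_component(samples, iterations=100):
    """Estimate the leading principal component by power iteration."""
    mu = mean_vector(samples)
    centered = [[x - m for x, m in zip(s, mu)] for s in samples]
    dim = len(mu)
    # Non-uniform start vector; the sketch assumes it is not orthogonal
    # to the leading component.
    v = [float(j + 1) for j in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    v = [a / norm for a in v]
    for _ in range(iterations):
        # Accumulate sum_i (x_i . v) x_i, which is proportional to C v
        # for the covariance matrix C, without forming C explicitly.
        w = [0.0] * dim
        for x in centered:
            dot = sum(a * b for a, b in zip(x, v))
            for j in range(dim):
                w[j] += dot * x[j]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:
            break
        v = [a / norm for a in w]
    return mu, v


def project(pattern, mu, component):
    """Coordinate of a pattern along the component (the feature space)."""
    return sum((p - m) * c for p, m, c in zip(pattern, mu, component))


def feature_distance(pattern_a, pattern_b, mu, component):
    """Distance between two patterns measured in the feature space."""
    return abs(project(pattern_a, mu, component) - project(pattern_b, mu, component))


# Hypothetical model patterns standing in for flattened model images.
model_patterns = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
mu, component = top_component(model_patterns)
distance = feature_distance((0.0, 0.0), (3.0, 3.0), mu, component)
```

A practical system would retain several components and compare the input against each stored model; a small distance in the feature space is then read as high similarity.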
Examples of pattern recognition methods using a neural network model inspired by the brain's mechanism of processing information include ones that involve hierarchical template matching (see, for example, M. Weber et al., “Viewpoint-Invariant Learning and Detection of Human Heads”, Proceedings of 4th International Conference on Automatic Face and Gesture Recognition, 2000, p. 20-27 and Fukushima & Miyake, “Neocognitron: A new algorithm for pattern recognition tolerant of deformation and shift in position”, Pattern Recognition, 1982, vol. 15, p. 455-469), ones that employ a multilayer perceptron, and ones that use a radial basis function network.
The learning method according to M. Weber et al., “Viewpoint-Invariant Learning and Detection of Human Heads”, Proceedings of 4th International Conference on Automatic Face and Gesture Recognition, 2000, p. 20-27 has a problem in that the extracted features, while effective for specific subjects, may not be effective for detecting and recognizing subjects in other categories.
Also, none of the recognition algorithms according to Sirovich et al., “Low-dimensional procedure for the characterization of human faces”, J. Opt. Soc. Am. [A], 1987, vol. 3, p. 519-524; Lades et al., “Distortion Invariant Object Recognition in the Dynamic Link Architecture”, IEEE Trans. on Computers, 1993, vol. 42, p. 300-311; Fukushima & Miyake, “Neocognitron: A new algorithm for pattern recognition tolerant of deformation and shift in position”, Pattern Recognition, 1982, vol. 15, p. 455-469; and JP 60-712 B has fully met the demands that a recognition algorithm be robust to changes in the size, direction or the like of a subject to be recognized and that it be applicable to detection and recognition of subjects in different categories.