1. Field of the Invention
The present invention relates to a pattern recognition method, a pattern recognition apparatus, and a computer-readable storage medium storing a program enabling a computer to execute a pattern recognition method. Pattern recognition generally includes image recognition and speech recognition.
2. Description of the Related Art
A conventional pattern recognition method (e.g., for image recognition or speech recognition) tends to lose processing speed when higher identification accuracy (recognition accuracy) is required, and tends to lose identification accuracy when a higher processing speed is required.
To satisfy requirements in both processing speed and identification accuracy, a conventional pattern recognition method connects a first identifier improved in processing speed and a second identifier enhanced in identification accuracy (see “Robust Face Detection System Based on Convolutional Neural Networks Using Selective Activation of Modules” by Yusuke Mitarai, Katsuhiko Mori, and Masakazu Matsugu, Second Forum on Information Technology, 2003). According to such a conventional pattern recognition method, the first identifier speedily detects candidate regions and the second identifier accurately evaluates the candidate regions.
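The two-stage arrangement described above can be illustrated with a minimal sketch. The region records, the "coarse" and "fine" scores, and the 0.5 cut-offs are hypothetical values chosen only for illustration; they are not taken from the cited reference.

```python
def cascade(regions, fast_identifier, accurate_identifier):
    """Run the fast identifier on every region, then run the accurate
    identifier only on the regions the fast identifier kept."""
    candidates = [r for r in regions if fast_identifier(r)]
    accepted = [r for r in candidates if accurate_identifier(r)]
    return candidates, accepted


# Hypothetical regions: "coarse" stands in for the first identifier's
# output and "fine" for the second identifier's output.
regions = [
    {"id": 0, "coarse": 0.1, "fine": 0.0},
    {"id": 1, "coarse": 0.8, "fine": 0.9},
    {"id": 2, "coarse": 0.7, "fine": 0.2},
    {"id": 3, "coarse": 0.2, "fine": 0.1},
]

candidates, accepted = cascade(
    regions,
    fast_identifier=lambda r: r["coarse"] >= 0.5,      # assumed cut-off
    accurate_identifier=lambda r: r["fine"] >= 0.5,    # assumed cut-off
)

candidate_ids = [r["id"] for r in candidates]
accepted_ids = [r["id"] for r in accepted]
```

In this sketch the second identifier evaluates only two of the four regions, which is the source of the speed gain in such an arrangement.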
However, many identifiers that are usable as the above-described first or second identifier generate a multi-valued output, referred to as “certainty”, as an identification result. For example, Japanese Patent Application Laid-Open No. 2001-309225 discusses a conventional method that includes binarizing the multi-valued “certainty” output with a threshold and determining the presence of any pattern.
Two or more identifiers (discriminant functions) are commonly used to classify input information into a plurality of groups and to identify the group (identifier) having the highest output value. For example, an identifier referred to as “Perceptron” selects the linear function that maximizes a linear sum of input information and obtains the classification corresponding to the selected linear function as an identification result. As discussed in “Rotation Invariant Neural Network-Based Face Detection” by Henry A. Rowley, Shumeet Baluja, and Takeo Kanade, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1998, a unique classification can be obtained by calculating a linear sum using a weighting factor corresponding to the output value of each identifier.
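The selection of the linear function having the largest linear sum can be sketched as follows. The class labels, weights, biases, and input vector are hypothetical values chosen for illustration, and the sketch shows only the selection step, not any training procedure.

```python
def linear_sum(weights, bias, x):
    """Linear sum of the input information for one discriminant function."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(discriminants, x):
    """Return the label whose linear function gives the largest output,
    together with every function's output."""
    scores = {
        label: linear_sum(weights, bias, x)
        for label, (weights, bias) in discriminants.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical discriminant functions: label -> (weights, bias).
discriminants = {
    "face": ([1.0, 0.5], -0.2),
    "non_face": ([-0.5, 0.25], 0.25),
}

label, scores = classify(discriminants, [0.75, 0.5])
```

Because exactly one label is returned, this kind of selection always yields a single classification for a given input.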
According to a conventional method using a plurality of identifiers and binarizing the certainty (output) of each identifier with a fixed threshold, selecting an appropriate threshold may be difficult because the output range of each identifier varies depending on the acquisition conditions of the input information. For example, when the input information is image data and the shooting conditions of the image are inappropriate, each identifier may not be able to detect a face and may generate a weak output.
If the threshold used in such conditions is excessively high, the identifier cannot identify a face contained in an image as a candidate because the output value is weak. Conversely, if the threshold is excessively low, the second identifier detects so many candidates that the processing speed may decrease. Namely, using a fixed threshold may fail to identify a face in an image under various shooting conditions.
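The difficulty with a fixed threshold can be illustrated with a short sketch. The certainty values and the two thresholds below are hypothetical; index 2 is assumed to be the true face in both images.

```python
def binarize(certainties, threshold):
    """Keep the indices of candidates whose certainty reaches the
    fixed threshold."""
    return [i for i, c in enumerate(certainties) if c >= threshold]


# Hypothetical certainties from the same identifier for the same scene
# under good and poor shooting conditions.
good_conditions = [0.10, 0.20, 0.90, 0.15, 0.30]
poor_conditions = [0.05, 0.10, 0.35, 0.08, 0.12]

HIGH_THRESHOLD = 0.5   # assumed value
LOW_THRESHOLD = 0.07   # assumed value

good_high = binarize(good_conditions, HIGH_THRESHOLD)  # face is found
poor_high = binarize(poor_conditions, HIGH_THRESHOLD)  # face is missed
poor_low = binarize(poor_conditions, LOW_THRESHOLD)    # face plus false candidates
good_low = binarize(good_conditions, LOW_THRESHOLD)    # every region passes
```

With the high threshold the weak output under poor conditions is discarded, while with the low threshold almost every region is passed on as a candidate, so no single fixed value suits both images in this example.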
According to a method for selecting only one candidate (the one having the maximum output value) among the outputs of a plurality of identifiers, a first identifier having lower identification accuracy may not be able to detect a correct candidate. If the identification accuracy of the first identifier is low, the output of the identifier corresponding to a correct candidate may not always take the maximum value. Furthermore, the method cannot identify two or more correct candidates simultaneously, because the method leaves only one candidate.
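Both limitations of maximum-only selection can be seen in a small sketch. The certainty values and the set of correct candidates are hypothetical and chosen so that an incorrect candidate happens to receive the largest output.

```python
def select_max(certainties):
    """Return the index of the single candidate with the maximum output."""
    return max(range(len(certainties)), key=certainties.__getitem__)


# Hypothetical outputs of a low-accuracy first identifier for four
# candidates. Candidates 0 and 1 are assumed to be the correct ones.
certainties = [0.40, 0.55, 0.62, 0.20]
correct_candidates = {0, 1}

chosen = select_max(certainties)

# The incorrect candidate 2 has the maximum output, so no correct
# candidate survives, and at most one could have survived in any case.
kept_correct = {chosen} & correct_candidates
```

Even if candidate 1 had received the maximum output, candidate 0 would still have been discarded, since only one index is ever returned.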
Moreover, a complex identifier includes a plurality of identifiers. If the complex identifier employs serial and parallel arrangements to connect the identifiers, it is desirable to constitute a plurality of groups each including serially connected identifiers and to connect the groups in parallel to reduce the required memory capacity. However, according to such an arrangement, a rear-stage identifier constituting a serial identifier group tends to perform useless processing. The processing time tends to be longer and the identification tends to end unsuccessfully, because each serial identifier group performs processing independently. In fact, execution of a rear-stage identifier in one identifier group is dependent on the execution of a rear-stage identifier in another identifier group.