A statistical pattern identification method is generally known as a method for identifying an object from an image.
The statistical pattern identification method is a method with which a relation between input and output is estimated based on several input-output pairs. More particularly, the statistical pattern identification method is a method in which a desirable input-output relation is learnt from a large number of input data and output data and this relation is used for the identification.
Therefore, the statistical pattern identification method is realized mainly by a learning process and an identification process.
The learning process is a process for acquiring a learning parameter used for the identification process by using sample data for learning and teacher data thereof. The teacher data is data indicating a correct identification result (output) to the sample data (input).
Specifically, the teacher data is either an arithmetic expression or the learning parameter for deriving an output value from an input value (for example, the input value and the output value corresponding to the input value), or both of them.
In other words, as mentioned above, the learning process is a calculation process whose purpose is to acquire the learning parameter for calculating the data to be output for an arbitrary input supplied at the identification process stage.
For example, in the case of a learning process using a multi-layer perceptron, which is one kind of neural network, a “connection weight” between the respective nodes is acquired as the learning parameter.
On the other hand, the identification process is a process for calculating an output (an identification result) for arbitrary input data (the identification target) by using the learning parameter.
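The two processes can be illustrated with a minimal sketch. It assumes a single-layer perceptron rather than the multi-layer perceptron mentioned above, and the function names `learn` and `identify`, the learning rate, and the toy sample data are hypothetical choices made only for illustration.

```python
def identify(x, params):
    """Identification process: compute an output (0 or 1) for input x
    by using the learning parameter acquired in the learning process."""
    weights, bias = params
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0


def learn(samples, teacher, epochs=20, lr=1.0):
    """Learning process: acquire connection weights and a bias (the
    learning parameter) from sample data and its teacher data."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, teacher):
            # Perceptron rule: move the weights toward the teacher output.
            err = t - identify(x, (weights, bias))
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias


# Toy sample data (inputs) and teacher data (correct outputs): logical AND.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
teacher = [0, 0, 0, 1]

params = learn(samples, teacher)
results = [identify(x, params) for x in samples]
print(results)
```

Only `learn` uses the teacher data. `identify` needs nothing but the acquired parameter, which mirrors the separation between the learning process and the identification process.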
Generally, in order to improve the accuracy of the identification process, a complicated feature extraction process is performed on each of a large number of learning target patterns. For example, when character recognition is performed, the slope, width, curvature, number of loops, and the like of the lines of a certain character corresponding to the learning target pattern are extracted as feature amounts. In other words, the feature extraction process is a process for creating another pattern from an original pattern.
Here, a general feature extraction process will be described with reference to FIG. 14.
FIG. 14 is a block diagram showing a general configuration for performing the identification process based on a statistical pattern identification method.
First, as shown in FIG. 14, after an arbitrary identification target pattern is inputted, pre-processing means A1 perform a preliminary process (noise elimination and normalization) which allows the subsequent processes to be performed easily.
Next, feature extraction means A2 extract feature amounts (numerical values and symbols) which provide pattern-specific behavior from the identification target pattern on which the preliminary process has been performed.
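As a rough illustration of such a preliminary process, the following sketch applies a 3-point median filter for noise elimination and min-max scaling for normalization to a one-dimensional row of luminance values. Both techniques, the function names, and the sample values are assumptions chosen for illustration; the concrete processing of the pre-processing means A1 is not specified here.

```python
from statistics import median


def eliminate_noise(values):
    """Noise elimination (assumed): 3-point median filter.
    The two end points are kept unchanged."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = median(values[i - 1:i + 2])
    return out


def normalize(values):
    """Normalization (assumed): min-max scaling into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical row of luminance values containing one noise spike (200).
raw = [10, 12, 200, 14, 16, 18]
clean = eliminate_noise(raw)
normalized = normalize(clean)
print(clean)
print(normalized)
```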
For example, when d feature amounts are extracted, the feature amounts can be represented by a feature vector as expressed by the following equation.
X = (x1, x2, . . . , xd)
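The construction of a feature vector can be sketched as follows, assuming a small binary pattern and d = 3 feature amounts (bounding-box width, bounding-box height, and the ratio of foreground pixels inside the bounding box). These three feature amounts and the name `extract_features` are hypothetical and simpler than the features used in practice.

```python
def extract_features(pattern):
    """Return a feature vector X = (x1, x2, x3) for a binary pattern
    given as a list of rows, where 1 marks a foreground pixel."""
    rows = [r for r, line in enumerate(pattern) if any(line)]
    cols = [c for c in range(len(pattern[0]))
            if any(line[c] for line in pattern)]
    height = rows[-1] - rows[0] + 1 if rows else 0
    width = cols[-1] - cols[0] + 1 if cols else 0
    ink = sum(sum(line) for line in pattern)
    density = ink / (height * width) if height and width else 0.0
    return (float(width), float(height), density)


# Hypothetical 5x5 pattern resembling a small loop.
pattern = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

X = extract_features(pattern)
print(X)
```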
Identification calculation means A3 receive the feature amount extracted by the feature extraction means A2 and determine the “classification/category/class” of the identification target.
Specifically, the identification calculation means A3 perform a calculation to determine whether or not the extracted feature amount is a specific target based on a calculation method specified by the learning parameter stored in a dictionary storage unit A4 in advance.
For example, the identification calculation means A3 determine that the extracted feature amount is the specific target if the calculation result is “1” and that it is not the specific target if the calculation result is “0”. Further, the identification calculation means A3 can determine whether or not the extracted feature amount is the specific target based on whether or not the calculation result is lower than a predetermined threshold value.
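The threshold-based determination can be sketched as follows. The sketch assumes that the calculation is a weighted sum of the feature amounts and that a result not lower than the threshold value indicates the specific target; the contents of `dictionary` (standing in for the dictionary storage unit A4) and all numerical values are hypothetical.

```python
# Hypothetical learning parameter stored in advance.
dictionary = {"weights": (0.5, 0.5, 1.0), "bias": -3.0, "threshold": 0.0}


def identification_calculation(x, dictionary):
    """Return 1 if the feature vector x is judged to be the specific
    target and 0 otherwise."""
    score = sum(w * xi for w, xi in zip(dictionary["weights"], x))
    score += dictionary["bias"]
    # Assumption: a result lower than the threshold means "not the target".
    return 0 if score < dictionary["threshold"] else 1


is_target = identification_calculation((3.0, 3.0, 0.5), dictionary)
is_not_target = identification_calculation((1.0, 1.0, 0.5), dictionary)
print(is_target, is_not_target)
```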
Here, in the related technologies described below, the number of dimensions d of the feature vector X has been required to be equal to or greater than a predetermined value in order to maintain high identification accuracy. Various other methods have also been used.
For example, in the method proposed by non-patent document 1, a rectangle feature is extracted from the target pattern.
In the method proposed by patent document 1, a directional pattern indicating a distribution of a directional component on a character pattern is created. A vertical direction component, a horizontal direction component, and a diagonal direction component based on this directional pattern are extracted as a directional feature pattern for character recognition. In other words, in this method, the read character is reproduced by combining these directional components.
In the technology proposed by patent document 2, for each of a plurality of measuring points on an image, a narrow, fixed-shape near region for measurement is provided on both sides of a search line passing through the measuring point. The direction of the luminance gradient vector of the image is measured at a plurality of near points in this region. A degree of concentration is calculated at each near point from the difference between the direction of the vector and the direction of the search line, and a degree of line concentration for the measuring point is calculated from all the degrees of concentration. When the degree of line concentration becomes maximal, it is determined that line information exists along the direction of the search line. In this technology, the basic additional values that can be used at each measuring point are calculated in advance based on the degree of concentration. When the direction of the vector is measured, one of the basic additional values is selected and added for each direction of the search line, whereby the degree of line concentration at each measuring point is calculated; when this value becomes maximal, line information is expected to exist.
In the technology proposed by patent document 3, with respect to the feature amount of a certain object, by replacing the feature amount component corresponding to a background region with another value, a plurality of feature amount data whose backgrounds are different are generated from one object image and the identification parameter is learned.
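The idea of generating several feature amount data from one object image can be sketched as follows. This is only an illustrative assumption, not the procedure of patent document 3: the background components are marked by a hypothetical boolean mask and are replaced with random values in the range [0, 1).

```python
import random


def generate_background_variants(features, background_mask, n_variants, rng):
    """Create n_variants copies of a feature vector in which every
    component marked as background is replaced with another value."""
    variants = []
    for _ in range(n_variants):
        variant = [rng.random() if is_background else value
                   for value, is_background in zip(features, background_mask)]
        variants.append(variant)
    return variants


# Hypothetical feature amounts; components 1 and 3 belong to the background.
features = [0.9, 0.1, 0.8, 0.2]
background_mask = [False, True, False, True]

variants = generate_background_variants(
    features, background_mask, 3, random.Random(0))
for v in variants:
    print(v)
```

Each variant keeps the object components unchanged, so all of them can be given the same teacher data when the identification parameter is learned.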