1. Technical Field
The invention relates to a pattern recognition device or classifier. Image processing systems often contain pattern recognition devices (classifiers).
2. Description of the Prior Art
Pattern recognition methods are particularly important in automatic control engineering and in machine text processing, for instance in the optical character recognition (OCR) readers of automatic letter distribution systems or in the analysis of forms. For example, the text characters on an envelope can be located, parameterized, and classified by the system. Such a system may also have the capability of sorting the mail based on the results. There is therefore a significant financial incentive for improved classification techniques.
Recognition systems making use of pattern recognition devices are known. The key point is that a “feature vector” is formed to represent whatever is to be “classified” or recognized as associated with one of a plurality of output classes by the pattern recognition device. The system then takes whatever subsequent action is implied by the results. In postal processing systems the subsequent action is typically a revenue computation or mail sortation.
Using the example of a mail-piece indicia recognition system, the process begins with the image capture via a camera. The image is then preprocessed to locate the stamp(s), remove any rotation, and down-scale the identified section(s) of the image. Feature extraction converts each sub-image candidate into a vector of numerical measurements. Thus, the feature vector represents the image in a compact form. This vector is classified to produce a stamp ID (output classification, or class ID) using a pattern recognition device. Finally, the stamp ID is post-processed into the revenue present on the mail piece, and the system can use this to tally the total revenue processed in a given run (or day).
A preprocessing stage operates on the full image to enhance or change the image representation, and produce image segmentation. Image representation enhancement often includes binarization and filtering. Image segmentation identifies each candidate object within the larger image for subsequent recognition analysis. For example, to recognize the stamps on an envelope, the stamps are located and then recognized one at a time. Similarly, the characters of the address are recognized one at a time.
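The binarization step mentioned above can be illustrated with a minimal sketch. It assumes an 8-bit grayscale image stored as a list of rows and a fixed global threshold; the function name `binarize` and the default threshold of 128 are illustrative assumptions, not taken from any particular prior art system.

```python
def binarize(image, threshold=128):
    """Map a grayscale image (rows of 0-255 values) to a binary image.

    Assumption: dark pixels are foreground (ink), so values below the
    threshold become 1 and all other values become 0.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


# Hypothetical 2x2 grayscale patch.
patch = [[0, 200],
         [127, 128]]
binary_patch = binarize(patch)
```

A practical system would likely choose the threshold adaptively; the fixed value here only keeps the example short.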
Feature extraction is performed on each candidate object to convert it from an image segment to a vector that represents that image segment. The vector is formed from a sequence of measurements performed on the image segment. Many feature types exist and are selected based on the characteristics of the recognition problem.
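As one illustration of converting an image segment into a vector, the sketch below computes zone-density features: the segment is divided into a grid and the fraction of foreground pixels in each cell becomes one vector element. This feature type is chosen only as an example; the function name `zone_density_features` and the grid size are assumptions and are not prescribed by the text.

```python
def zone_density_features(segment, rows=2, cols=2):
    """Convert a binary image segment into a feature vector.

    Each element is the fraction of foreground (1) pixels in one cell
    of a rows x cols grid laid over the segment, in row-major order.
    """
    height = len(segment)
    width = len(segment[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            # Integer cell boundaries; cells tile the segment without overlap.
            r0, r1 = r * height // rows, (r + 1) * height // rows
            c0, c1 = c * width // cols, (c + 1) * width // cols
            cell = [segment[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            features.append(sum(cell) / len(cell) if cell else 0.0)
    return features


# Hypothetical 4x4 binary segment.
segment = [[1, 1, 0, 0],
           [1, 1, 0, 0],
           [0, 0, 1, 0],
           [0, 0, 0, 0]]
vector = zone_density_features(segment)
```

The 16-pixel segment is reduced to four measurements, which shows how the feature vector represents the image in a compact form.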
A classifier relates the feature vector to the most likely output class, and determines a confidence value that the actual image is a member of the selected class. Typical systems contain a statistical or neural network classifier. These techniques convert the feature vector input to a recognition result and an associated confidence value. The confidence value provides an external ability to assess the correctness of the classification. For example, a classifier may output a value between zero and one with one representing maximum certainty.
Several factors have large effects on the type of classifier design selected. One factor is the ‘dimensionality’ of the device. This is simply related to the number of elements in the feature vector, and the number of output classes. The number of classes directly ties to the application. For text-recognition there is the alphabet with uppercase and lowercase characters and some combinations (1 with 1, etc), typically resulting in fifty-six to seventy classes. For stamp recognition there are thousands of possible stamps but only a few are popular. One stamp-recognition project requires recognition of 160 stamps. A recent presort-label recognition device required five classes.
Another major factor in classifier selection is the trade-off between recognition performance and the quality of the confidence output. Techniques that perform extremely well at the recognition task, such as Bayesian distance measurements and standard backpropagation neural networks, usually do not produce very meaningful output confidence values. Techniques that produce good confidence measurements, such as radial basis functions, often struggle to meet the required recognition performance or introduce too many errors.
FIG. 1 shows a general configuration of a classifier 1 used by the most common techniques such as Bayes, radial basis function (RBF), and standard backpropagation neural networks. In this configuration, a discriminant function (e.g. 2A) is associated with each possible output class. Each discriminant function 2A–2N converts a feature vector 3 to a single measurement. A decision stage 4 compares the outputs 5A–5N of all of the discriminant functions to determine the strongest output (e.g. 5B). The index-number 6 of this strongest output corresponds to the output class, while the value of this output corresponds to the confidence 7 that the classification is correct.
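The configuration described for FIG. 1 can be sketched as follows: one discriminant function per output class, and a decision stage that selects the strongest output. The Gaussian radial basis form, the `width` parameter, and the class centers are illustrative assumptions; any function that maps a feature vector to a single measurement could take the place of the discriminant.

```python
import math
from typing import Callable, List, Sequence, Tuple

Discriminant = Callable[[Sequence[float]], float]


def make_rbf_discriminant(center: Sequence[float], width: float = 1.0) -> Discriminant:
    """Build one discriminant function (assumed Gaussian RBF form)."""
    def discriminant(x: Sequence[float]) -> float:
        squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        return math.exp(-squared_distance / (2.0 * width ** 2))
    return discriminant


def classify(feature_vector: Sequence[float],
             discriminants: List[Discriminant]) -> Tuple[int, float]:
    """Decision stage: return (class index, confidence) of the strongest output."""
    outputs = [d(feature_vector) for d in discriminants]
    index = max(range(len(outputs)), key=outputs.__getitem__)
    return index, outputs[index]


# Hypothetical class centers for a three-class problem.
discriminants = [make_rbf_discriminant(c) for c in ([0.0, 0.0], [1.0, 1.0], [5.0, 5.0])]
class_index, confidence = classify([1.0, 1.0], discriminants)
```

Here the index of the strongest output plays the role of the index-number 6 and its value plays the role of the confidence 7.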
There are many possible forms of discriminant functions, and the needed training data depends on the selected base discriminant function. Within each discriminant function 2A–2N are parameters that are computed prior to runtime operation in a training mode. In the training mode, the internal parameters are computed from a “training set” of feature vectors. To compute the training data, numerous representative image samples are needed for each output class. The image samples are converted to vector samples for training by simulating the front end of the system. The training data is simply a set of statistics extracted from these sample vectors.
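As a minimal sketch of extracting statistics from the sample vectors, the code below computes a per-class mean and variance for each feature element. The choice of mean and variance is an assumption made for illustration; the statistics actually required depend on the selected discriminant function, and the name `train_statistics` is hypothetical.

```python
from collections import defaultdict


def train_statistics(samples):
    """Compute per-class mean and (population) variance of feature vectors.

    samples: iterable of (class_id, feature_vector) pairs.
    Returns a dict mapping class_id -> (mean_vector, variance_vector).
    """
    grouped = defaultdict(list)
    for class_id, vector in samples:
        grouped[class_id].append(vector)

    stats = {}
    for class_id, vectors in grouped.items():
        n = len(vectors)
        dim = len(vectors[0])
        mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
        variance = [sum((v[i] - mean[i]) ** 2 for v in vectors) / n
                    for i in range(dim)]
        stats[class_id] = (mean, variance)
    return stats


# Hypothetical training set: two samples of class 0, one of class 1.
training_set = [(0, [1.0, 2.0]), (0, [3.0, 4.0]), (1, [10.0, 10.0])]
stats = train_statistics(training_set)
```

In a real system each class would need many representative samples; a single sample, as for class 1 here, yields a zero variance that is of little use to a discriminant function.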
Prior art systems exist that yield an optimum classifier and a somewhat useful confidence measurement. However, a computer-based implementation faces a trade-off: computing the full equation is processing-intensive, while reducing the equation requires sacrificing the validity of either the classification or the associated output confidence value. This patent directly addresses this trade-off.