1. Technical Field
The invention relates to a method of identifying and correcting for output classes with a multi-modal distribution across feature space in a pattern recognition system. Image processing systems often contain pattern recognition devices (classifiers).
2. Description of the Prior Art
Pattern recognition systems, loosely defined, are systems capable of distinguishing between various classes of real world stimuli according to their divergent characteristics. A number of applications require such systems, which can process unrefined data without significant human intervention. By way of example, a pattern recognition system may attempt to classify individual letters to reduce a handwritten document to electronic text. Alternatively, the system may classify spoken utterances to allow verbal commands to be received at a computer console. In order to classify real-world stimuli, however, it is necessary to train the classifier to discriminate between classes by exposing it to a number of sample patterns.
A typical prior art classifier is trained over a plurality of output classes using a set of training samples. The training samples are processed, data relating to features of interest are extracted, and training parameters are derived from this feature data. When the system receives an input associated with one of a plurality of classes, it analyzes the relationship of that input to each class via a classification technique based upon these training parameters. From this analysis, the system produces an output class and an associated confidence value.
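By way of illustration only, the train-then-classify flow described above can be sketched as a minimal nearest-mean classifier. The function names `train` and `classify`, the two-dimensional feature vectors, and the softmax-style confidence value are all assumptions made for this sketch; the text does not specify a particular classification technique or confidence measure.

```python
import math

def train(samples_by_class):
    """Derive one mean feature vector per class (the 'training parameters')."""
    params = {}
    for label, samples in samples_by_class.items():
        dims = len(samples[0])
        params[label] = [
            sum(s[d] for s in samples) / len(samples) for d in range(dims)
        ]
    return params

def classify(params, features):
    """Return (output class, confidence value) for one feature vector."""
    # Euclidean distance from the input to each class mean.
    distances = {label: math.dist(mean, features) for label, mean in params.items()}
    # Assumed confidence measure: softmax over negative distances,
    # so that closer class means receive a larger share of the total.
    weights = {label: math.exp(-d) for label, d in distances.items()}
    total = sum(weights.values())
    best = min(distances, key=distances.get)
    return best, weights[best] / total

# Hypothetical feature data for two output classes.
training = {
    "A": [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2)],
    "B": [(4.0, 4.0), (4.2, 3.8), (3.8, 4.2)],
}
params = train(training)
label, confidence = classify(params, (1.1, 0.9))
```

In this sketch each class is summarized by a single mean vector, which is adequate only while the samples of a class cluster around one point in feature space.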
The above assumes, however, that each class has a single set of average features to which an input can be compared. Some output classes are not so easily categorized. For example, while the capital letter “A” may constitute a single class in an optical character recognition system, each individual printed font will likely differ on at least some of the selected features. In some cases, this variance will be sufficient to make it impossible to classify characters in the variant fonts correctly using the original class parameters. Accordingly, it would be desirable to identify situations where samples within a single class have multiple sets of varying characteristics and account for this multi-modal distribution in the classification analysis.
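The difficulty can be illustrated with a small sketch, again under stated assumptions. The feature values, the helper names `mean` and `nearest_label`, and the use of one prototype per font are hypothetical. The font of each training sample is assumed to be known in advance, so the sketch shows only the effect of a multi-modal class on a single-mean comparison, not a method of identifying such a class.

```python
import math

def mean(points):
    """Component-wise average of a list of equal-length feature vectors."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))

def nearest_label(prototypes, features):
    """Return the label of the closest prototype in a list of (label, vector)."""
    return min(prototypes, key=lambda item: math.dist(item[1], features))[0]

# Hypothetical features: class "A" appears in two fonts that form
# two separate clusters in feature space.
font_1_a = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0)]
font_2_a = [(10.0, 0.0), (10.5, 0.0), (9.5, 0.0)]
class_b = [(0.0, 3.0), (0.5, 3.0), (-0.5, 3.0)]

# One prototype per class: the mean of "A" falls between its two clusters.
single_mode = [("A", mean(font_1_a + font_2_a)), ("B", mean(class_b))]

# One prototype per cluster: each font of "A" keeps its own mean.
multi_mode = [("A", mean(font_1_a)), ("A", mean(font_2_a)), ("B", mean(class_b))]

sample = (0.2, 0.1)  # a font-1 "A"
single_result = nearest_label(single_mode, sample)
multi_result = nearest_label(multi_mode, sample)
```

With these values the single-mean comparison assigns the sample to class "B", because the averaged "A" prototype lies at (5.0, 0.0), far from either font. The per-cluster comparison assigns the same sample to class "A".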