Pattern recognition systems, loosely defined, are systems capable of distinguishing between various classes of real-world stimuli according to their divergent characteristics. A number of applications require pattern recognition systems, because such systems can process unrefined data without significant human intervention. By way of example, a pattern recognition system may attempt to classify individual letters to reduce a handwritten document to electronic text. Alternatively, the system may classify spoken utterances to allow verbal commands to be received at a computer console.
A typical prior art classifier receives an input pattern associated with one of a plurality of classes and analyzes its relationship to each class via a mathematical classification technique. From this analysis, the system produces an output class and an associated confidence value. Although classification techniques vary, certain features are typical among them. In the course of analysis, the input pattern is compared to known training data for each class. A confidence value is generated for each class, reflecting the likelihood that the input pattern belongs to that class. The class with the highest confidence value is selected, and the selected class and its confidence value are output.
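The flow described above can be illustrated with a minimal sketch. It assumes a nearest-mean comparison and a softmax-style confidence value, neither of which is specified in the text; the function name `classify` and the `training_means` structure are hypothetical and chosen only for illustration.

```python
import math


def classify(pattern, training_means):
    """Return (selected_class, confidence) for a feature vector.

    training_means maps each class label to a mean feature vector. It is a
    hypothetical stand-in for the "known training data" for each class.
    """
    # Compare the input pattern to each class: squared Euclidean distance
    # between the pattern and the class mean (an assumed technique).
    distances = {
        label: sum((x - m) ** 2 for x, m in zip(pattern, mean))
        for label, mean in training_means.items()
    }

    # Convert distances into confidence values that sum to 1.
    # The smallest distance is subtracted first for numerical stability.
    d_min = min(distances.values())
    weights = {label: math.exp(-(d - d_min)) for label, d in distances.items()}
    total = sum(weights.values())
    confidences = {label: w / total for label, w in weights.items()}

    # Select the class with the highest confidence value and output both.
    selected = max(confidences, key=confidences.get)
    return selected, confidences[selected]


if __name__ == "__main__":
    means = {"A": [0.0, 0.0], "B": [3.0, 3.0]}
    print(classify([0.1, 0.2], means))
```

Note that every class is evaluated before a selection is made, so the work grows with the number of classes and the number of features.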
The selection of the classification technique is influenced by many factors. Among the more important of these is the “dimensionality” of the system. Dimensionality relates to both the number of possible classes into which a pattern may be classified and the number of features used to classify the input. The dimensionality of a system is directly tied to the application for which it is designed. In an English text recognition system, it is necessary to have a class for both the uppercase and lowercase forms of each letter as well as a class for each numeral. Thus the system will have around seventy classes. In an image recognition system used for identifying images such as stamps or labels, hundreds of classes may be necessary. Typically, a larger number of features is necessary to deal with a large set of output classes. As a result, the classification process for any system with high dimensionality is computationally intensive.
When dealing with a large number of output classes, it is typical for a number of these classes to be fairly common and a similar number to be fairly rare, relative to the other classes. A system capable of acquiring and utilizing these relative probabilities could eliminate much of the unnecessary processing associated with these rare classes. Such a system could perform high dimensionality classifications considerably more quickly than a comparable prior art classifier.
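One way such relative probabilities might be used is sketched below. This is an illustrative assumption only and is not a description of any particular system: classes are visited from most to least common, and evaluation stops as soon as a class matches closely enough. The function name `classify_by_prior`, the `priors` table, and the `accept_distance` threshold are all hypothetical.

```python
def classify_by_prior(pattern, class_means, priors, accept_distance=1.0):
    """Return (selected_class, distance, classes_evaluated).

    class_means maps each class label to a mean feature vector, and priors
    maps each label to its relative probability. Both are hypothetical
    inputs assumed for this sketch.
    """
    # Visit the most common classes first.
    ordered = sorted(class_means, key=lambda label: priors[label], reverse=True)

    best_label, best_dist, evaluated = None, float("inf"), 0
    for label in ordered:
        evaluated += 1
        # Squared Euclidean distance to the class mean (assumed comparison).
        dist = sum((x - m) ** 2 for x, m in zip(pattern, class_means[label]))
        if dist < best_dist:
            best_label, best_dist = label, dist
        # Stop early on a sufficiently close match, so that the remaining,
        # rarer classes are never evaluated.
        if dist <= accept_distance:
            break
    return best_label, best_dist, evaluated


if __name__ == "__main__":
    means = {"e": [0.0, 0.0], "t": [5.0, 5.0], "q": [9.0, 9.0], "z": [9.0, 0.0]}
    priors = {"e": 0.5, "t": 0.4, "q": 0.06, "z": 0.04}
    print(classify_by_prior([0.2, 0.1], means, priors))
    print(classify_by_prior([9.1, 0.1], means, priors))
```

In this sketch an input belonging to a common class is resolved after a single comparison, while an input belonging to the rarest class still requires all four. The saving therefore depends on how skewed the class probabilities are, and the early exit trades some accuracy for speed when the threshold is set loosely.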