1. Technical Field
The invention relates to a system for locating and incorporating new classes in a pattern recognition device or classifier. Image processing systems often contain pattern recognition devices (classifiers).
2. Description of the Prior Art
Pattern recognition systems, loosely defined, are systems capable of distinguishing between various classes of real-world stimuli according to their divergent characteristics. A number of applications require pattern recognition systems, which allow unrefined data to be processed without significant human intervention. By way of example, a pattern recognition system may attempt to classify individual letters to reduce a handwritten document to electronic text. Alternatively, the system may classify spoken utterances to allow verbal commands to be received at a computer console.
A typical prior art classifier is trained over a plurality of output classes using a set of training samples. The training samples are processed, data relating to features of interest are extracted, and training parameters are derived from this feature data. When the system receives an input associated with one of the plurality of classes, it analyzes the relationship of the input to each class via a classification technique based upon these training parameters. From this analysis, the system produces an output class and an associated confidence value.
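By way of illustration only, the train-then-classify flow described above may be sketched as follows. The sketch assumes a simple nearest-mean technique, in which the training parameters are per-class mean feature vectors and the confidence value is a normalized score; the prior art is not limited to this technique, and the names `train` and `classify` are hypothetical.

```python
import math
from collections import defaultdict

def train(samples):
    """Derive one mean feature vector per class (the assumed training parameters)."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def classify(features, class_means):
    """Return (output_class, confidence) for a single feature vector."""
    distances = {
        label: math.dist(features, mean) for label, mean in class_means.items()
    }
    nearest = min(distances.values())
    # Score each class by its distance relative to the nearest class, so the
    # best class always scores 1.0 and the sum can never underflow to zero.
    scores = {label: math.exp(nearest - d) for label, d in distances.items()}
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())

# Illustrative data: two classes, two features per sample.
training = [([0.0, 0.0], "A"), ([0.2, 0.1], "A"),
            ([5.0, 5.0], "B"), ([5.2, 4.9], "B")]
means = train(training)
label, confidence = classify([0.1, 0.2], means)
```

In this sketch an input lying close to one class mean and far from the others receives a confidence near 1, while an input roughly equidistant from several means receives a lower confidence and could be rejected by comparison against a threshold.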
In some applications, such as optical character recognition, the output classes stay substantially the same. In many others, however, new output classes often appear in the population of samples classified by the system. For these applications, it is frequently necessary to add new output classes to reflect changes in the data population. Similarly, over a period of operation, the classifier will be exposed to various noise patterns. In many cases, these noise patterns will be uncommon, and the system will deal with them appropriately by rejecting them. For many applications, however, particular noise patterns may recur. In such a case, dealing with this recurring pattern as a separate class will allow the system to identify and reject a significant source of unclassifiable patterns.
Absent some method of identifying new classes and recurring noise, the system will not be able to deal with the novel patterns effectively. This problem will continue until the new output class or recurring noise pattern is discovered by an operator. Accordingly, a number of input patterns will be classified incorrectly prior to discovery of the new class. It would be desirable to have a system that is capable of collecting and grouping recurring patterns to allow new output or noise classes to be identified.
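As a purely illustrative sketch of what collecting and grouping rejected patterns might involve, the following example groups the feature vectors of rejected inputs by proximity and keeps only the groups that occur repeatedly. It assumes a greedy distance-threshold grouping chosen for brevity; it is not asserted to be the method of any particular system, and the function name `group_rejected` and the parameters `radius` and `min_size` are hypothetical.

```python
import math

def group_rejected(rejected, radius, min_size):
    """Greedy grouping: a sample joins the first group whose first member
    lies within `radius`; otherwise it starts a new group."""
    groups = []  # each group is a list of feature vectors
    for features in rejected:
        for group in groups:
            if math.dist(features, group[0]) <= radius:
                group.append(features)
                break
        else:
            # No existing group was close enough.
            groups.append([features])
    # Only groups seen at least `min_size` times are kept as candidates.
    return [group for group in groups if len(group) >= min_size]

# Illustrative data: three similar rejected patterns and one isolated pattern.
rejected = [[9.0, 1.0], [9.1, 1.1], [2.0, 7.0], [8.9, 0.9]]
candidates = group_rejected(rejected, radius=0.5, min_size=3)
```

Under these assumptions the three similar patterns form a single candidate group that an operator could review as a possible new output class or noise class, while the isolated pattern is discarded as uncommon.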