As the functionality of neural networks continues to expand, so do their applications. For example, neural networks may be applied to pattern recognition applications such as character recognition, speech recognition, remote sensing, geophysical prospecting and medical analysis, as well as many other applications. For each of these applications, classification algorithms are available based on the different theories and methodologies used in the particular area. When a classifier is applied to a specific problem, any one classifier may achieve varying degrees of success. To improve the accuracy and success of the classification results, different techniques for combining classifiers have been studied. Nevertheless, present techniques for combining classifiers have difficulty obtaining a high classification accuracy within a reasonable amount of time, and an optimal integration of different types of information is therefore desired to achieve high success and efficiency.
To this end, combinations of multiple classifiers have been employed. In early combination techniques, a variety of complementary classifiers were developed and the results of the individual classifiers were analyzed by three basic approaches. One approach uses a majority voting principle where each individual classifier represents a score that may be assigned to one label or divided among several labels. Thereafter, the label receiving the highest total score is taken as the final result. A second approach uses candidate subset combining and re-ranking, where each individual classifier produces a subset of ranked candidate labels, and the labels in the union of all subsets are re-ranked based on their old ranks in each subset. A third approach uses Dempster-Shafer (D-S) theory to combine several individual distance classifiers. However, none of these approaches achieves the desired accuracy and efficiency in obtaining the combined classification result.
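The first two approaches described above can be illustrated with a minimal Python sketch. It is not taken from any particular prior-art system: the function names `vote` and `rerank`, the convention that each classifier's score shares sum to 1, the summing of old ranks as the re-ranking rule, and the penalty rank charged to a label missing from a subset are all assumptions made here for illustration.

```python
from collections import defaultdict


def vote(score_lists):
    """Sum per-label scores across classifiers; return the top label and totals.

    Each classifier contributes a dict {label: share}. A classifier may put
    its whole score on one label or divide it among several labels.
    """
    totals = defaultdict(float)
    for scores in score_lists:
        for label, share in scores.items():
            totals[label] += share
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


def rerank(ranked_subsets):
    """Re-rank the union of candidate subsets by summed old ranks (lower is better).

    A label absent from a subset is charged a rank one past that subset's
    last position (an illustrative assumption, not a rule from the text).
    """
    union = set().union(*ranked_subsets)
    cost = {
        label: sum(
            subset.index(label) if label in subset else len(subset)
            for subset in ranked_subsets
        )
        for label in union
    }
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(union, key=lambda label: (cost[label], label))


if __name__ == "__main__":
    # Four hypothetical classifiers; the second divides its score between A and B.
    ballots = [{"A": 1.0}, {"A": 0.5, "B": 0.5}, {"B": 1.0}, {"B": 1.0}]
    print(vote(ballots))
    # Three hypothetical ranked candidate subsets.
    print(rerank([["A", "B", "C"], ["B", "A"], ["B", "D"]]))
```

In the voting example, label B collects a total score of 2.5 against 1.5 for label A and is taken as the final result. In the re-ranking example, B has the lowest summed rank and therefore heads the combined list.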
Another example of combining multiple classifiers is a multisource connectionist pattern classifier having a Meta-Pi architecture. In the Meta-Pi architecture, a number of source-dependent modules are integrated by a combinational superstructure, referred to as the Meta-Pi combinational superstructure because of the multiplicative functions performed by its output units. FIG. 1 illustrates an example of the Meta-Pi architecture. In this architecture, a signal is input to the module networks Net_1, Net_2, ..., Net_k, which classify the input signal in conjunction with a Meta-Pi network (Meta-Pi Net). Source-dependent module output units {ρ_{k,1}, ρ_{k,2}, ..., ρ_{k,c}} of each of the module networks are linked to global outputs O_1, O_2, ..., O_c via their respective Meta-Pi network output units M_{π1}, M_{π2}, ..., M_{πk}. In the Meta-Pi training procedure, the source-dependent modules, with their output units {ρ_{k,1}, ρ_{k,2}, ..., ρ_{k,c}}, are trained on the desired task before the combinational superstructure is trained. Each source-dependent module processes each training sample and presents a classification output to the Meta-Pi superstructure, which performs a combinational function on the outputs of the source-dependent modules. In other words, at least two different training procedures are performed, which requires a significant amount of time and logistical overhead.
The Meta-Pi superstructure processes the training sample and produces a global classification output by forming a linear combination of the module outputs. By using a Meta-Pi back propagation training process tailored for the Meta-Pi network, the parameters (weights or connections) of the Meta-Pi network are adjusted to optimize the global outputs. Accordingly, the source-dependent classifier modules and the Meta-Pi combinational superstructure are trained separately. Since the overall training time for the Meta-Pi combinational superstructure is proportional to the number of source-dependent modules combined by the superstructure, a significant amount of training time typically results. Also, the Meta-Pi combinational superstructure requires the output states of its source-dependent modules to be included as part of its training. The combinational superstructure therefore cannot be trained independently of its modules, which further increases the training time and complexity. Other systems are known in which the classifier modules and the combinational structures can be trained simultaneously on different processors (a source identification (SID) network, for example), but any reduction in training time that results from the simultaneous training is offset by a decrease in the accuracy of the classification output.
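The linear combination performed by the superstructure can be sketched in Python as follows. This is only an illustration under stated assumptions: each global output is taken to be the sum, over the modules, of a Meta-Pi output-unit weight multiplied by the corresponding module output, and the weights are assumed to be normalized with a softmax so that they are positive and sum to 1. The function names `softmax` and `meta_pi_combine` and the sample values are hypothetical.

```python
import math


def softmax(values):
    """Normalize raw activations into positive weights that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def meta_pi_combine(module_outputs, gate_activations):
    """Form each global output as a weighted sum of the module outputs.

    module_outputs[k][j] is the output of module k for class j.
    gate_activations[k] is the raw activation of the Meta-Pi output unit
    associated with module k (softmax normalization is an assumption).
    """
    weights = softmax(gate_activations)
    n_classes = len(module_outputs[0])
    return [
        sum(w * outputs[j] for w, outputs in zip(weights, module_outputs))
        for j in range(n_classes)
    ]


if __name__ == "__main__":
    # Two hypothetical modules, two classes, equal gating activations.
    module_outputs = [[0.9, 0.1], [0.2, 0.8]]
    print(meta_pi_combine(module_outputs, [0.0, 0.0]))
```

Because the combination multiplies each module's outputs by a gating weight, the superstructure needs the module output states to compute and train its own outputs, which is consistent with the dependency on the modules noted above.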
Accordingly, it is desirable to provide a classification system for efficiently and accurately combining multiple representations of an input pattern. Further along these lines, it is desirable to apply the classification system to character recognition analysis which supports multiple input representations.