Computer systems are ever increasingly employed to classify (e.g., recognize) objects. Classifiers or classification systems are typically employed to perform a variety of applications including identifying and/or classifying objects such as handwriting samples, medical images, faces, fingerprints, signals, automatic control phenomena, natural phenomena, nucleotide sequences and the like.
Classification systems often provide non-probabilistic outputs (e.g., object A belongs to class C). Such outputs are sufficient for some applications, but not others. Probabilistic outputs, also referred to as posterior probabilities, are required in many situations such as when combining classifiers with other knowledge sources, such as language models. The probabilistic output provides a probability for a class based on a given object or data point. Typically, this posterior probability is denoted by P(class|input).
One widely utilized, high-performance non-probabilistic classifier is the K nearest neighbor classifier (KNN). The KNN classifier is especially applicable for systems that operate with a relatively large numbers of classes (e.g., Asian handwriting recognition). As with other classifiers, the KNN classifier generates outputs that do not include probabilities. It would be advantageous to convert those outputs into usable probabilities. However, there fails to exist a suitable mechanism for converting the KNN classifier outputs into useful probabilities. Histogramming can sometimes be employed to generate probabilities from classifier outputs. However, there is often insufficient data to generate such probabilities with histogramming.
Another type of non-probabilistic classifier is a support vector machine (SVM). SVMs are a trainable classifier and are generally considered more accurate at classification than other classification methods in certain applications (e.g., text classification). They can also be more accurate than neural networks in certain applications such as reading handwritten characters for example. SVMs generate outputs that are uncalibrated values and do not include probabilities. Conventional approaches do exist to convert non-probabilistic outputs of SVM classifiers into probabilistic outputs. For example, a logistic function has been employed with SVM classifier outputs to convert the outputs into usable probabilities. However, the training speed of SVMs is often prohibitive for a large number of classes, so the KNN classifier is often preferred. However, conventional approaches fail to provide an adequate mechanism that convert non-probabilistic outputs of KNN classifiers into probabilistic outputs.
Probabilistic outputs can be produced by neural networks. A neural network is a multilayered, hierarchical arrangement of identical processing elements, also referred to as neurons. Each neuron can have one or more inputs and one output. The inputs are typically weighted by a coefficient. The output of a neuron is typically a function of the sum of its weighted inputs and a bias value (e.g., an activation function). The coefficients and the activation function generally determine the response or excitability of the neuron to input signals. In a hierarchical arrangement of neurons in a neural network, the output of a neuron in one layer can be an input to one or more neurons in a next layer. An exemplary neural network can include any suitable number of layers such as, an input layer, an intermediary layer and an output layer.
The utilization of neural networks typically involves two successive steps. First the neural network is initialized and trained on known inputs having known output values referred to as classifications. The network can be initialized by setting the weights and biases of neurons to random values, typically obtained via a Gaussian distribution. The neural network is then trained using a succession of inputs having known outputs referred to as classes. As the training inputs are fed to the neural network, the values of the coefficients (weights) and biases are adjusted utilizing a back-propagation technique such that the output of the neural network of each individual training pattern approaches or matches the known output to mitigate errors. Once trained, the neural network becomes a classifier to classify unknown inputs. By selecting a suitable function to specify cost of an error during training, outputs of a neural network classifier can be made to approximate posterior probabilities.