Pattern classification using neural networks has found practical application in speech, vision, robotics and artificial intelligence, where real-time response with real-world data is required. Currently, optical character recognition (OCR) is an area of interest for the application of artificial intelligence and neural networks.
The use of a neural network usually involves two distinct procedures: initialization and training using data with known outputs, followed by recognition of actual unknown patterns. The network is first initialized by setting the weights of the neural network elements to random values within certain ranges. The network is then successively presented with training patterns, and the output is monitored for deviations from the known desired output. In a typical neural network, every element must execute its function in order to produce an output. The weights of the elements are then adjusted in a direction and by an amount that minimizes the total network error for the particular training pattern. Such a system is commonly known as a back propagation network.
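The two procedures described above (initialization and training, then recognition) can be illustrated with a minimal sketch for a single sigmoid element. The function names, the weight range of -0.5 to 0.5, the learning rate, the epoch count and the logical-OR training set are all assumptions made for illustration and are not taken from the text.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_element(patterns, rate=0.5, epochs=2000, seed=0):
    """Train one sigmoid element on (inputs, target) pairs."""
    rng = random.Random(seed)
    n_inputs = len(patterns[0][0])
    # Initialization: random weights within a fixed range (range is assumed).
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        # Training: present each pattern and compare with the known output.
        for inputs, target in patterns:
            total = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = sigmoid(total)
            error = target - output
            # Adjust each weight in the direction that reduces the
            # squared error for this particular training pattern.
            delta = error * output * (1.0 - output)
            weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
            bias += rate * delta
    return weights, bias

def recognize(weights, bias, inputs):
    """Recognition: apply the trained element to a pattern."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Hypothetical training set: logical OR of two binary inputs.
OR_PATTERNS = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_element(OR_PATTERNS)
```

Calling `recognize(weights, bias, [0, 1])` after training returns a value that can be thresholded at 0.5 to obtain a class decision. A single element of this kind can only learn linearly separable patterns, which is one reason practical networks use additional layers.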
A back propagation network has a hierarchical structure consisting of a minimum of three layers: an input layer, an output layer and a middle, or hidden, layer. The hidden layer usually consists of a number of individual layers, with each layer fully connected to the following layer. The source of an error in the output layer is difficult to determine in a back propagation network. The error could be present because the output element itself has an incorrect weight, because the inputs from the middle layer to the output layer are incorrect, or because both the output layer's weights and the inputs from the middle layer are incorrect. In order to distinguish among these three possible sources of error, many iterations, often hundreds or even thousands, may be required for the network to learn a set of input patterns. This may be acceptable for applications which do not require real-time response, but in applications such as optical character recognition or speech recognition, real-time response is a necessity. In addition, back propagation networks require so many elements that they are not practical for implementation on personal computers.
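The three-layer structure and the way an output error is apportioned between the output weights and the hidden layer can be sketched as follows. This is a simplified illustration with a single hidden layer. The class name `ThreeLayerNet`, the layer sizes, the weight range and the learning rate are hypothetical choices, not details from the text.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ThreeLayerNet:
    """Fully connected input, hidden and output layers (illustrative)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per element; the last entry is a bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, inputs):
        # Every element must execute its function to produce an output.
        x = list(inputs) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)))
                  for row in self.w_hidden]
        h = hidden + [1.0]
        outputs = [sigmoid(sum(w * v for w, v in zip(row, h)))
                   for row in self.w_out]
        return hidden, outputs

    def train_step(self, inputs, targets, rate=0.5):
        """One weight adjustment; returns the error before the adjustment."""
        hidden, outputs = self.forward(inputs)
        # Share of the error attributed to each output element.
        out_delta = [(t - o) * o * (1.0 - o)
                     for t, o in zip(targets, outputs)]
        # Share propagated back to each hidden element, computed with the
        # output weights as they were before this step's update.
        hid_delta = [h * (1.0 - h) *
                     sum(d * row[j] for d, row in zip(out_delta, self.w_out))
                     for j, h in enumerate(hidden)]
        for row, d in zip(self.w_out, out_delta):
            for j, v in enumerate(hidden + [1.0]):
                row[j] += rate * d * v
        for row, d in zip(self.w_hidden, hid_delta):
            for i, v in enumerate(list(inputs) + [1.0]):
                row[i] += rate * d * v
        return sum((t - o) ** 2 for t, o in zip(targets, outputs))

net = ThreeLayerNet(n_in=2, n_hidden=3, n_out=1)
# Present the same hypothetical pattern repeatedly and record the error.
errors = [net.train_step([1.0, 0.0], [1.0]) for _ in range(50)]
```

Each call to `train_step` moves the output only a small distance toward the target, which is consistent with the observation that many iterations are needed before a set of patterns is learned.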
One alternative to the back propagation classifier is the decision tree classifier. Decision tree classifiers are hyperplane classifiers which require little computation and little memory for classification, and they have been used in many pattern classification applications. The size of a decision tree classifier can be easily adjusted to match its complexity to the amount of training data provided. Training procedures gradually but rapidly build, or grow, trees. Thus, the time required for training a tree classifier is significantly reduced compared with that of a typical back propagation network. Decision trees, however, have been known to become very large, and the technology has tended away from trees.
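A minimal sketch of growing and using such a tree is given below. It assumes axis-aligned splits (each test `sample[f] <= t` is a hyperplane perpendicular to one feature axis) chosen greedily by Gini impurity. The splitting criterion, the function names and the sample data are assumptions for illustration, since the text does not specify a training procedure.

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(samples, labels, max_depth=3):
    """Return a leaf label or a (feature, threshold, left, right) node."""
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for f in range(len(samples[0])):
        values = sorted(set(s[f] for s in samples))
        # Candidate thresholds lie midway between adjacent feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for s, y in zip(samples, labels) if s[f] <= t]
            right = [y for s, y in zip(samples, labels) if s[f] > t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # all samples identical, so no split is possible
        return majority(labels)
    _, f, t = best
    left = [(s, y) for s, y in zip(samples, labels) if s[f] <= t]
    right = [(s, y) for s, y in zip(samples, labels) if s[f] > t]
    return (f, t,
            grow([s for s, _ in left], [y for _, y in left], max_depth - 1),
            grow([s for s, _ in right], [y for _, y in right], max_depth - 1))

def classify(tree, sample):
    # Classification costs one comparison per level of the tree.
    # Leaf labels must not be tuples, since tuples mark internal nodes.
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if sample[f] <= t else right
    return tree

# Hypothetical two-feature training data with two classes.
samples = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0],
           [6.0, 5.0], [7.0, 6.5], [6.5, 4.0]]
labels = ["A", "A", "A", "B", "B", "B"]
tree = grow(samples, labels)
```

The `max_depth` parameter is one simple way to match the size of the tree to the amount of training data. Without such a limit, a tree grown on noisy data can become very large.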