The increased use of neural networks in a wide variety of applications has created a substantial need for faster methods of learning the connection weights in such networks. Learning systems derive knowledge from a set of training examples rather than from a set of rules. The knowledge learned from those examples depends, in part, on the learning procedure used.
Learning techniques using neural networks have proven very useful in problem areas in which too many rules bear on the solution of the problem, or in which not all of the rules are known.
Neural networks have been used in many varied applications, such as visual pattern recognition, data compression, optimization, and expert systems, as well as in other situations involving complex problems or knowledge bases.
In visual pattern recognition, a neural network learning system that is properly structured and shown enough examples should be able to distinguish similar objects from each other, such as a tree from a utility pole, with a high degree of accuracy. A rule-based system has difficulty performing this relatively simple task because the person programming the rules cannot describe them in sufficient detail. Similar problems, and potential neural network solutions, exist in speech recognition, particularly in speaker-independent applications.
Learning systems have also been used to solve data encoder problems, which involve "squeezing" data through a small number of hidden neurons in a neural network and replicating the input data at the output of the network. This use of neural network learning systems can be applied, for example, to video compression problems involving vector quantization with a codebook lookup method. The codebook can itself be generated by the learning system.
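The encoder problem can be illustrated with a small sketch. The version below assumes a 4-2-4 network with sigmoid units, one-hot input patterns, and plain gradient descent on squared error; the layer sizes, learning rate, epoch count, and the function name `train_encoder` are all illustrative assumptions rather than details taken from the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_encoder(n_in=4, n_hidden=2, epochs=2000, learning_rate=0.5, seed=0):
    """Train an n_in -> n_hidden -> n_in network to reproduce its input.

    All sizes and hyperparameters are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    # One-hot training patterns; each pattern is also its own target.
    patterns = [[1.0 if i == j else 0.0 for j in range(n_in)] for i in range(n_in)]
    # The last entry of each weight row is the bias weight.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_in)]

    def forward(x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w2]
        return h, y

    def total_error():
        return sum((t - o) ** 2 for p in patterns for t, o in zip(p, forward(p)[1]))

    initial_error = total_error()
    for _ in range(epochs):
        for x in patterns:
            h, y = forward(x)
            # Error signal at each output unit (squared error through a sigmoid).
            d_out = [(o - t) * o * (1.0 - o) for o, t in zip(y, x)]
            # Error signal at each hidden unit, passed back through w2.
            d_hid = [hj * (1.0 - hj) * sum(d_out[k] * w2[k][j] for k in range(n_in))
                     for j, hj in enumerate(h)]
            for k in range(n_in):
                for j, v in enumerate(h + [1.0]):
                    w2[k][j] -= learning_rate * d_out[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w1[j][i] -= learning_rate * d_hid[j] * v
    return initial_error, total_error(), forward


initial_error, final_error, forward = train_encoder()
print("error before training:", initial_error)
print("error after training: ", final_error)
```

The hidden activations returned by `forward` act as a compressed code for each input pattern, which is the sense in which the data is "squeezed" through the hidden layer.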
Neural network learning systems can be useful in finding optimal solutions to routing problems, for example in communication applications involving traffic routing and scheduling, and to frequency allocation problems in cellular communications systems.
Expert knowledge is sometimes easier to encode through training examples than through a list of rules. Experts often do not know all the rules they use to troubleshoot equipment or give advice. A history of such actions or advice can be used to build a database of training examples from which a system can "learn" the expert's knowledge. Furthermore, as knowledge changes, training is often a more efficient way of updating a knowledge base than changing the rules. The knowledge base can be corrected or augmented without a new software release embodying the changed rules. This approach can be useful, for example, in medical diagnostics, common sense reasoning, troubleshooting procedures, installation instructions, and software debugging.
The backpropagation algorithm, derived from the chain rule for partial derivatives, provides a gradient descent learning method in the space of weights and has formed the foundation for many learning systems. Gradient descent can be improved by modifying it into a conjugate gradient method, in which the current descent direction is determined by combining the gradient descent direction with an appropriate factor of the previous descent direction. In general, the use of gradient descent learning algorithms in neural networks depends on the selection of two parameters, commonly known as the learning rate and the momentum factor. Determining these parameters, however, is often cumbersome, and their values are often not adapted to the progress of the learning method.
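The roles of the learning rate and the momentum factor can be seen in a minimal sketch of gradient descent with a momentum term. The error surface, the fixed parameter values, and the names `gradient_descent_momentum` and `example_grad` are assumptions chosen for illustration; the sketch uses hand-picked, non-adaptive parameters of the kind described above.

```python
def gradient_descent_momentum(grad, w0, learning_rate, momentum, steps=200):
    """Minimize a function by gradient descent with a momentum term.

    grad: callable returning the gradient (list of floats) at a weight vector.
    learning_rate and momentum are fixed for the whole run (not adaptive).
    """
    w = list(w0)
    change = [0.0] * len(w)  # previous weight change
    for _ in range(steps):
        g = grad(w)
        # New change = -learning_rate * gradient + momentum * previous change.
        change = [-learning_rate * gi + momentum * ci for gi, ci in zip(g, change)]
        w = [wi + ci for wi, ci in zip(w, change)]
    return w


def example_grad(w):
    """Gradient of the assumed error surface E(w) = (w[0] - 3)^2 + 10 * (w[1] + 1)^2."""
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


w_final = gradient_descent_momentum(example_grad, [0.0, 0.0],
                                    learning_rate=0.02, momentum=0.8)
print(w_final)  # approaches the minimum at [3.0, -1.0]
```

In this sketch both parameters are chosen by hand for one particular error surface. A different surface would generally call for different values, which illustrates why selecting them can be cumbersome.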