1. Field of the Invention
The present invention relates to parallel processing, and, more particularly, to learning methods in devices such as neural networks with hidden units.
2. Description of the Related Art
Attempts to understand the functioning of the human brain have led to various "neural network" models in which large numbers of neurons are interconnected, with the inputs to one neuron being essentially the outputs of many other neurons. These models roughly presume that each neuron exists in one of two states (quiescent and firing), with the neuron's state determined by the states of the connected input neurons: if enough connected input neurons are firing, then the neuron should switch to the firing state. Some models provide for feedback so that the output of a neuron may affect its input, and in other models outputs are only fed forward.
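By way of illustration only, the two-state neuron described above may be sketched as a simple threshold unit. The function name, the numeric connection strengths, and the threshold value are assumptions introduced for this sketch and are not taken from any particular model.

```python
def threshold_unit(input_states, weights, threshold):
    """Return 1 (firing) if the weighted sum of the input states
    reaches the threshold, otherwise 0 (quiescent)."""
    total = sum(w * s for w, s in zip(weights, input_states))
    return 1 if total >= threshold else 0


# Hypothetical example: three connected input neurons with equal
# connection strengths; the unit fires when at least two are firing.
weights = [1.0, 1.0, 1.0]
threshold = 2.0
one_firing = threshold_unit([1, 0, 0], weights, threshold)
two_firing = threshold_unit([1, 1, 0], weights, threshold)
print(one_firing, two_firing)
```

With one input neuron firing the weighted sum is 1.0 and the unit remains quiescent; with two firing the sum is 2.0 and the unit switches to the firing state.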
In a feedforward neural network, one set of neurons is considered input units, another set of neurons is considered output units, and, optionally, other neurons are considered hidden units. In such a network, input patterns stimulate the input units, which in turn stimulate various layers of hidden units, which then stimulate the output units to form an output. The aspect of learning in the human brain can be mimicked in neural networks by adjusting the strength of the connections between neurons, and this has led to various learning methods. For neural networks with hidden units there have been three basic approaches: (1) competitive learning with unsupervised learning rules employed so that useful hidden units develop, although there is no external force to ensure that appropriate hidden units develop; (2) prescription of the hidden unit structure on some a priori grounds; and (3) development of learning procedures capable of leading to a hidden unit structure adequate for the problem considered. See generally D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal Representations by Error Propagation," Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations, pp. 318-362 (MIT Press, 1986), which describes a backward pass learning method called the "generalized delta rule". The generalized delta rule basically has a neural network learn the correct outputs to a set of inputs by comparing actual outputs with correct outputs and modifying the connection strengths by a steepest descent method based on the differences between actual outputs and correct outputs.
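A minimal software sketch of such a learning procedure follows, offered as an illustration rather than as the procedure of the cited reference. It assumes sigmoid units, a single layer of hidden units, a bias input fixed at 1, pattern-by-pattern updates, and an exclusive-OR training set; the function names, the learning rate, the number of hidden units, and the epoch count are likewise assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Forward pass. Each weight row ends with a bias weight that is
    fed by a constant input of 1."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    o = sigmoid(sum(w * v for w, v in zip(w_out, h + [1.0])))
    return h, o


def total_error(patterns, w_hidden, w_out):
    """Half the summed squared difference between correct and actual outputs."""
    return 0.5 * sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in patterns)


def train(patterns, n_hidden=3, rate=0.5, epochs=5000, seed=0):
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in patterns:
            h, o = forward(x, w_hidden, w_out)
            # Backward pass: error signal at the output unit, then the
            # error signals propagated back to the hidden units.
            delta_o = (t - o) * o * (1.0 - o)
            delta_h = [delta_o * w_out[j] * h[j] * (1.0 - h[j])
                       for j in range(n_hidden)]
            # Steepest-descent changes to the connection strengths.
            for j, v in enumerate(h + [1.0]):
                w_out[j] += rate * delta_o * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] += rate * delta_h[j] * v
    return w_hidden, w_out


# Hypothetical training set: exclusive-OR of two inputs.
xor = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
       ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden, w_out = train(xor)
print(total_error(xor, w_hidden, w_out))
```

In this sketch the forward pass and the backward pass traverse the same connection strengths in opposite directions, and the hidden-unit error signals are computed before the output connection strengths are changed.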
However, the generalized delta rule has the problem that a standard implementation with electronic devices does not directly lead to a compact architecture. Rather, the generalized delta rule leads to an architecture that focuses on the forward pass much more than on the backward pass, and it is not clear how the same structural units would be used for both the forward pass computations and the backward pass computations. Indeed, the generalized delta rule is most often viewed as implementable by EEPROMs.