Neural network processors are currently attracting much interest, because the ability to create large processors in a small space using VLSI techniques has suddenly made these analog techniques viable for purposes such as solving complex problems in real time. Problems such as best path location are quickly solved using the analog approach of a neural network, whereas the same problem would be both hardware and time intensive if approached using digital computation facilities in a parallel processing manner.
While a neural network processor does its computations rapidly, it does have one drawback relative to the digital approach. With a digital computer, the problem to be solved is described as a series of precise logical steps to be accomplished. These steps are programmed into a series of computer instructions which are then loaded into the computer or computers which are to perform the calculations. At run-time, the input parameters for the problem to be solved are provided to the computer(s) and the problem instructions are executed to provide an answer to the problem. There is no possible variation; that is, a digital computer operates as a brute force automaton executing the sequence of instructions provided. If anything happens which is not provided for in the instruction sequence, the program fails and no meaningful results are provided.
The neural network, on the other hand, is patterned after the human brain and must be taught by a learning process. A typical prior art neural network can appear in basic form as depicted in FIG. 1 where it is generally indicated as 10. The network 10 comprises a matrix of conductors 12 to which a number of neurons 14 are connected to provide inputs. The conductors 12 are interconnected by a number of synapses 16 defining the general rules of the problem to be solved. The neurons 14 are generally non-linear elements having an input 18 to which an analog signal representing a variable of the problem to be solved can be connected. The outputs 20 of the neurons 14 are connected to the conductors 12 on one side of the synapses 16. The conductors 12 of the other side of the synapses 16 provide the outputs 22 of the network 10 representing the solution (in analog form) to the problem being solved.
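As an illustration only, the signal flow of the single-layer network 10 can be sketched in software. The sketch assumes a hyperbolic tangent as the non-linear element and represents the synapses 16 as a matrix of weights; the names `neuron` and `layer_forward` and the `gain` parameter are hypothetical and do not appear in the figures.

```python
import math


def neuron(x, gain=1.0):
    """Non-linear element standing in for a neuron 14 (tanh is an assumption)."""
    return math.tanh(gain * x)


def layer_forward(inputs, weights, gain=1.0):
    """Pass the inputs 18 through the neurons, then through the synapse matrix.

    weights[j][i] is the assumed weight of the synapse joining the conductor
    driven by neuron i to output conductor j.
    """
    neuron_outputs = [neuron(x, gain) for x in inputs]        # outputs 20
    return [
        sum(w * a for w, a in zip(row, neuron_outputs))       # outputs 22
        for row in weights
    ]


if __name__ == "__main__":
    # Two inputs, three output conductors, arbitrary illustrative weights.
    print(layer_forward([0.2, -0.4], [[1.0, 0.5], [0.0, 1.0], [-1.0, 2.0]]))
```

Each output is simply a weighted sum of the neuron outputs, which is what the matrix of conductors 12 and synapses 16 computes in analog form.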
A complex problem may take the form of a multi-layer neural network such as that generally indicated as 10' in FIG. 2. In such a multi-layer neural network 10', the analog inputs defining the problem are input to a first set of neurons 14. The outputs from the first layer of the network 10' are input to a second set of neurons 14 at the input to the second layer of the network 10', and the outputs representing the solution to the problem are found at the outputs 22 from the second layer of the network 10'.
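The cascading of layers in the network 10' can be sketched in the same way. This is a minimal illustration assuming tanh neurons and weight matrices for the synapses of each layer; the function names `layer` and `two_layer` are hypothetical.

```python
import math


def layer(inputs, weights):
    """One layer: neurons (assumed tanh) followed by a synapse weight matrix."""
    neuron_outputs = [math.tanh(x) for x in inputs]
    return [sum(w * a for w, a in zip(row, neuron_outputs)) for row in weights]


def two_layer(inputs, first_weights, second_weights):
    """Feed the outputs of the first layer into the neurons of the second."""
    first_outputs = layer(inputs, first_weights)
    return layer(first_outputs, second_weights)


if __name__ == "__main__":
    first = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 2 inputs -> 3 conductors
    second = [[0.5, -0.5, 1.0]]                     # 3 conductors -> 1 output
    print(two_layer([0.3, 0.7], first, second))
```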
A neural network (single layer 10 or multi-layer 10') "learns" by experience. A first set of variables for the problem to be solved is input to the network 10, 10' and the outputs 22 representing the solution to the problem for the given parameters are inspected. Based on the "answer" provided, the performance factor (weight) of each of the synapses 16 in the total problem is adjusted slightly, as necessary, in a manner which will tend to move the answer to the problem closer to the correct answer. This process is repeated over and over with a second, third, etc. set of variables until the synapses 16 have all been adjusted to the point where the proper answer to the problem is given for any set of variables which are input. At that point, the network 10, 10' has learned by experience how to solve the particular problem.
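The repeated small adjustment of the synapse weights can be illustrated with a short sketch. The text above does not name a particular update rule; the sketch assumes the simple delta (least-mean-squares) rule on a single-layer network with tanh neurons, and the names `forward`, `train_step`, `train`, and the learning rate `rate` are hypothetical.

```python
import math


def forward(inputs, weights):
    """Return the neuron outputs and the network outputs for one input set."""
    acts = [math.tanh(x) for x in inputs]
    outs = [sum(w * a for w, a in zip(row, acts)) for row in weights]
    return acts, outs


def train_step(inputs, target, weights, rate=0.1):
    """Nudge every weight slightly toward the correct answer (delta rule).

    Returns the squared error measured before the adjustment.
    """
    acts, outs = forward(inputs, weights)
    for j, row in enumerate(weights):
        err = outs[j] - target[j]
        for i in range(len(row)):
            row[i] -= rate * err * acts[i]
    return sum((o - t) ** 2 for o, t in zip(outs, target))


def train(samples, weights, epochs=200, rate=0.1):
    """Present every sample repeatedly; return the error of the last pass."""
    total = 0.0
    for _ in range(epochs):
        total = sum(train_step(x, t, weights, rate) for x, t in samples)
    return total


if __name__ == "__main__":
    samples = [([1.0, 0.0], [0.5]), ([0.0, 1.0], [-0.5])]
    weights = [[0.0, 0.0]]
    print(train(samples, weights))
```

The weights are modified in place, so calling `train` again continues learning from where the previous call stopped.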
One of the current issues in the theory of supervised learning concerns the scaling properties of neural networks. While low-order neural computations are easily handled on sequential or parallel processors, the treatment of high-order problems proves to be intractable. The computational burden involved in implementing supervised learning algorithms, such as back propagation, on networks with large connectivity or large training sets is immense and impractical. Until the development of fully parallel hardware, the treatment of such applications as image recognition or pattern classification proves unwieldy to handle with current algorithms. It is clear, therefore, that a more computationally efficient learning rule is required to deal with such applications.
Current neuromorphic models regard the neuron as a strictly passive non-linear element and the synapse, on the other hand, as the primary source of information processing insofar as "learning" is concerned. In these standard prior art models, information processing is performed by propagating within a network synaptically weighted neuronal contributions in either a feed-forward, feed-backward, or fully recurrent fashion. Information is contained in the synaptic weights, and their rapid evaluation forms the goal of these algorithms. Prior art artificial neural networks take the point of view that the neuron can be modeled by a simple non-linear "wire" type of device. The only prior art implementation of any adjustability to the neurons is depicted in FIG. 3. As depicted therein, the neurons 14 (i.e. implemented as non-linear elements as described above) can be manually adjusted en masse (as indicated by the dashed box 24) by means of a manual input device 26 under the control of a human operator. The neurons 14 each have an adjustable gain which is referred to in the art as the "temperature" of the neuron. Typically, the learning process begins with the temperature (i.e. the gain) of the neurons 14 at a high level. As the learning process proceeds, the operator may periodically and randomly begin to lower the temperature of all the neurons 14 simultaneously by means of the manual input device 26. This results in some measurable and meaningful improvement in the learning process of the network; that is, there is some decrease in learning time when the operator adjusts the temperature of the neurons 14.
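The en masse adjustment of FIG. 3 can be pictured as a single shared gain value that is lowered from time to time while learning proceeds. The sketch below is illustrative only: a fixed geometric schedule stands in for the human operator and the manual input device 26, and the names `lower_temperature` and `run_schedule` and the `factor`, `floor`, and `every` parameters are assumptions, not part of the prior art described.

```python
import math


def neuron(x, gain):
    """Non-linear element whose gain is the 'temperature' described above."""
    return math.tanh(gain * x)


def lower_temperature(gain, factor=0.9, floor=1.0):
    """Reduce the shared gain of all neurons, never going below the floor."""
    return max(floor, gain * factor)


def run_schedule(start_gain=10.0, steps=50, every=5):
    """Return the shared gain in effect before and after each learning step."""
    gain = start_gain
    history = [gain]
    for step in range(1, steps + 1):
        # A learning pass using neuron(x, gain) for every neuron would go here.
        if step % every == 0:
            gain = lower_temperature(gain)
        history.append(gain)
    return history


if __name__ == "__main__":
    print(run_schedule()[-1])
```

Because one value of `gain` is shared by every neuron, the sketch reflects the limitation noted above: the neurons are adjusted together from outside the network rather than individually as part of the learning process.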
Substantial evidence is beginning to emerge and be reported that information processing apparently occurs in biological neural networks (e.g. the human brain) at the neuronal level. If this is true, one can suppose that by providing a dynamically adaptable neuron element for use in artificial neural networks which adapts as part of the learning process along with the synapses, the learning process can be improved and the time therefor decreased significantly.