One type of self-learning neural network is known as a Boltzmann Machine. A Boltzmann Machine type neural network uses an algorithm known as the Boltzmann Algorithm to achieve learning. In the Boltzmann Machine, the synapses are symmetric. This means the connections between neurons run forward and backwards and with equal connection strengths in both directions. Thus, the weight of the synaptic connection between the output of neuron j and an input of neuron i is the same as the weight of the synaptic connection between the output of neuron i and an input of neuron j.
A neuron i which forms part of a Boltzmann machine is illustrated in FIG. 1. The neuron i has four inputs labeled 1, 2, 3 and 4. The input 1 is for a threshold current produced by the threshold current generator 10. (Typically, the threshold current generator is simply an unused neuron in the neural network.) The input 2 is a current w.sub.ji s.sub.j, where w.sub.ji is the weight of the synaptic connection 14 between the output of neuron j (not shown) and an input of the neuron i and s.sub.j is the output state of the neuron j. The weight w.sub.ji is formed by a weighting circuit 12 located in the synaptic connection 14. The input 3 is a current w.sub.ki s.sub.k where w.sub.ki is the weight of a synaptic connection 16 between the output of a neuron k (not shown) and an input of the neuron i and s.sub.k is the output state of the neuron k. The weight w.sub.ki is formed by a weighting circuit 18 located in the synaptic connection 16. In general, the neuron i receives a plurality of weighted input currents from other neurons but only two such inputs, i.e., 2 and 3, are shown in FIG. 1 for purposes of illustration.
The input 4 is a noise input. A noise current is generated by the noise generator circuit 20 and inputted to the neuron i via input 4. The noise input 4 is used for simulated annealing and is discussed in greater detail below.
The neuron i has a voltage output s.sub.i. The output s.sub.i can take on a range of values between two limiting states, "off" ("0") and "on" ("1") (see FIG. 5 for the values the output s.sub.i can take). In general, if the sum of the input currents, including the threshold current, is less than zero, the neuron output s.sub.i is closer to the off state (0 volts in FIG. 5). If the sum of the input currents, including the threshold current, exceeds zero, the neuron output s.sub.i is closer to the on state (5 V in FIG. 5).
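The input-output behavior described above can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 1: the function name `neuron_output`, the `gain` parameter, the tanh-shaped transfer curve, and the convention that the other neurons' states are normalized to the range 0 to 1 are all assumptions made for the example.

```python
import math

V_ON = 5.0  # "on" voltage as described for FIG. 5; used here as an assumed constant


def neuron_output(weighted_inputs, threshold_current=0.0, noise_current=0.0, gain=1.0):
    """Return an output voltage between 0 and V_ON for one neuron.

    weighted_inputs is a list of (weight, state) pairs, one per synaptic
    input, with each state assumed to be normalized to the range 0..1.
    """
    # Sum the weighted input currents, the threshold current and the noise current.
    total = sum(w * s for w, s in weighted_inputs) + threshold_current + noise_current
    # Assumed tanh-like transfer curve, rescaled from (-1, 1) to (0, V_ON).
    return 0.5 * V_ON * (1.0 + math.tanh(gain * total))
```

With this sketch, a net input below zero gives an output below the 2.5 V midpoint (closer to "off"), and a net input above zero gives an output above it (closer to "on").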
As the network is symmetric, the output s.sub.i of neuron i is connected via the synaptic connection 22 to the neuron j. The synaptic connection 22 contains the weighting circuit 24 whose weight w.sub.ij is equal to w.sub.ji. The output s.sub.i of the neuron i is also transmitted via the synaptic connection 26 to the neuron k. The synaptic connection 26 includes the weighting circuit 28 whose weight w.sub.ik equals w.sub.ki. The weights w.sub.ji, w.sub.ij are controlled by the control circuit 30. The control circuit 30 receives the output signals of the neurons i and j, i.e., s.sub.i and s.sub.j, and, in response, outputs a signal to control the weights w.sub.ij and w.sub.ji. The weights w.sub.ki and w.sub.ik are controlled by the control circuit 31. The control circuit 31 receives the outputs s.sub.i and s.sub.k of the neurons i and k and outputs signals to control the weights w.sub.ik and w.sub.ki. In general, there is a control circuit to control the weight of each symmetric synapse in the network.
The control of the synaptic weights takes place as follows. Usually, a Boltzmann Machine type neural network has an input layer of neurons, an output layer of neurons and one or more hidden layers of neurons in between the input and output layers. FIG. 2 schematically illustrates a set of neurons 70 organized into an input layer 72, an output layer 74, and a hidden layer 76. The bi-directional synaptic connection between each pair of neurons is also illustrated in FIG. 2.
The Boltzmann learning algorithm works in two phases. In phase "plus", the neurons in the input and output layers are clamped to a particular pattern that is desired to be learned while the network relaxes through the use of simulated annealing or another technique. In phase "minus", the output neurons are unclamped and the system relaxes while the input neurons are kept clamped. (Note that the neuron i of FIG. 1 includes no clamping circuits; thus it is a neuron in a hidden layer.) The goal of the learning process is to find a set of synaptic weights such that the learned outputs of the "minus" phase match the desired outputs in the "plus" phase as nearly as possible. The probability that two neurons i and j are both "on" in the plus phase, P.sub.ij.sup.+, can be determined by counting the number of times both neurons are activated, averaged across some or all patterns (input-output mappings) in a training set. For each mapping, co-occurrence statistics are also collected for the minus phase to determine P.sub.ij.sup.-. Both sets of statistics are collected by the control circuit of the particular symmetric synapse after annealing. In the preferred implementation, the co-occurrence statistics are collected for one pattern as it is being presented, rather than for the entire training set, so that a weight adjustment occurs after each pattern.
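As a rough software illustration of the co-occurrence statistics just described, the sketch below counts how often two neurons are both "on" across a set of recorded network states. The function name `cooccurrence` and the representation of each state as a list of 0/1 values are assumptions made for the example; in the hardware, the counting is done by the control circuit of each synapse.

```python
def cooccurrence(samples, i, j):
    """Return the fraction of recorded states in which neurons i and j are both on.

    samples is a list of network states, each a list of 0/1 values,
    recorded after the network has settled in one phase (plus or minus).
    """
    if not samples:
        raise ValueError("at least one recorded state is required")
    # Count the states where both neurons are on at the same time.
    both_on = sum(1 for state in samples if state[i] == 1 and state[j] == 1)
    return both_on / len(samples)


# Hypothetical states recorded in one phase for a three-neuron network.
plus_phase_states = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 1]]
p_plus_01 = cooccurrence(plus_phase_states, 0, 1)  # neurons 0 and 1 are both on in 2 of 4 states
```

Running the same function on states recorded in the minus phase would give the corresponding estimate of P.sub.ij.sup.-.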
More generally, after sufficient statistics are obtained by the control circuit, the weights are updated according to the relation

$$\Delta w_{ij} = \eta \left( P_{ij}^{+} - P_{ij}^{-} \right)$$

where $\eta$ scales the size of each weight change.
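The weight update relation can be sketched in software as shown below. The function name `update_weights`, the nested-list representation of the weights and probabilities, and the default value of `eta` are assumptions for illustration only. The sketch applies the same change to both directions of each synapse so that the weights stay symmetric.

```python
def update_weights(weights, p_plus, p_minus, eta=0.1):
    """Return new symmetric weights after one Boltzmann learning step.

    weights, p_plus and p_minus are square nested lists indexed [i][j];
    p_plus and p_minus hold the co-occurrence probabilities of the plus
    and minus phases. The input weights are not modified.
    """
    n = len(weights)
    new_weights = [row[:] for row in weights]
    for i in range(n):
        for j in range(i + 1, n):
            # Weight change = eta * (plus-phase probability - minus-phase probability).
            delta = eta * (p_plus[i][j] - p_minus[i][j])
            new_weights[i][j] += delta
            # Mirror the value so that w_ji stays equal to w_ij.
            new_weights[j][i] = new_weights[i][j]
    return new_weights
```

When the plus-phase and minus-phase probabilities agree for a pair of neurons, the change for that synapse is zero, which corresponds to the learned outputs matching the desired outputs.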
The simulated annealing technique involves perturbing the threshold signals of all neurons in a random fashion while clamping signals are applied to all of the neurons in one or both of the input and output layers of the network. As shown in FIG. 1, the perturbing random signal may be obtained from an electrical noise generator 20 connected to the neuron. Introducing noise adds to the neural network a quantity analogous to thermal energy in a physical system. This "heat" is applied to the network to cause the network to visit all possible states. Then, as the temperature (i.e., noise level) is reduced to some minimum, there is a high probability that the network will settle to its lowest energy state, i.e., a global minimum.
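A minimal software sketch of this annealing procedure is given below, under several assumptions not stated in the text: neuron states are binary (0 or 1), the noise is Gaussian, neurons are updated one at a time in a fixed order, and the energy has the conventional form for a symmetric network. The names `energy`, `anneal`, `noise_schedule` and `clamped` are hypothetical.

```python
import random


def energy(state, weights, thresholds):
    """Conventional energy of a symmetric network (assumed form, lower is better)."""
    n = len(state)
    pair = sum(weights[i][j] * state[i] * state[j]
               for i in range(n) for j in range(i + 1, n))
    bias = sum(thresholds[i] * state[i] for i in range(n))
    return -pair - bias


def anneal(weights, thresholds, noise_schedule, clamped=None, seed=0):
    """Settle the network while stepping the noise level through noise_schedule.

    clamped maps a neuron index to a fixed 0/1 value; clamped neurons are
    never updated. noise_schedule lists noise levels from high to low.
    """
    rng = random.Random(seed)
    clamped = clamped or {}
    n = len(thresholds)
    # Clamped neurons start at their clamped value, the rest start at random.
    state = [clamped.get(i, rng.randint(0, 1)) for i in range(n)]
    for noise_level in noise_schedule:
        for i in range(n):
            if i in clamped:
                continue
            # Net input: weighted inputs plus threshold plus a random perturbation.
            net = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            net += thresholds[i] + rng.gauss(0.0, noise_level)
            state[i] = 1 if net > 0 else 0
    return state
```

For example, with `weights = [[0, 1], [1, 0]]`, `thresholds = [0.5, 0.5]` and `noise_schedule = [2.0, 1.0, 0.5, 0.0]`, the final noise-free sweep leaves both neurons on, which is the lowest-energy state of that small network.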
As an alternative to simulated annealing, a deterministic method known as the Mean Field Approximation (MFA) may be used. According to this method, the slope of a hyperbolic-tangent-like transfer function (See FIG. 5) of an amplifier used to implement the neuron is varied from zero to a maximum.
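The deterministic alternative can be sketched in a similar way. In the example below, each neuron holds a continuous state between 0 and 1, and the slope (`gain`) of a tanh-shaped transfer function is stepped from zero upward. The function name `mean_field_settle`, the `gains` schedule, the number of sweeps per gain step, and the normalization of states to the range 0 to 1 are assumptions made for illustration.

```python
import math


def mean_field_settle(weights, thresholds, gains, sweeps_per_gain=10):
    """Settle a symmetric network deterministically by raising the gain.

    gains lists the transfer-function slopes to step through, from zero
    up to a maximum. Returns the continuous neuron states in 0..1.
    """
    n = len(thresholds)
    state = [0.5] * n  # with zero gain, every neuron sits at the midpoint
    for gain in gains:
        for _ in range(sweeps_per_gain):
            for i in range(n):
                # Net input from the other neurons plus the threshold term.
                net = thresholds[i] + sum(
                    weights[i][j] * state[j] for j in range(n) if j != i
                )
                # Assumed tanh-like transfer curve, rescaled to the range 0..1.
                state[i] = 0.5 * (1.0 + math.tanh(gain * net))
    return state
```

At low gain the states stay near the midpoint, and as the gain rises they are pushed toward "off" or "on", which plays a role similar to lowering the noise level in simulated annealing.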
It is an object of the invention to provide a neuron for use in a self-learning neural network such as a Boltzmann Machine. It is a further object to provide a neuron for use in a self-learning neural network which can be used with the simulated annealing or Mean Field Approximation method for settling the network. It is a further object to provide a neuron which can cascade with many other neurons on a single VLSI chip to form a complete neural network on the chip, and also with other such similar chips, to form a multi-chip system.