The invention relates to an associative neuron used in artificial neural networks.
In artificial neural networks, neurons derived from the McCulloch-Pitts (1943) neuron, such as different versions of the perceptron (Frank Rosenblatt 1957), are used. Neural networks are discussed, for example, in the article "Artificial Neural Networks: A Tutorial" by Anil K. Jain, Jianchang Mao and K. M. Mohiuddin in IEEE Computer, March 1996, pp. 31 to 44.
In FIG. 1, signals X1 to Xn are inputs of an artificial neuron and Y is its output signal. The values of the input signals X1 to Xn can be continuously changing (analog) or binary quantities, and the output signal Y can usually be given both positive and negative values. W1 to Wn are weighting coefficients, i.e. synaptic weights, which can also be either positive or negative. In some cases, only positive signal values and/or weighting coefficients are used. Synapses 111 to 11n of the neuron weight the corresponding input signals by the weighting coefficients W1 to Wn. A summing circuit 12 calculates a weighted sum U. The sum U is supplied to a thresholding function circuit 13, whose output signal is V. The threshold function can vary, but usually a sigmoid or a piecewise linear function is used, whereby the output signal is given continuous values. In a conventional neuron, the output signal V of the thresholding function circuit 13 is simultaneously the output signal Y of the whole neuron.
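The signal path of FIG. 1 described above can be sketched in Python as follows. This is only an illustrative sketch: the function names, the logistic form of the sigmoid and the clamping range of the piecewise linear function are assumptions made for the example and are not taken from the text.

```python
import math

def weighted_sum(inputs, weights):
    """Summing circuit 12: weighted sum U of the inputs X1..Xn."""
    return sum(w * x for w, x in zip(weights, inputs))

def sigmoid(u):
    """One possible threshold function (logistic curve, assumed here)."""
    return 1.0 / (1.0 + math.exp(-u))

def piecewise_linear(u, low=-1.0, high=1.0):
    """Another possible threshold function: linear, clamped to [low, high]."""
    return max(low, min(high, u))

def conventional_neuron(inputs, weights, threshold_fn=sigmoid):
    """Conventional neuron: the threshold output V is also the output Y."""
    u = weighted_sum(inputs, weights)   # U
    v = threshold_fn(u)                 # V
    return v                            # Y = V

if __name__ == "__main__":
    print(conventional_neuron([1.0, 0.0], [0.5, -0.5], piecewise_linear))
```

The threshold function is passed as a parameter because the text notes that it can vary from one neuron design to another.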
When neurons of this kind are used in artificial neural networks, the network must be trained, i.e. appropriate values must be found for the weighting coefficients W1 to Wn. Different algorithms have been developed for the purpose. A neural network that is capable of storing repeatedly supplied information by associating different signals, for example a certain input with a certain situation, is called an associative neural network. In associative neurons, different versions of what is known as the Hebb rule are often used. According to the Hebb rule, a weighting coefficient is increased whenever the input corresponding to that weighting coefficient is active and the output of the neuron should be active. The changing of the weighting coefficients according to the algorithms is called the training of the neural network.
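One simple version of the Hebb rule described above might look like the following sketch. The learning rate, the activity threshold and the function name are hypothetical values chosen for illustration; the text does not specify how much a weighting coefficient is increased or how activity is detected.

```python
def hebb_update(weights, inputs, output_should_be_active,
                rate=0.1, active_level=0.5):
    """Return new weights after one Hebbian training step.

    A weight is increased only when its own input is active AND the
    neuron's output should be active. `rate` and `active_level` are
    assumed values, not taken from the text.
    """
    if not output_should_be_active:
        return list(weights)  # no change when the output should be inactive
    return [w + rate if x > active_level else w
            for w, x in zip(weights, inputs)]

if __name__ == "__main__":
    # Only the weight of the first (active) input grows.
    print(hebb_update([0.0, 0.0], [1, 0], True))
```

Repeating this step over many training examples strengthens the weights of the inputs that regularly coincide with the desired output.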
From previously known artificial neurons, it is possible to assemble neural networks by connecting neurons in parallel to form layers and by arranging the layers one after the other. Feedback can be implemented in the networks by feeding output signals back as input signals. In wide networks assembled from neurons, however, the meaning of individual signals and even groups of signals is blurred, and the network becomes more difficult to design and manage. To produce an attention effect, for example, the network operations would have to be strengthened in one place and weakened in another, but the present solutions do not provide any clear answers as to where, when and how this should be done.
The object of the invention is to provide a method and equipment implementing the method in which the above problems of training a neural network can be solved. To put it more precisely, the object of the invention is to provide a mechanism by which useful additional information can be produced on the level of an individual neuron about the relations between the different input signals of the neuron. The mechanism must be flexible and versatile to make artificial neurons widely applicable. The mechanism must also be fairly simple so that the costs of manufacturing neurons can be kept low.
The object of the invention is achieved by a method and equipment that are characterized by what is stated in the independent claims. The preferred embodiments of the invention are claimed in the dependent claims.
The invention is based on expansion of a conventional neuron such that a specific expansion, i.e. a nucleus, is attached to the conventional neuron, a specific main input signal, i.e. a main signal, passing through the nucleus. The nucleus keys and adjusts the main signal by a signal obtained from the conventional part of the neuron, and forms between these signals the logical operations and/or functions needed to control neural networks. The processing power of a single neuron is thus increased as compared with the previously known neurons, which process data only by means of weighting coefficients and threshold functions. On the other hand, a clear distinction between main signals and auxiliary signals makes neural networks easier to design, since training according to the Hebb rule is then easy to implement in such a way that each weighting coefficient is increased whenever the main signal and the auxiliary input signal concerned are simultaneously active.
On the basis of the main signal (S) and a non-linear signal (V), the function So = S OR V is formed in the neuron of the invention and used to generate a main output signal, and in addition, at least one of the three logical functions Yo = S AND V, No = NOT S AND V, Na = S AND NOT V is formed and used to generate an additional output signal for the neuron.
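The logical functions formed from S and V can be illustrated with a short sketch. It assumes, for illustration only, that both signals have already been reduced to two-valued (active or inactive) quantities; the function name and the dictionary layout are hypothetical.

```python
def nucleus_outputs(s: bool, v: bool) -> dict:
    """Logical functions of the main signal S and the non-linear signal V.

    Both signals are treated as booleans, which is an assumption of this
    sketch. The keys follow the output labels used in the text.
    """
    return {
        "So": s or v,          # main output: S OR V
        "Yo": s and v,         # S AND V
        "No": (not s) and v,   # NOT S AND V
        "Na": s and (not v),   # S AND NOT V
    }

if __name__ == "__main__":
    # Print the full truth table for the four outputs.
    for s in (False, True):
        for v in (False, True):
            print(s, v, nucleus_outputs(s, v))
```

In this sketch, at most one of the three additional outputs is active for any combination of S and V, and the main output is active whenever one of them is.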
The neuron of the invention and the network consisting of such neurons learn quickly: even one example may suffice. The operation of the neuron of the invention and that of the networks consisting of such neurons are simple and clear.