1. Field of the Invention
The present invention relates to a learning method in a neural network having discrete interconnection strengths, such as an optical neuron computer.
2. Description of the Related Art
FIG. 1 illustrates the structure of a typical layered neural network. FIG. 2 is a flow chart which illustrates the operation of such a neural network. A layered network of the type described above has been disclosed, for example, in Primer on Neural Networks, pp. 55 to 57, edited by Amari and Mori, published by Trikeps Corporation, Nov. 30, 1985.
Referring to FIG. 1, reference numerals 1a, 1b, . . . , represent neurons which constitute an input layer of a neural network; 2a, 2b, . . . , represent neurons which constitute an intermediate layer; 3a, 3b, . . . , represent neurons which constitute an output layer; 4a, 4b, . . . , represent interconnections between the neurons in the input layer and those in the intermediate layer; 5a, 5b, . . . , represent interconnections between the neurons in the intermediate layer and those in the output layer; 6a, 6b, . . . , represent input signals; and 7a, 7b, . . . , represent output signals.
In accordance with a total input $I$ and a threshold $\theta$, a single neuron transmits an output $V$ given by the following equation:

$$V = f(I - \theta), \quad \text{where } f(I - \theta) = \frac{1}{1 + \exp[-(I - \theta)]}. \tag{1}$$
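Equation (1) can be sketched in a few lines of Python, taking f to be the logistic function of the difference between the total input and the threshold. The function name `neuron_output` is hypothetical, and the two-branch form is only a numerical convenience that evaluates the same function without overflow for strongly negative arguments.

```python
import math

def neuron_output(total_input: float, threshold: float) -> float:
    """Return V = f(I - theta), with f(x) = 1 / (1 + exp(-x))."""
    x = total_input - threshold
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    # Algebraically identical form; avoids overflow of exp(-x) when x << 0.
    e = math.exp(x)
    return e / (1.0 + e)
```

When the total input equals the threshold the output is exactly 0.5, and it approaches 1 or 0 as the total input rises above or falls below the threshold.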
Letting the numbers of neurons in the input layer, the intermediate layer and the output layer be $n_I$, $n_H$ and $n_O$, respectively, the outputs from the i-th neuron in each of the layers be $V^I_i$, $V^H_i$ and $V^O_i$, respectively, and the thresholds be $\theta^I_i$, $\theta^H_i$ and $\theta^O_i$, respectively, the outputs from the i-th neurons in the intermediate layer and the output layer are respectively given by the following equations:

$$V^H_i = f\left(\sum_{j=1}^{n_I} W^{IH}_{ij} V^I_j - \theta^H_i\right) \tag{2}$$

$$V^O_i = f\left(\sum_{j=1}^{n_H} W^{HO}_{ij} V^H_j - \theta^O_i\right) \tag{3}$$

where $W^{IH}_{ij}$ represents the interconnection strength between the i-th ($i = 1, 2, \ldots, n_H$) neuron in the intermediate layer and the j-th ($j = 1, 2, \ldots, n_I$) neuron in the input layer, and $W^{HO}_{ij}$ represents the interconnection strength between the i-th ($i = 1, 2, \ldots, n_O$) neuron in the output layer and the j-th ($j = 1, 2, \ldots, n_H$) neuron in the intermediate layer.
Therefore, when an output $V^I = (V^I_1, V^I_2, \ldots, V^I_{n_I})$ is determined in response to an input signal to the neurons 1a, 1b, . . . , which constitute the input layer, an output $V^O = (V^O_1, V^O_2, \ldots, V^O_{n_O})$ from the neurons 3a, 3b, . . . , which constitute the output layer, is determined from Equations (2) and (3).
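The forward computation described above can be sketched as follows. This is a minimal illustration, assuming that the total input to each neuron is the weighted sum of the outputs of the preceding layer, which is how Equations (2) and (3) are read here. The function names and the toy values are hypothetical.

```python
import math

def f(x: float) -> float:
    """Logistic function of Equation (1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_output(weights, inputs, thresholds):
    """V_i = f(sum_j W_ij * V_j - theta_i) for every neuron i of one layer."""
    return [
        f(sum(w * v for w, v in zip(row, inputs)) - theta)
        for row, theta in zip(weights, thresholds)
    ]

def forward(w_ih, theta_h, w_ho, theta_o, v_in):
    """Propagate the input-layer outputs v_in through to the output layer."""
    v_h = layer_output(w_ih, v_in, theta_h)  # intermediate layer, Equation (2)
    v_o = layer_output(w_ho, v_h, theta_o)   # output layer, Equation (3)
    return v_h, v_o

# Toy network: 2 input, 2 intermediate and 1 output neuron (arbitrary values).
v_h, v_o = forward(
    w_ih=[[0.0, 0.0], [1.0, -1.0]], theta_h=[0.0, 0.0],
    w_ho=[[0.0, 0.0]], theta_o=[0.0],
    v_in=[1.0, 1.0],
)
```

In this toy case every weighted sum is zero, so each neuron in the intermediate and output layers transmits 0.5.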
Usually, "learning" in a neural network is a process of updating the interconnection strengths $W^{IH}_{ij}$ and $W^{HO}_{ij}$ in such a manner that the neurons 3a, 3b, . . . , in the output layer transmit a specific output $V^{O(m)}$ in response to a certain output $V^{I(m)}$ from the neurons 1a, 1b, . . . , in the input layer. In the learning process, the plurality of specific outputs $V^{I(m)}$ from the input layer and the corresponding desired outputs $V^{O(m)}$ from the output layer are called "educator signals".
The operation of learning in such a neural network will now be described with reference to the flowchart of FIG. 2.
First, all of the initial values of the interconnection strengths are set by using random numbers (step ST1). The variable m (m = 1, 2, . . . , M) for denoting the m-th educator signal is initialized (step ST2). Then, the actual output signal $V^O$ from the neural network, corresponding to $V^{I(m)}$ of the m-th educator signal, is calculated by using Equations (2) and (3) (step ST3). A quantity for updating each interconnection strength is then calculated by using the following equations (step ST4):

$$\Delta W^{IH}_{ij} = -\alpha\, V^H_i (1 - V^H_i)\, V^I_j \sum_{k=1}^{n_O} W^{HO}_{ki}\, V^O_k (1 - V^O_k)\,(V^O_k - V^{O(m)}_k) \tag{4}$$

$$\Delta W^{HO}_{ij} = -\alpha\, V^O_i (1 - V^O_i)\, V^H_j\, (V^O_i - V^{O(m)}_i) \tag{5}$$
where $\Delta W^{IH}_{ij}$ and $\Delta W^{HO}_{ij}$ respectively represent the update quantities by which the interconnection strengths $W^{IH}_{ij}$ and $W^{HO}_{ij}$ are updated, and where the coefficient $\alpha$ ($0 < \alpha \le 1$) represents the learning rate.
Then, each interconnection strength is updated by the update quantity obtained in step ST4, according to the following equations (step ST5):

$$W^{IH}_{ij}(\text{new}) = W^{IH}_{ij}(\text{old}) + \Delta W^{IH}_{ij} \tag{6}$$

$$W^{HO}_{ij}(\text{new}) = W^{HO}_{ij}(\text{old}) + \Delta W^{HO}_{ij} \tag{7}$$
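Steps ST4 and ST5 can be sketched as follows. The update for the strengths between the intermediate and output layers follows Equation (5) directly. The update for the strengths between the input and intermediate layers is written as the usual error back-propagation term, which is an assumption chosen to be consistent with Equation (5). All function names and numerical values are hypothetical.

```python
def update_quantities(w_ho, v_in, v_h, v_o, target, alpha):
    """Return (dW_IH, dW_HO) for one educator signal."""
    # Output-layer error term V_i (1 - V_i) (V_i - target_i), used by both updates.
    d_o = [vo * (1.0 - vo) * (vo - t) for vo, t in zip(v_o, target)]
    # Equation (5): dW_HO[i][j] = -alpha * d_o[i] * V_H[j].
    dw_ho = [[-alpha * d * vh for vh in v_h] for d in d_o]
    # Assumed back-propagated term: output errors weighted by W_HO, then
    # scaled by the slope V_H (1 - V_H) of each intermediate neuron.
    d_h = [
        vh * (1.0 - vh) * sum(w_ho[k][i] * d_o[k] for k in range(len(d_o)))
        for i, vh in enumerate(v_h)
    ]
    dw_ih = [[-alpha * d * vi for vi in v_in] for d in d_h]
    return dw_ih, dw_ho

def apply_update(w, dw):
    """Equations (6) and (7): W(new) = W(old) + delta W, element by element."""
    return [[a + b for a, b in zip(w_row, dw_row)] for w_row, dw_row in zip(w, dw)]

# One neuron per layer; the output 0.5 is below the desired output 1.0.
dw_ih, dw_ho = update_quantities(
    w_ho=[[2.0]], v_in=[1.0], v_h=[0.5], v_o=[0.5], target=[1.0], alpha=1.0
)
new_w_ho = apply_update([[2.0]], dw_ho)
```

Because the actual output is below the desired output in this toy case, both update quantities are positive, which raises the output on the next forward computation.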
For each of the M educator signals, the square error between the actual output signal $V^O$ and the desired output $V^{O(m)}$, corresponding to $V^{I(m)}$ of the educator signal m, is calculated for each neuron in the output layer. If the sum of these square errors is less than a convergence criterion value $\varepsilon$ (usually a value less than $0.01\, n_O$ is used) for every educator signal, learning is ended (step ST6). If any of the sums exceeds $\varepsilon$ and the educator signal which has been learned immediately before is the M-th signal (step ST7), the variable m is again initialized (step ST2). Otherwise, the variable m is incremented by one (step ST8). Learning is repeated in this manner until the actual outputs $V^O$ of the network with respect to all $V^{I(m)}$ of the M educator signals are sufficiently close to all of the desired outputs $V^{O(m)}$.
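The flow of steps ST1 to ST8 can be sketched as a single loop. This is an illustration under several assumptions that are not in the text: the thresholds are fixed at zero, the initial strengths are drawn uniformly from -1 to 1, the input-to-intermediate update uses the usual back-propagation term, and a hypothetical `max_steps` limit is added so that the loop always terminates.

```python
import math
import random

def f(x):
    """Logistic function of Equation (1), in an overflow-safe form."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def forward(w_ih, w_ho, v_in):
    # Thresholds are taken as zero for brevity (an assumption).
    v_h = [f(sum(w * v for w, v in zip(row, v_in))) for row in w_ih]
    v_o = [f(sum(w * v for w, v in zip(row, v_h))) for row in w_ho]
    return v_h, v_o

def square_error(w_ih, w_ho, v_in, target):
    """Sum over the output neurons of the square error for one signal."""
    _, v_o = forward(w_ih, w_ho, v_in)
    return sum((vo - t) ** 2 for vo, t in zip(v_o, target))

def learn(signals, n_i, n_h, n_o, alpha=0.5, eps=0.01, max_steps=20000, seed=0):
    rng = random.Random(seed)
    # ST1: random initial interconnection strengths.
    w_ih = [[rng.uniform(-1.0, 1.0) for _ in range(n_i)] for _ in range(n_h)]
    w_ho = [[rng.uniform(-1.0, 1.0) for _ in range(n_h)] for _ in range(n_o)]
    m = 0  # ST2: index of the current educator signal (0-based here).
    for step in range(1, max_steps + 1):
        v_in, target = signals[m]
        v_h, v_o = forward(w_ih, w_ho, v_in)  # ST3
        # ST4: error terms, computed before any strength is changed.
        d_o = [vo * (1.0 - vo) * (vo - t) for vo, t in zip(v_o, target)]
        d_h = [vh * (1.0 - vh) * sum(w_ho[k][i] * d_o[k] for k in range(n_o))
               for i, vh in enumerate(v_h)]
        # ST5: W(new) = W(old) + delta W.
        for i in range(n_o):
            for j in range(n_h):
                w_ho[i][j] -= alpha * d_o[i] * v_h[j]
        for i in range(n_h):
            for j in range(n_i):
                w_ih[i][j] -= alpha * d_h[i] * v_in[j]
        # ST6: stop when every educator signal is within the criterion.
        if all(square_error(w_ih, w_ho, x, t) < eps for x, t in signals):
            return w_ih, w_ho, step, True
        # ST7 / ST8: wrap around after the M-th signal, otherwise advance.
        m = 0 if m == len(signals) - 1 else m + 1
    return w_ih, w_ho, max_steps, False

# Two hypothetical educator signals: (input-layer output, desired output).
signals = [([1.0, 0.0], [1.0]), ([0.0, 1.0], [0.0])]
w_ih, w_ho, steps, converged = learn(signals, n_i=2, n_h=2, n_o=1, max_steps=2000)
```

The returned flag reports whether the criterion of step ST6 was met before the step limit was reached, so a caller can distinguish convergence from a cut-off run.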
In the conventional learning method for a neural network, the interconnection strengths must be able to take any value in a continuous range in order to be updated. Therefore, learning cannot be performed by this method in a neural network which has discrete interconnection strengths, such as an optical neuro-computer.
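One way to see the difficulty is sketched below. It assumes that a discrete network can hold only strengths from a small fixed set (here the hypothetical levels -1, 0 and +1) and that each updated value is snapped to the nearest allowed level. Both the level set and the snapping rule are illustrative assumptions and are not taken from the text.

```python
LEVELS = (-1.0, 0.0, 1.0)  # hypothetical set of allowed discrete strengths

def snap(value, levels=LEVELS):
    """Return the allowed level closest to value."""
    return min(levels, key=lambda level: abs(level - value))

def discrete_update(w, dw):
    """Apply W(new) = W(old) + delta W, then force the result onto a level."""
    return snap(w + dw)

# A small update quantity, applied 100 times to a discrete strength.
w_discrete = 0.0
for _ in range(100):
    w_discrete = discrete_update(w_discrete, 0.0625)

# The same updates applied to a continuous strength.
w_continuous = 0.0
for _ in range(100):
    w_continuous += 0.0625
```

Under these assumptions the discrete strength never leaves 0, because each update is smaller than half the spacing between levels and is discarded, whereas the continuous strength accumulates the same updates and reaches 6.25.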