The present invention generally relates to neuron units, neural networks and signal processing methods, and more particularly to a neuron unit which resembles neurons and is applicable to neural computers, a neural network which includes a plurality of such neuron units which are coupled to form a hierarchical network structure and a signal processing method which uses such a neural network.
In a living body, processes such as character recognition, memory by association and control of motion can be carried out quite simply. However, such processes are often extremely difficult to carry out on von Neumann computers.
Hence, in order to cope with the problems encountered in von Neumann computers, various models of neuron units and neural networks have been proposed. The neuron unit resembles a neuron of the living body, and the neural network uses such neuron units, coupled to form a network, to carry out parallel information processing and self teaching, which are functions peculiar to the nervous system of the living body.
Presently, the neural network is in most cases realized by computer simulation. However, in order to bring out the advantageous features of the neural network, it is necessary to realize the parallel processing by hardware.
Some proposals have been made to realize the neural network by hardware. However, the proposed neural networks cannot realize the self-learning function, which is another advantageous feature of the neural network. Furthermore, the majority of the proposed neural networks are realized by analog circuits and suffer from the problems which will be described later in conjunction with the drawings.
First, a description will be given of a model of a conventional neural network. FIG. 1 shows one neuron unit 1, and FIG. 2 shows a neural network which is made up of a plurality of such neuron units 1. Each neuron unit 1 of the neural network is coupled to and receives signals from a plurality of neuron units 1, and outputs a signal by processing the received signals. In FIG. 2, the neural network has a hierarchical structure, and each neuron unit 1 receives signals from the neuron units 1 located in a previous layer shown on the left side and outputs a signal to the neuron units 1 located in a next layer shown on the right side.
In FIG. 1, $T_{ij}$ denotes a weight function which indicates the intensity of coupling (or weighting) between an ith neuron unit and a jth neuron unit. The coupling between first and second neuron units is referred to as an excitatory coupling when a signal output from the second neuron unit increases as a signal received from the first neuron unit increases. On the other hand, the coupling between the first and second neuron units is referred to as an inhibitory coupling when the signal output from the second neuron unit decreases as the signal received from the first neuron unit increases. $T_{ij} > 0$ indicates the excitatory coupling, and $T_{ij} < 0$ indicates the inhibitory coupling.
FIG. 1 shows the jth neuron unit 1 which outputs an output signal $y_j$. When an output signal of the ith neuron unit 1 is denoted by $y_i$, the input signal to the jth neuron unit 1 from the ith neuron unit 1 can be described by $T_{ij} y_i$. Since a plurality of neuron units 1 are coupled to the jth neuron unit 1, the input signals to the jth neuron unit 1 can be described by $\sum T_{ij} y_i$. The input signals $\sum T_{ij} y_i$ to the jth neuron unit 1 will hereinafter be referred to as an internal potential $u_j$ of the jth neuron unit 1 as defined by the following equation (1).
$$u_j = \sum T_{ij} y_i \tag{1}$$
Next, it will be assumed that a non-linear process is carried out on the input. The non-linear process is described by a non-linear neuron response function using a sigmoid function as shown in FIG. 3 and the following equation (2).
$$f(x) = \frac{1}{1 + e^{-x}} \tag{2}$$
Hence, in the case of the neural network shown in FIG. 2, the equations (1) and (2) are successively calculated for each weight function $T_{ij}$ so as to obtain a final output.
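The successive calculation of the equations (1) and (2) can be illustrated by the following minimal Python sketch. It is not taken from the source: the function names, the nested-list layout of the weights and the numerical weight values are assumptions chosen only for illustration.

```python
import math


def sigmoid(x):
    """Neuron response function of equation (2): f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def internal_potential(weights_into_j, y_prev):
    """Equation (1): u_j is the sum over i of T_ij * y_i for one unit j."""
    return sum(t * y for t, y in zip(weights_into_j, y_prev))


def forward(layers, inputs):
    """Propagate the inputs through a hierarchical network, layer by layer.

    Assumed layout: layers[k][j][i] holds the weight T_ij from unit i of
    layer k to unit j of layer k + 1.
    """
    y = list(inputs)
    for T in layers:
        y = [sigmoid(internal_potential(T_j, y)) for T_j in T]
    return y


# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 output unit.
example_layers = [
    [[0.5, -0.5], [1.0, 1.0]],  # positive = excitatory, negative = inhibitory
    [[1.0, -1.0]],
]
example_output = forward(example_layers, [1.0, 0.0])
```

Each output of a layer becomes an input of the next layer, so the final output is obtained only after every layer has evaluated equations (1) and (2) in turn.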
FIG. 4 shows an example of a conventional neuron unit proposed in a Japanese Laid-Open Patent Application No. 62-295188. The neuron unit includes a plurality of amplifiers 2 having an S-curve transfer function, and a resistive feedback circuit network 3 which couples outputs of each of the amplifiers 2 to inputs of amplifiers in another layer as indicated by a one-dot chain line. A time constant circuit 4 made up of a grounded capacitor and a grounded resistor is coupled to an input of each of the amplifiers 2. Input currents $I_1, I_2, \ldots, I_N$ are respectively applied to the inputs of the amplifiers 2, and an output is derived from a collection of output voltages of the amplifiers 2.
An intensity of the coupling (or weighting) between the neuron units is described by a resistance of a resistor 5 (a lattice point within the resistive feedback circuit network 3) which couples the input and output lines of the neuron units. A neuron response function is described by the transfer function of each amplifier 2. In addition, the coupling between the neuron units may be categorized into the excitatory and inhibitory couplings, and such couplings are mathematically described by positive and negative signs on weight functions. However, it is difficult to realize the positive and negative values by the circuit constants. Hence, the output of the amplifier 2 is distributed into two signals, and one of the two signals is inverted so as to generate a positive signal and a negative signal. One of the positive and negative signals derived from each amplifier 2 is appropriately selected. Furthermore, an amplifier is used to realize the sigmoid function shown in FIG. 3.
However, the above described neuron unit suffers from the following problems.
(1) The weight function $T_{ij}$ is fixed. Hence, a value which is learned beforehand through a simulation or the like must be used for the weight function $T_{ij}$, and self-learning is not possible.
(2) Because the signal intensity is described by an analog value of potential or current and internal operations are also carried out in the analog form, the output value easily changes due to the temperature characteristic, the drift which occurs immediately after the power source is turned ON and the like.
(3) When the neural network is formed by a large number of neuron units, it is difficult to obtain the large number of neuron units which have the same characteristic.
(4) When the accuracy and stability of one neuron unit are uncertain, new problems may arise when a plurality of such neuron units are used to form the neural network. As a result, the operation of the neural network becomes unpredictable.
On the other hand, as a learning rule used in numerical calculations, there is a method called back propagation which will be described hereunder.
First, the weight functions are initially set at random. When an input is applied to the neural network in this state, the resulting output is not necessarily a desirable output. For example, in the case of character recognition, a resulting output "the character is 'L'" is the desirable output when a handwritten character "L" is the input. However, this desirable output is not necessarily obtained when the weight functions are initially set at random. Hence, a correct solution (teaching signal) is input to the neural network and the weight functions are varied so that the correct solution is output when the same input is applied again. The algorithm for obtaining the varying quantity of the weight functions is called back propagation.
For example, in the hierarchical neural network shown in FIG. 2, the weight function $T_{ij}$ is varied using the equation (4) so that $E$ described by the equation (3) becomes a minimum when the output of the jth neuron unit in the output (last) layer is denoted by $y_j$ and the teaching signal with respect to this jth neuron unit is denoted by $d_j$.
$$E = \sum (d_j - y_j)^2 \tag{3}$$
$$\Delta T_{ij} = \partial E / \partial T_{ij} \tag{4}$$
Particularly, when obtaining the weight functions of the output layer and the layer immediately preceding the output layer, an error signal $\delta$ is obtained using the equation (5), where $f'$ denotes the first-order derivative of the sigmoid function $f$.
$$\delta_j = (d_j - y_j) \times f'(u_j) \tag{5}$$
When obtaining the weight functions of the layers preceding the layer which immediately precedes the output layer, the error signal $\delta$ is obtained using the equation (6).
$$\delta_j = \sum \delta_j T_{ij} \times f'(u_j) \tag{6}$$
The weight function $T_{ij}$ is obtained from the equation (7) and varied, where $\Delta T_{ij}'$ and $T_{ij}'$ are values respectively obtained during the previous learning, $\eta$ denotes a learning constant and $\alpha$ denotes a stabilization constant.
$$\Delta T_{ij} = \eta (\delta_j y_i) + \alpha \Delta T_{ij}', \qquad T_{ij} = T_{ij}' + \Delta T_{ij} \tag{7}$$
The constants $\eta$ and $\alpha$ are determined through experience since they cannot be obtained logically. The convergence becomes slower as the values of $\eta$ and $\alpha$ become smaller, and an oscillation tends to occur when the values of $\eta$ and $\alpha$ are large. Generally, the constants $\eta$ and $\alpha$ are on the order of 1.
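The error signals of the equations (5) and (6) and the weight update of the equation (7) can be sketched in Python as follows. This is an illustrative sketch only, not the method of any cited circuit. The function names, the explicit next-layer index k used for equation (6), the default values 0.5 for the learning constant and the stabilization constant, and the small training example are all assumptions. The sketch also uses the known identity f'(u) = f(u)(1 - f(u)) for the sigmoid function.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime_from_output(y):
    """f'(u) expressed through y = f(u): f'(u) = y * (1 - y)."""
    return y * (1.0 - y)


def output_delta(d, y):
    """Equation (5): delta_j = (d_j - y_j) * f'(u_j)."""
    return [(dj - yj) * sigmoid_prime_from_output(yj) for dj, yj in zip(d, y)]


def hidden_delta(delta_next, T_next, y_hidden):
    """Equation (6) with an assumed explicit index k for the next layer:
    delta_j = f'(u_j) * sum over k of delta_k * T_jk.
    T_next[k][j] is the weight from hidden unit j to next-layer unit k.
    """
    return [
        sigmoid_prime_from_output(yj)
        * sum(delta_next[k] * T_next[k][j] for k in range(len(delta_next)))
        for j, yj in enumerate(y_hidden)
    ]


def update_weights(T, dT_prev, delta, y_prev, eta=0.5, alpha=0.5):
    """Equation (7): dT_ij = eta * delta_j * y_i + alpha * dT_ij',
    then T_ij = T_ij' + dT_ij. T[j][i] is the weight from unit i to unit j.
    """
    rows, cols = len(delta), len(y_prev)
    dT = [[eta * delta[j] * y_prev[i] + alpha * dT_prev[j][i]
           for i in range(cols)] for j in range(rows)]
    T_new = [[T[j][i] + dT[j][i] for i in range(cols)] for j in range(rows)]
    return T_new, dT


def squared_error(d, y):
    """Equation (3): E = sum of (d_j - y_j)^2 over the output units."""
    return sum((dj - yj) ** 2 for dj, yj in zip(d, y))


# Hypothetical example: one output unit with two inputs, teaching signal 1.
T = [[0.1, -0.2]]
dT = [[0.0, 0.0]]
x, d = [1.0, 0.5], [1.0]
initial_error = squared_error(d, [sigmoid(sum(t * v for t, v in zip(T[0], x)))])
for _ in range(50):
    y = [sigmoid(sum(t * v for t, v in zip(T[0], x)))]
    T, dT = update_weights(T, dT, output_delta(d, y), x)
final_error = squared_error(d, [sigmoid(sum(t * v for t, v in zip(T[0], x)))])
```

Even this small sketch needs multiplications, additions and subtractions for every weight at every learning step, in addition to the evaluation of the sigmoid function.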
The neural network learns in the above described manner, and an input is thereafter applied again to the neural network to calculate an output and learn. By repeating such an operation, the weight function $T_{ij}$ is determined such that a desirable resulting output is obtained for a given input.
When an attempt is made to realize the above described learning function by a hardware structure, the implementation is extremely difficult since the learning involves many calculations using the four fundamental operations of arithmetic.
On the other hand, a neural network realized by digital circuits has been proposed in Hirai et al., "Design of Completely Digital Neuro-Chip", Electronic Information and Communication Society, ICD-88-130, Dec. 16, 1988.
FIG. 5 shows a circuit construction of a single neuron. In FIG. 5, each synapse circuit 6 is coupled to a cell circuit 8 via a dendrite circuit 7.
FIG. 6 shows an example of the synapse circuit 6. In FIG. 6, a coefficient multiplier circuit 9 multiplies an input pulse f by a coefficient a, where the coefficient a is "1" or "2" depending on the amplification of a feedback signal. A rate multiplier 10 receives an output of the coefficient multiplier circuit 9. A synapse weighting register 11 which stores a weight function w is connected to the rate multiplier 10.
FIG. 7 shows an example of the cell circuit 8. In FIG. 7, a control circuit 12, an up/down counter 13, a rate multiplier 14 and a gate 15 are successively connected in series. In addition, an up/down memory 16 is connected as shown.
In this proposed neural network, the input and output of the neuron circuit are described by pulse trains, and the signal quantity is described by the pulse density of the pulse train. The weight function is described by a binary number and stored in the memory 16. The input signal is applied to the rate multiplier 14 as the clock and the weight function is applied to the rate multiplier 14 as the rate value, so that the pulse density of the input signal is reduced depending on the rate value. This corresponds to the term $T_{ij} y_i$ of the back propagation model. The portion which corresponds to $\sum$ of $\sum T_{ij} y_i$ is realized by an OR circuit which is indicated by the dendrite circuit 7.
Because the coupling may be excitatory or inhibitory, the circuit is divided into an excitatory group and an inhibitory group and an OR operation is carried out independently for the excitatory and inhibitory groups. Outputs of the excitatory and inhibitory groups are respectively applied to up-count and down-count terminals of the counter 13 and counted in the counter 13 which produces a binary output. The binary output of the counter 13 is again converted into a corresponding pulse density by use of the rate multiplier 14.
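The pulse density operations described above can be modeled in software as in the following Python sketch. It is a simplified behavioral model under stated assumptions, not a description of the cited circuit: the accumulator-style rate multiplier, the `full_scale` value of 16 and all function names are hypothetical.

```python
def rate_multiplier(pulses, rate, full_scale=16):
    """Pass about rate/full_scale of the incoming pulses.

    Assumed accumulator model: each incoming pulse adds `rate` to an
    accumulator, and an output pulse is emitted on each overflow.
    """
    acc = 0
    out = []
    for p in pulses:
        fired = 0
        if p:
            acc += rate
            if acc >= full_scale:
                acc -= full_scale
                fired = 1
        out.append(fired)
    return out


def or_combine(trains):
    """Dendrite model: OR the pulse trains time slot by time slot."""
    return [1 if any(slot) else 0 for slot in zip(*trains)]


def up_down_count(excitatory, inhibitory):
    """Count up on excitatory pulses and down on inhibitory pulses."""
    count = 0
    for e, i in zip(excitatory, inhibitory):
        count += e - i
    return count


def pulse_density(pulses):
    """Fraction of time slots that contain a pulse."""
    return sum(pulses) / len(pulses)


# Hypothetical example: a full-density input thinned by two weights.
source = [1] * 16
half = rate_multiplier(source, 8)     # weight 8/16
quarter = rate_multiplier(source, 4)  # weight 4/16
combined = or_combine([half, quarter])
```

In this model the OR operation counts coinciding pulses only once, so the combined pulse density can be lower than the arithmetic sum of the individual densities. The OR circuit therefore approximates the summation well only when pulses from different synapses rarely coincide.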
A plurality of the neurons described above are connected to form a neural network. The learning of this neural network is realized in the following manner. That is, the final output of the neural network is input to an external computer, a numerical calculation is carried out within the external computer, and a result of the numerical calculation is written into the memory 16 which stores the weight function. Accordingly, this neural network does not have the self-learning function. In addition, the circuit construction of this neural network is complex because a pulse density of a signal is once converted into a numerical value by use of a counter and the numerical value is again converted back into a pulse density.
Therefore, the conventional neuron unit or neural network suffers from the problem that the self-learning function cannot be realized by hardware.
Furthermore, the analog circuits do not provide stable operations, and the learning method using numerical calculation is extremely complex and is unsuited to be realized by hardware. On the other hand, the circuit construction of the digital circuits which provide stable operations is complex.