1. Field of the Invention
The present invention relates to a neural network, and more particularly, it relates to an integrated circuit device for implementing a Boltzmann machine which has a learning function (self-organizability).
2. Description of the Background Art
In recent years, various calculation processing techniques based on models of living nerve cells (neurons) have been proposed. One such technique is a parallel information processing machine called a Boltzmann machine. The Boltzmann machine is a mathematical model of a neural network, proposed in 1985 by D. H. Ackley, G. E. Hinton and T. J. Sejnowski. The characteristic feature of the Boltzmann machine is the stochastic state transition of its neurons. The name derives from the following property: when the connection strength (hereinafter referred to as a synapse load) Wij between neurons is symmetrical (Wij = Wji) with no self coupling (Wii = 0), the state distribution p(z) of the system (neural network) follows the Boltzmann distribution with a finite temperature parameter T of the system:

    p(z) = C·exp(-U(z)/T)
where U(z) represents the potential function of the system, z represents the state of the system of the neurons, and C represents a normalization coefficient.
Through introduction of the above probability into the state determination of each neuron, the neural network system is expected to converge to the global minimum of the state energy without being trapped in a local minimum. In other words, a more probable solution can be found. The Boltzmann machine is therefore suited to complicated cost problems and to non-algorithmic problems such as pattern recognition and associative storage, in contrast to a Turing machine, which requires explicit algorithms for solving problems. Thus, the Boltzmann machine requires none of the programs that have been indispensable to a von Neumann computer. It shows behavior similar to human judgement, solving a problem simply from data, and is therefore expected to contribute greatly to the development of industry.
In general, the Boltzmann machine has been simulated on a von Neumann computer in accordance with programs. In such a case, probabilities have been expressed by pseudo-random numbers, and the states of the neurons have generally been represented in discrete time. This is because the von Neumann computer is restricted to serial processing, and a large number of operations must be repeated until the respective neurons reach thermal equilibrium. Further, it is predicted in principle that the state convergence time of the overall system increases in proportion to the square of the number of neurons when the system is fully connected. Thus, a neural network of practical scale, including at least 1000 neurons, requires an impractically long computation time, causing difficulty in practical application.
To cope with this, a relatively high-speed simulator has been developed by connecting a general computer with dedicated hardware which performs the neuron state transitions at high speed. However, it is inefficient in principle to apply a serial processing computer to the simulation of a neural network, which inherently operates in parallel. Hence, it is still difficult to implement a practically applicable simulator.
Thus, what is awaited is a device which can efficiently represent a Boltzmann machine of practically applicable scale at high speed. If such a device can realize a strongly parallel processing system which simulates the operation of a neural network, its remarkably short convergence time is expected to enable practical application to a number of new fields, such as a real time controller (a substitute for a skilled pilot), for example, while its useful property of structural stability would make a unit system including the parallel processing system highly reliable.
In order to implement a Boltzmann machine of such a practical level, a strongly parallel processing system is requisite. To this end, it is necessary to build a neural network by preparing and interconnecting, as portions operating in parallel, a plurality of functional units representing neurons whose states are stochastically transitive, a plurality of functional units representing synapse loads between the neurons, and a plurality of functional units deciding or correcting the synapse loads either arbitrarily or in accordance with learning rules, while providing an input/output unit appropriate to the information to be processed by the neural network.
Some attempts have been made to implement such types of Boltzmann machines by semiconductor integrated circuits. Before explaining the structures and operations of conventional integrated semiconductor neural networks, the principle of operation of the Boltzmann machine is now described in more detail.
FIG. 1 shows the structure and the principle of operation of a general neuron model. Referring to FIG. 1, a neuron unit i includes an input part A which receives output signals Sk, Sj and Sl from other units k, j and l, a conversion part B which converts signals from the input part A in accordance with predetermined rules, and an output part C which outputs signals from the conversion part B. The input part A has weights (hereinafter referred to as synapse loads) indicating connection strengths with respect to the units k, j and l. For example, the output signal Sk from the unit k is converted to Wik·Sk with the synapse load Wik in the input part A, and transferred to the conversion part B. When the total of the input signals received from the input part A satisfies a certain condition, the conversion part B fires and outputs a signal. The input part A of this neuron unit model corresponds to a dendrite of a living nerve cell, while the conversion part B corresponds to the body of the nerve cell and the output part C corresponds to an axon.
In this neuron model, it is assumed that each neuron takes two states: Si = 0 (non-fired state) and Si = 1 (fired state). Each neuron unit updates its state in response to its total input. The total input Ui of the unit i is defined as follows:

    Ui = Σj Wij·Sj

Symmetrical synapse coupling Wij = Wji is assumed here, while -Wii corresponds to the threshold value of the unit i.
The state of a neuron unit is updated asynchronously and stochastically among the units. When the unit i updates its state, the new state is "1" with the following probability:

    p(Si=1) = 1/(1 + exp(-Ui/T))
where T represents a parameter which serves as a temperature in a physical system. This parameter takes a positive value, and is generally called a "temperature".
FIG. 2 shows the relation between the total input Ui and the probability p(Si=1) for each temperature T. When the temperature T is high, each unit i takes the value "0" or "1" with a probability of substantially 1/2, almost at random. As the temperature T approaches zero, on the other hand, the unit behaves almost deterministically according to threshold logic: its state goes to "1" when the total input exceeds a certain threshold value.
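The stochastic update rule and its temperature dependence can be sketched in a few lines of code. This is a minimal illustration of the probability formula given above, not part of any circuit in the figures; the function names are ours.

```python
import math
import random

def firing_probability(total_input: float, temperature: float) -> float:
    """p(Si = 1) = 1 / (1 + exp(-Ui / T))."""
    return 1.0 / (1.0 + math.exp(-total_input / temperature))

def update_state(total_input: float, temperature: float) -> int:
    """Stochastically set the unit to 1 with the above probability."""
    return 1 if random.random() < firing_probability(total_input, temperature) else 0

# At high T the probability stays near 1/2 regardless of the input;
# as T approaches zero the rule degenerates into deterministic threshold logic.
for T in (100.0, 1.0, 0.01):
    print(T, firing_probability(2.0, T))
```

At T = 100 the probability for a total input of 2 is close to 0.5, while at T = 0.01 it is essentially 1, matching the behavior shown in FIG. 2.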
The state of the Boltzmann machine at a certain time instant is represented by the combination of the ON (S = 1) and OFF (S = 0) states of all units. For a certain state α, the energy E is defined as follows:

    Eα = -Σ(i<j) Wij·Si·Sj

In the above equation, the threshold value of each neuron unit is assumed to be zero. This condition can be realized by providing a unit which is always in an ON state (S = 1) and connecting it to each unit with a connection strength equal in magnitude and opposite in sign to that unit's threshold value.
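The energy definition can be checked numerically on a small example; the sketch below assumes the zero-threshold convention of the text, and the weight values are arbitrary.

```python
def network_energy(w, s):
    """E_alpha = -sum over i<j of Wij * Si * Sj, thresholds assumed zero."""
    n = len(s)
    return -sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))

# Two mutually excitatory units: the energy is lowest when both are ON.
w = [[0, 2], [2, 0]]
print(network_energy(w, [1, 1]))  # both units ON
print(network_energy(w, [1, 0]))  # only one unit ON
```

For the positive coupling shown, the state with both units ON has the lower energy, so it is the state the annealed network should settle into.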
When each neuron unit starts from an arbitrary initial state and continues its operation, the Boltzmann machine approaches a stochastic equilibrium state which is determined by the synapse loads (hereinafter simply denoted by W) of the respective units. In this case, as hereinabove described, the Boltzmann machine takes the state α with the following probability:

    P(α) = C·exp(-Eα/T)
The Boltzmann machine uses a technique called simulated annealing in order to reach the global minimum energy value. The relative probability of two global states α and β is expressed as follows:

    P(α)/P(β) = exp(-(Eα - Eβ)/T)
The minimum energy state always has the highest probability, at any temperature. In general, however, it takes a long time to reach the thermal equilibrium state, and hence it is considered preferable to start the annealing at a high temperature and gradually reduce the temperature. This state transition is similar to the manner in which each atom in a crystal lattice settles into the position of minimum energy at a given temperature.
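The annealing procedure described above can be sketched as follows. The cooling schedule (starting temperature, geometric cooling factor, sweeps per temperature) consists of illustrative assumptions, not values taken from the text.

```python
import math
import random

def anneal(weights, states, t_start=10.0, t_end=0.1, cooling=0.9, sweeps_per_t=5):
    """Simulated annealing sketch: lower T stepwise and, at each temperature,
    update randomly chosen units with the stochastic rule p = 1/(1+exp(-U/T))."""
    n = len(states)
    temperature = t_start
    while temperature > t_end:
        for _ in range(sweeps_per_t * n):
            i = random.randrange(n)  # asynchronous update of one unit
            u_i = sum(weights[i][j] * states[j] for j in range(n))
            x = u_i / temperature
            if x < -700:  # avoid math range error for strongly negative input
                p_on = 0.0
            else:
                p_on = 1.0 / (1.0 + math.exp(-x))
            states[i] = 1 if random.random() < p_on else 0
        temperature *= cooling  # gradual cooling avoids local minima
    return states
```

Starting hot and cooling slowly lets the network explore many states at first and then freeze into a low-energy configuration, as in the crystal-lattice analogy above.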
A subject in the Boltzmann machine is to find weights such that the network itself reproduces the probability distribution of the input/output data as correctly as possible, with no external supply of that distribution. Consider that the units of a network K are divided into a set A of input/output units (hereinafter referred to as visible units) and a set B of the other units (hereinafter referred to as hidden units).
It is assumed that the set A of the visible units enters a state α with a probability distribution P⁻(α) when the network K is externally supplied with no probability distribution. Further, the state of the overall network K is expressed as α+β when the set A is in the state α and the set B of the hidden units is in the state β. In this case, the following equation holds:

    P⁻(α) = Σβ exp(-Eαβ/T) / Σλμ exp(-Eλμ/T)

where Eαβ represents the energy of the network K in the state α+β. Assuming that Si(αβ) represents the state of a unit i in the state α+β of the network K, the energy of the network K is expressed as follows:

    Eαβ = -Σ(i<j) Wij·Si(αβ)·Sj(αβ)

In the aforementioned network model, the probability distributions of the input units are not separated from those of the output units. Such a network is called a self-associative Boltzmann machine. In a network called an inter-associative Boltzmann machine, on the other hand, the probability distributions of the states of the input units must be separated from those of the states of the output units.
When the network K is externally supplied with inputs/outputs, the set A of the visible units enters the state α with a probability distribution P⁺(α). This probability distribution P⁺(α) is independent of the synapse loads Wij. The subject is to find the synapse loads Wij that minimize the difference between the probability distributions P⁺(α) and P⁻(α), that is, the synapse loads Wij that minimize the following Kullback information measure with respect to P⁺(α) and P⁻(α):

    G = Σα P⁺(α)·ln(P⁺(α)/P⁻(α))   (3)

Differentiating the equation (3) yields the following, which is used as the basic equation of the learning rules:

    ∂G/∂Wij = -(1/T)·(p⁺ij - p⁻ij)
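The Kullback measure G can be evaluated directly for small distributions; the sketch below is an illustration of the formula only, with arbitrary example distributions.

```python
import math

def kullback_g(p_plus, p_minus):
    """G = sum over states a of P+(a) * ln(P+(a) / P-(a)).
    Both arguments are probability distributions over the visible states;
    terms with P+(a) = 0 contribute nothing."""
    return sum(pp * math.log(pp / pm)
               for pp, pm in zip(p_plus, p_minus) if pp > 0.0)
```

G is zero exactly when the free-running distribution P⁻ matches the clamped distribution P⁺, and grows as the two diverge, which is why minimizing G is the goal of the learning.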
The above equation is generally applied in the following form:

    ΔWij = η·(p⁺ij - p⁻ij)   (4)
where p⁺ij represents the expected value that both of the neuron units i and j are in the state "1" when the network K is externally supplied with educator information and has reached equilibrium. On the other hand, p⁻ij corresponds to the expected value in the case where no educator information is externally supplied. In the above equation (4), the term p⁺ij means that the connection strength Wij between the adjacent units i and j is increased when both units are activated. This corresponds to a learning mechanism for strengthening synapse coupling, called Hebb's learning rule.
On the other hand, the term p⁻ij means that the connection strength Wij between the adjacent units i and j is reduced when both units are activated with no external supply of input/output data. This is generally called Hebb's anti-learning (anti-Hebbian learning).
A learning algorithm in the Boltzmann machine will now be briefly described.
The learning algorithm in the Boltzmann machine includes an operation 1 (plus (+) phase), an operation 2 (minus (-) phase) and an operation 3.
Operation 1 (plus (+) phase):
The states of the input units and the output units (visible units) are clamped to the specific patterns given by the input data and the output data (educator data), in accordance with the appearance probabilities of the respective patterns. The operation 1 (plus phase) includes (1) an annealing process, (2) a data collecting process and (3) a process of evaluating p⁺ij. In the annealing process (1), the state of each unit is changed at each temperature T in accordance with the following equations (5) and (6):

    ΔEi = Σj Wij·Sj   (5)

    p(Si=1) = 1/(1 + exp(-ΔEi/T))   (6)
The equation (5) expresses the energy gap between the state Si = "0" and the state Si = "1" of the unit i with respect to the energy E of the overall neural network. The equation (6) expresses the probability that the new state Si of the unit i takes the value "1" under such an energy gap. In the annealing process (1), the temperature T is successively lowered from a high value to a low value. It is assumed that the network relaxes to the minimum energy state and reaches a thermal equilibrium state when the temperature T has been lowered and the prescribed annealing procedure is terminated.
In the data collecting process (2), the number of times that the states S of both of two coupled units are "1" is counted after the annealing process (1) has been repeated a prescribed number of times.
In the process (3) of evaluating p⁺ij, the annealing process (1) and the data collecting process (2) are repeated prescribed numbers of times in correspondence to the received educator information, and the average value of the data obtained in the process (2) is taken as p⁺ij.
The operation 2 (minus phase) similarly includes an annealing process (1), a data collecting process (2) and a process (3) of evaluating p⁻ij. The processes (1), (2) and (3) are similar to those in the operation 1 (plus phase). In the operation 2 (minus phase), however, only the states of the units corresponding to the input data are clamped, in accordance with the appearance probability of the educator data. In this operation 2, the average value evaluated in the process (3) after the processes (1) and (2) are repeated is taken as p⁻ij, similarly to the operation 1.
In the operation 3, the synapse load Wij is varied from the evaluated values p⁺ij and p⁻ij in accordance with the following relational expression:

    ΔWij = η·(p⁺ij - p⁻ij)

The symbol η represents a positive constant which determines the magnitude of a single variation of the synapse load Wij. As is clear from the above equation, the variation of the synapse load Wij is determined only by the states of the two units i and j which are coupled with each other. The final object of the learning is minimization of the value G in the equation (3), i.e., reduction of the above value ΔWij until it ideally converges to zero.
As to the aforementioned variation of the synapse load Wij, the following relational expression may be employed in an electronic circuit, in order to simplify its structure:

    ΔWij = η·sgn(p⁺ij - p⁻ij)
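The co-occurrence estimation of operations 1 and 2 and the weight update of operation 3 can be summarized in a short sketch. The data layout (lists of sampled binary state vectors) and the function names are our assumptions; the sign-only variant reflects the simplified update sometimes preferred in hardware.

```python
def cooccurrence(samples):
    """Estimate pij as the fraction of collected states in which units i and j
    are both "1" (processes (2) and (3) of each phase)."""
    n = len(samples[0])
    counts = [[0] * n for _ in range(n)]
    for s in samples:
        for i in range(n):
            for j in range(n):
                if s[i] == 1 and s[j] == 1:
                    counts[i][j] += 1
    m = len(samples)
    return [[counts[i][j] / m for j in range(n)] for i in range(n)]

def update_weights(w, p_plus, p_minus, eta=0.1, sign_only=False):
    """Operation 3: DeltaWij = eta * (p+ij - p-ij), or the simplified
    sign-only form eta * sgn(p+ij - p-ij)."""
    n = len(w)
    new_w = [row[:] for row in w]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            diff = p_plus[i][j] - p_minus[i][j]
            step = (0 if diff == 0 else (1 if diff > 0 else -1)) if sign_only else diff
            new_w[i][j] += eta * step
    return new_w
```

Note that the update for each pair (i, j) depends only on the two co-occurrence statistics for that pair, which is what makes a fully parallel hardware implementation attractive.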
"A Neuromorphic VLSI Learning System" by J. Alspector et al., in "Advanced Research in VLSI 1987", MIT Press, pp. 313 to 327, discloses an exemplary device which implements the aforementioned Boltzmann machine as an integrated circuit. The structure and operation of such a conventional semiconductor neural network device implementing the Boltzmann machine will now be described.
FIG. 3 shows an exemplary structure of the neural network. Referring to FIG. 3, the neural network includes complementary data input line pairs IN1, /IN1 to INj, /INj, which are arrayed in the row direction, and complementary data output line pairs S1, /S1 to Sj, /Sj, which are arrayed in the column direction (the slash denoting the complementary line). The neural network is further provided with differential amplifiers Z1 to Zj, which detect and amplify the potential differences on the input line pairs INk and /INk (k = 1 to j) and transfer the results to the data output lines Sk and /Sk. The differential amplifiers Zk correspond to neurons, while the data input lines IN and /IN correspond to dendrites and the data output lines S and /S correspond to axons. The synapse loads W are supplied by resistive elements R, which are arranged at the crosspoints between the data input lines IN, /IN and the data output lines S, /S.
In the Boltzmann machine model, the synapse loads W have such a symmetric property that Wij=Wji. Thus, the differential amplifiers Zk are arranged on a diagonal line of a connection matrix formed by the data input lines, the data output lines and the resistive elements.
The differential amplifiers Zk have complementary outputs S and /S. When a neuron is in an ON state, the S output of its differential amplifier Zk is "1" (5 V); when the neuron is in an OFF state, the S output is "0" (0 V). The outputs of the differential amplifiers Zk are fed back to the data input lines IN and /IN through the resistive elements R representing the synapse loads, which are arrayed in the matrix. A resistive element R arranged on the i-th row and j-th column of the connection matrix connects the output of the differential amplifier (neuron) Zj to the input of the differential amplifier (neuron) Zi. When the synapse load Wij is positive, the data output line Sj is connected to the data input line INi, and the data output line /Sj is connected to the data input line /INi. When the synapse load Wij is negative, on the other hand, the data output line Sj is connected to the data input line /INi, and the data output line /Sj is connected to the data input line INi.
A differential amplifier Zt provided in a region V of the connection matrix is always in an ON state, its output line Sv being regularly supplied with "1" and its output line /Sv with "0". Such a structure eliminates the influence of the threshold value in each neuron unit, each threshold value being equivalently set to zero.
This network is initialized by setting the weight (resistance value) of each resistive element R. Data of the synapse load Wij is transferable along arrows shown in FIG. 3 through a weight processing circuit which is provided in correspondence to each resistive element R, as hereinafter described.
FIG. 4 shows the structure of each synapse load part (resistive element). The synapse load part includes four transistor groups TR1, TR2, TR3 and TR4, in order to provide both positive coupling (excitatory coupling) and negative coupling (inhibitory coupling). Each of the transistor groups TR1 to TR4, which are identical in structure, includes n MOS (metal-oxide-semiconductor) transistors T0 to Tn-1 and one pass transistor TG. The ON resistances of the MOS transistors T0 to Tn-1 are ratioed 1:2: . . . :2^(n-1), in order to provide different resistance values. The pass transistors TG1 and TG4 receive a signal TSGN indicating the sign of the synapse load, while the pass transistors TG2 and TG3 receive its complement /TSGN. The complementary signals TSGN and /TSGN determine whether the sign of the synapse load is positive or negative. When the synapse load Wij is positive, the signal TSGN is "1", and the transistor groups TR1 and TR4 provide the synapse load Wij. When the synapse load Wij is negative, on the other hand, the complementary signal /TSGN is "1", and the transistor groups TR2 and TR3 provide the synapse load Wij. The synapse load is set by turning on one or more MOS transistors in each transistor group TR through the output of a weight processing circuit, as hereinafter described.
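The effect of the binary-ratioed ON resistances can be illustrated numerically. The sketch below assumes a unit resistance of 1 and an LSB-first bit order; both are arbitrary conventions for the illustration, not values from the figure.

```python
def group_conductance(code, r_unit=1.0):
    """Total conductance of n parallel MOS transistors whose ON resistances
    are ratioed 1 : 2 : ... : 2^(n-1).  Bit k of `code` (LSB first) switches
    on the transistor of relative resistance 2^(n-1-k), so the parallel
    conductance is proportional to the binary value of `code`."""
    n = len(code)
    g = 0.0
    for k, bit in enumerate(code):  # k = 0 is the least significant bit
        if bit:
            g += 1.0 / (r_unit * 2 ** (n - 1 - k))
    return g
```

For a three-bit group, the code 101 (value 5) yields a conductance of 5/4 in units of the largest single conductance divided by 4, i.e., the conductance scales linearly with the stored digital weight.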
FIG. 5 shows the structure of each differential amplifier Z forming a neuron. The differential amplifier Z includes p-channel MOS transistors PT1 and PT2, and n-channel MOS transistors NT1, NT2, NT3 and NT4. The p-channel MOS transistors PT1 and PT2 provide a pair of differential outputs, and transfer complementary data to the data output lines S and /S. The n-channel MOS transistors NT1 and NT2 provide a first differential input pair, while the n-channel MOS transistors NT3 and NT4 provide a second differential input pair. The first differential input pair differentially amplifies the potential difference on the data input lines IN and /IN, corresponding to the energy gap

    ΔEi = Σj Wij·Sj

while the second differential input pair introduces the annealing temperature T in the form of noise. The second differential input pair receives a complementary output from an amplifier AZ, which in turn receives a noise signal from a noise source NS. The level of the noise signal from the amplifier AZ is reduced as the annealing phase progresses. Hence, annealing is started at a high temperature and the annealing temperature is successively reduced, so that the neural network is not captured by a pseudo optimum solution (local minimum) but stabilizes at the global minimum. In general, the amplifier AZ is formed by an operational amplifier whose gain is externally adjusted to set the annealing temperature.
In order to adjust the input/output characteristics of the neuron (differential amplifier), a further n-channel MOS transistor NT5 is provided, which receives a prescribed bias potential Vbias at its gate.
FIG. 6 shows an exemplary structure of the weight processing circuit. Referring to FIG. 6, the weight processing circuit includes a correlation logic CL, an up/down logic UDL, and flip-flops FF0 to FFn. In the Boltzmann model, the synapse loads have such a symmetrical property that Wij=Wji. Therefore, the weight processing circuit is commonly provided for the symmetrical synapse loads Wij and Wji. The flip-flops FF0 to FFn control ON and OFF states of the MOS transistors representing the synapse loads. The flip-flop FF0 stores information indicating the sign of the synapse loads, and controls on-off operations of the pass transistors TG. The flip-flops FF1 to FFn control on-off operations of the MOS transistors T.sub.0 to T.sub.n-1.
The correlation logic CL counts, for the phase indicated by a phase signal, the number of occurrences of a signal COOC, which indicates that both of the outputs Si and Sj of the neuron units (differential amplifiers) Zi and Zj are "1", and thereby evaluates a probability distribution Pij. When a weight adjusting signal ADW is received, the correlation logic CL supplies a signal indicating an increment, a decrement or holding (silent state) to the up/down logic UDL, based on the evaluated probability distribution Pij in accordance with the equation (4).
In response to the increment/decrement indication signal from the correlation logic CL, the up/down logic UDL increments, decrements or leaves intact its count value, and transfers the result to the flip-flops FF0 to FFn. The up/down logic UDL has the structure of a shift register, so that in initialization it can receive synapse load data W from the up/down logic of one adjacent weight processing circuit and transfer the data to the up/down logic of another adjacent weight processing circuit.
FIG. 7 shows an exemplary structure of the up/down logic UDL. In the structure shown in FIG. 7, the synapse load W is expressed in four bits (including one sign bit). FIG. 7 omits the path, shown in FIG. 6, for setting weight data from the adjacent weight processing circuit. The up/down logic UDL is formed by an up/down counter 100'. This up/down counter 100' comprises a terminal U/D which receives a signal indicating incrementation/decrementation of the count value, a terminal T which receives a signal providing the variation timing for the count value, a reset terminal R, and data output terminals Q0 to Q3. The output terminals Q0 to Q2 provide the magnitude of the synapse load W, while the output terminal Q3 outputs data defining its sign; the output data from the terminal Q3 is transferred through an inverter I1. Signal lines 103' to 106' are coupled to the flip-flops FF0 to FFn (n = 3), respectively. In accordance with the incrementation/decrementation indication signal transferred through a signal line 102', the up/down counter 100' increments, decrements or holds its count value when a timing signal is received through a signal line 101'. The synapse load is learned by this operation. A signal on a signal line 107' resets the up/down counter 100'.
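A behavioral model of such a four-bit counter can be sketched as follows. The sign-magnitude interpretation of Q3 and the saturation at full scale are our assumptions for the illustration; the actual counter 100' of FIG. 7 may wrap or encode the sign differently.

```python
class UpDownCounter:
    """Behavioral sketch of a 4-bit up/down counter holding a synapse load:
    Q0..Q2 give the magnitude and Q3 the sign (illustrative assumptions)."""

    def __init__(self):
        self.count = 0  # signed count; reset state

    def reset(self):
        """Signal on line 107': clear the counter."""
        self.count = 0

    def clock(self, up: bool):
        """One timing pulse on T: increment or decrement per the U/D input,
        saturating at +/-7 (the 3-bit magnitude limit)."""
        if up and self.count < 7:
            self.count += 1
        elif not up and self.count > -7:
            self.count -= 1

    def outputs(self):
        """Return the magnitude bits (Q2, Q1, Q0) and the sign bit Q3."""
        mag = abs(self.count)
        sign = 1 if self.count < 0 else 0
        return ((mag >> 2) & 1, (mag >> 1) & 1, mag & 1), sign
```

Each increment or decrement pulse from the correlation logic thus moves the stored weight one step, which is exactly the learning operation described above.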
A threshold value processing operation of one neuron (differential amplifier) Zi is now described with reference to FIG. 8. In FIG. 8, a single transistor TC representatively shows the conductance corresponding to each synapse load W. When the synapse loads W are positive and the neurons are in ON states (differential amplifier outputs S are "1"), or the synapse loads W are negative and the neurons are in OFF states, the voltages Von and Voff are transferred to the data input lines IN and /IN through the conductances in rows I and IV. The conductances in rows II and III are those developed when the synapse loads W are negative and the neurons are in ON states, or the synapse loads W are positive and the neurons are in OFF states (differential amplifier outputs S are "0").
In the latter case, the voltage Voff is transferred to the data input line IN, and the voltage Von is transferred to the data input line /IN. The positive input of the differential amplifier Zi is thus coupled with a conductance pulled up to the voltage Von and a conductance pulled down to the voltage Voff. The conductance pulled up to Von is given by the absolute value of the sum of the positive synapse loads W from neurons in ON states. The conductance given by the absolute value of the sum of the positive synapse loads W from OFF-state neurons pulls the potential at the positive input of the differential amplifier Zi down to Voff. The relation between the conductances at the negative input (-) of the differential amplifier Zi is the reverse of that at the positive input. Considering the aforementioned relations, together with the fact that the synapse loads in the region V of FIG. 3 are provided as -θi, the differential amplifier Zi simply performs the following comparison:

    Si = 1 if Σj Wij·Sj > θi, and Si = 0 otherwise

The differential amplifier Zi performs threshold value processing in accordance with the above expression, and outputs the result to the data output lines S and /S.
In the aforementioned structure, it is also possible to obtain the desired output data by simply comparing the value at the positive input (+) of the differential amplifier Zi with a prescribed threshold value (Von + Voff)/2. FIG. 9 shows another structure.
The structure shown in FIG. 9 is described in "Electronic Implementation of Neuromorphic System" by Jack I. Raffel, IEEE 1988 Custom Integrated Circuit Conference, pp. 10.1.1 to 10.1.7.
Referring to FIG. 9, a circuit part (hereinafter referred to as a synapse polarity converting circuit) for outputting the product of a synapse load and the signal Sj on a data output line includes a synapse load representing part, which represents a synapse load Wij in accordance with data stored in a register 200', and a constant current circuit 210'. The register 200' has unit registers B0 to B2, which determine the magnitude of the synapse load Wij, and a unit register B3, which outputs data setting the sign of the synapse load Wij. The register 200' corresponds in structure to the flip-flops FF0 to FFn shown in FIG. 6.
The synapse load representing part has n-channel MOS transistors 201'a to 201'd, 202'a to 202'd and 203'a to 203'd. Data of the registers B0 to B2 are transferred to the gates of the MOS transistors 201'a, 202'a and 203'a through signal lines 200'a to 200'c respectively. An output signal of the data output line (axon signal line) Sj is transferred to the gates of the MOS transistors 201'b, 202'b, 203'b, 201'c, 202'c and 203'c. Output data from the register B3 is transferred to the gates of the MOS transistors 201'd, 202'd and 203'd through a signal line 200'd. The MOS transistors 201', 202' and 203' are set in the transistor width (gate width) ratio 1:2:4, whereby the MOS transistors 201', 202' and 203' are set in the conductance ratio 1:2:4. The transistor 201' is the generic name for the MOS transistors 201'a to 201'd. This also applies to the transistors 202' and 203'.
The constant current circuit 210' includes n-channel MOS transistors NT10 and NT11 and p-channel MOS transistors PT11 and PT12. The MOS transistor NT10 includes a diode-connected n-channel MOS transistor NT10b, and an n-channel MOS transistor NT10a whose gate is connected to the gate of the MOS transistor NT10b. Similarly, the n-channel MOS transistor NT11 has a diode-connected MOS transistor NT11b and a MOS transistor NT11a. The p-channel MOS transistor PT11 has a diode-connected p-channel MOS transistor PT11a and a MOS transistor PT11b. The MOS transistor PT12 includes a diode-connected p-channel MOS transistor PT12a and a p-channel MOS transistor PT12b. The MOS transistor pairs NT10, NT11, PT11 and PT12 form a current-mirror type constant current circuit. Therefore, the current I0 on a signal line 211'a, which is outputted from the constant current circuit 210', is equal to the sum of the current flowing on a signal line 211'b and that flowing on a signal line 212' (data input line IN). The signal line 211'a is connected to first conducting terminals of the MOS transistors 201'a, 202'a and 203'a, while the signal line 211'b is connected to first conducting terminals of the MOS transistors 201'd, 202'd and 203'd. The operation is now briefly described.
First, it is assumed that the potential signal Sj on the data output line is at a high level and the corresponding neuron is in an ON state.
When the synapse load Wij is positive, a low-level signal potential is transferred from the register B3 onto the signal line 200'd. Thus, all of the MOS transistors 201'd, 202'd and 203'd are in OFF states. In response to the synapse load data from the registers B0 to B2, one or more of the MOS transistors 201'a, 202'a and 203'a enter ON states. Thus, current flows from the signal line 211'a to the ground potential through the conducting MOS transistors. Consequently, the current I0 flowing on the signal line 211'a is increased, while the current Ii flowing on the signal line 211'b remains unchanged. Thus, the current (I0 - Ii) appearing on the signal line 212' is increased in correspondence to the increase in the current flowing on the signal line 211'a, indicating the degree Wij·Sj of coupling between the neurons j and i.
When the synapse load Wij is negative, on the other hand, the register B3 outputs a high level and all of the MOS transistors 201'd, 202'd and 203'd enter ON states, thereby opening a path for extracting current from the signal line 211'b. In this state, the current extracted from the signal line 211'b to the ground potential is identical in amount to that extracted when all of the MOS transistors 201'a, 202'a and 203'a are in ON states. The current (I0 - Ii) on the signal line 212' is reduced by this increase of the current Ii flowing on the signal line 211'b. In this state, the synapse load is offset in the negative direction by the absolute value of the maximum value of the product of the load Wij and the output Sj of the neuron j. Thus, a negative Wij can be represented.
When the output Sj of the neuron j is at a low level and the neuron j is in an OFF state, all of the MOS transistors 201'b, 201'c, 202'b, 202'c, 203'b and 203'c enter OFF states, and hence no path is opened to extract the currents from the signal lines 211'a and 211'b. Therefore, the currents I.sub.0 and I.sub.i flowing on the signal lines 211'a and 211'b are equal to each other, and the current value (I.sub.0 -I.sub.i) appearing on the signal line 212' is zeroed to represent a state of Wij.multidot.Sj=0.
Thus, the output state of the neuron unit i is determined by connecting a plurality of synapse representing circuits in parallel with the signal lines 211'a and 211'b and comparing the current flowing on the signal line 212' with the threshold value .theta..sub.i.
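The three cases above can be summarized in a small behavioral model of the current arithmetic. This is only an illustrative sketch, not the circuit itself: it assumes the B0 to B2 magnitude bits are binary weighted with a maximum value of 7 and that B3 holds the sign, which the text does not state explicitly.

```python
def synapse_current(mag_bits, sign_bit, s_j, w_max=7):
    """Behavioral model of the current (I0 - Ii) on line 212'.
    mag_bits: magnitude set on registers B0-B2 (assumed binary weighted);
    sign_bit: register B3 (1 -> inhibitory coupling);
    s_j: output state of neuron j."""
    if s_j == 0:
        return 0                       # no path opened: I0 == Ii, Wij*Sj = 0
    i0_extra = mag_bits                # current drawn from line 211'a
    ii_extra = w_max if sign_bit else 0  # fixed negative offset via line 211'b
    return i0_extra - ii_extra

def neuron_state(contributions, theta_i):
    """Parallel-connected synapse circuits sum their currents on line 212';
    the neuron unit i fires when the sum exceeds the threshold theta_i."""
    return 1 if sum(contributions) > theta_i else 0
```

With these conventions, a set sign bit shifts the representable range downward by the maximum magnitude, which is how the negative offset described above admits inhibitory loads.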
In the aforementioned structure of the integrated semiconductor neural network, the number of neuron units is still extremely small, and the respective functional blocks are formed as individual chips, which are interconnected to provide one neural network.
In order to arbitrarily represent both excitatory and inhibitory coupling states of the synapse loads, the conventional Boltzmann machine provided with a learning function requires four signal lines per neuron as shown in FIG. 3, namely the complementary data output lines S and S serving as signal lines (axons) indicating the state of the neuron and the complementary data input lines IN and IN transferring the data supplied to the neuron. Thus, the number of wires is so increased that interconnection in the neural network becomes complicated, and the interconnection area is increased to cause difficulty in high integration.
In order to implement learning (self organization) of the neural network at a high speed, it is effective to correct the respective synapse loads in parallel in accordance with the aforementioned conversion expression (4): EQU .DELTA.Wij=.eta..multidot.(p.sup.+ ij-p.sup.- ij)
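Expression (4) can be applied to every synapse at once. The following sketch (the matrix interface and the learning rate value are illustrative) shows the parallel correction together with the symmetry (Wij=Wji) and no-self-coupling (Wii=0) constraints stated earlier:

```python
import numpy as np

def update_weights(W, p_plus, p_minus, eta=0.1):
    """Correct all synapse loads in parallel per expression (4):
       dWij = eta * (p+_ij - p-_ij),
    where p_plus / p_minus hold the co-activation probabilities measured
    in the clamped (+) and free (-) phases respectively."""
    W = W + eta * (p_plus - p_minus)
    np.fill_diagonal(W, 0.0)   # no self coupling: Wii = 0
    return (W + W.T) / 2       # enforce symmetry: Wij = Wji
```

Because every entry of the matrices is updated independently, the rule maps naturally onto one correction circuit per synapse, which is exactly what drives the circuit-scale problem discussed next.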
Thus, it is necessary to provide a circuit for calculating .DELTA.Wij and a synapse load control circuit for correcting synapse load information on the basis of the supplied .DELTA.Wij in correspondence to each circuit representing a synapse load (see FIG. 6), and hence the circuit scale is increased to cause difficulty in high integration. In the structure shown in FIG. 6, the synapse load Wij is symmetrical, so that the .DELTA.Wij calculating circuit and the synapse load control circuit are used in common. In a practical scale, however, at least 1000 to 1500 neuron units are required. Therefore, it is necessary to extend the neural network by interconnecting a plurality of semiconductor neural network chips. A single chip providing synapse load representation, however, may be in a state not satisfying the relation Wij=Wji. In this case, it is impossible to simply share one .DELTA.Wij calculating circuit and one synapse load control circuit between the two synapse loads, unlike the single-chip case above. Thus, the semiconductor neural network cannot be easily extended.
Further, only six neuron units have been illustrated in relation to the conventional semiconductor neural network, and no consideration has been made on a structure for extending the semiconductor neural network.
In the learning algorithm, on the other hand, it is necessary to implement stochastic state transitions of the neurons while performing controlled simulated annealing. However, it has been difficult to freely and electrically control the representation of such stochastic state transitions and the simulated annealing from the exterior, due to a number of restrictions. In the conventional Boltzmann machine, stochastic state transitions of neurons are generally represented with pseudo-random numbers in simulation on a von Neumann computer. In an electronic circuit device employing operational amplifiers such as those shown in FIGS. 3 and 5, each operational amplifier is used as a comparator, whose first input terminal is connected with the load sum input signal (dendrite signal line) IN while its second input terminal is connected with a noise signal from a noise source (NS, AZ). This structure exhibits such a tendency that a high voltage appears at the output terminal of the operational amplifier when the voltage value of the load sum input signal exceeds the time average of the noise voltage, while a low voltage appears otherwise. When the noise potential supplied from the noise source varies about its time average at a speed sufficiently faster than the change of the total load input signal, and the comparator can follow that speed, the threshold value can be expected to vary on a time base within a range corresponding to the width of the noise. In this case, the output of the comparator (operational amplifier) represents a stochastic transition state.
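In time average, comparing the load sum against such a rapidly varying noise threshold amounts to the standard stochastic update rule of the Boltzmann machine. A minimal sketch, assuming the usual sigmoid acceptance probability with the temperature parameter T (this functional form is the textbook rule, not taken verbatim from the circuit description):

```python
import math
import random

def neuron_update(net_input, T):
    """Stochastic state transition of one neuron: fire with probability
       P(S=1) = 1 / (1 + exp(-net_input / T)),
    where net_input is the load sum (sum of Wij*Sj minus the threshold)
    and T is the temperature representation parameter of the system."""
    p_on = 1.0 / (1.0 + math.exp(-net_input / T))
    return 1 if random.random() < p_on else 0
```

At large T the decision is nearly random; as T shrinks, the neuron behaves almost deterministically, which is the behavior simulated annealing exploits.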
However, this structure requires control for reducing the noise width as a function of time in order to implement efficient simulated annealing. Although such control is generally achieved by controlling the gain of the operational amplifier AZ, it is difficult to derive the desired attenuating noise signal by such gain control. Further, it is difficult to achieve such gain control with binary high and low voltage signals supplied from the exterior of the device, and it is likewise difficult to generate a noise signal having the desired time-dependent attenuation from such logical binary voltages.
When a noise generation source generating thermal noise is employed, it is necessary to selectively extract a component in a frequency domain providing a variation speed which meets the object. An additional circuit is required for this purpose, and hence the circuit scale for implementing simulated annealing is increased, thereby increasing the scale of the neural network.
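The attenuation that is hard to obtain from a raw noise source is simple to state mathematically. As an illustration only (the exponential form and its parameters are assumptions, not taken from the text), a schedule T_k = T0*alpha^k would shrink the noise width step by step:

```python
def annealing_schedule(T0, alpha, steps):
    """Illustrative exponentially attenuating temperature (noise width)
    for simulated annealing: T_k = T0 * alpha**k with 0 < alpha < 1."""
    return [T0 * alpha ** k for k in range(steps)]
```

Any monotonically decreasing schedule would serve the same purpose; the point is that the decrease must be controllable, which is precisely what the gain-controlled noise source fails to provide.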
In learning (self-organization) of the Boltzmann machine, it is necessary to increase or decrease the synapse loads. The up/down counter shown in FIGS. 6 and 7 is employed for this purpose. However, since learning must be repeated tens of times, the contents of such a simple up/down counter may overflow during learning. When such an overflow occurs in the up/down counter for setting the synapse loads, erroneous learning results, since the counter outputs a value entirely different from the learned one. Thus, it is impossible to correctly set the synapse loads.
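One way to avoid such overflow-induced erroneous learning is a counter that saturates at its limits instead of wrapping around. A minimal sketch; the width and limit values are illustrative and do not come from the patent:

```python
class SaturatingCounter:
    """Up/down counter that clamps at its limits rather than overflowing,
    so repeated learning steps can never flip the stored synapse load to
    an entirely different value."""
    def __init__(self, value=0, lo=-8, hi=7):   # 4-bit two's-complement range (illustrative)
        self.value, self.lo, self.hi = value, lo, hi
    def up(self):
        self.value = min(self.value + 1, self.hi)
    def down(self):
        self.value = max(self.value - 1, self.lo)
```

A plain wrapping counter would jump from 7 to -8 on the next increment; the clamped version simply holds at 7.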
On the other hand, the synapse polarity converting circuit for calculating the degree W.multidot.S of coupling between neurons requires four pairs of MOS transistors for providing excitatory coupling and inhibitory coupling in the structure shown in FIG. 4. Thus, the structure of the synapse load representing part is complicated and its circuit scale is increased. Further, the complementary data input lines IN and IN and the complementary data output lines S and S are required for the product/sum operations, which naturally hinders high integration of the semiconductor neural network.
Further, the current mirror type constant current circuit shown in FIG. 9 requires two current output signal lines, which also hinders high integration of the neural network.
When the current mirror type constant current circuit is employed, further, the amount of its driving current, i.e., the output current derived from the constant current circuit, is determined by the sizes of the transistors forming the constant current circuit. Thus, it is impossible to connect a large number of synapse load representing parts (circuits for calculating Wij.multidot.Sj) to a single constant current circuit; in other words, one constant current circuit cannot drive a large number of synapse load representing circuit parts. Thus, the scale of the neural network is increased, and the neural network cannot be easily extended.
In the conventional Boltzmann machine, the learning rules are achieved in accordance with: EQU .DELTA.Wij=.eta..multidot.(P.sup.+ ij-P.sup.- ij)
It is necessary to calculate P.sup.+ ij and P.sup.- ij in this case, and such calculation is made in the correlation logic shown in FIG. 6, which is required in order to calculate P.sup.+ ij and P.sup.- ij as well as to output the variation .DELTA.Wij. In this correlation logic, it is necessary to count the number of times both the signals Si and Sj are "1" and to average this count over the number of repetitions of the actually performed simulated annealing. In order to form the correlation logic, therefore, the amount of hardware is increased and the calculation time is also increased.
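The quantity the correlation logic accumulates can be described compactly: P.sup.+ ij (or P.sup.- ij) is the fraction of sampled network states in which both Si and Sj are "1". An illustrative sketch of that count-and-average operation:

```python
def coactivation_probability(states_i, states_j):
    """Estimate P_ij from equal-length sample sequences of the outputs of
    neurons i and j, taken over repeated simulated-annealing runs: count
    the samples where both are '1' and divide by the number of samples."""
    hits = sum(1 for si, sj in zip(states_i, states_j) if si == 1 and sj == 1)
    return hits / len(states_i)
```

In hardware this is a counter per synapse plus a division by the repetition count, which is exactly the hardware and time cost the text describes.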
Further, a multibit operation is required for evaluating P.sup.+ ij-P.sup.- ij, and this multibit operation is performed through up/down counter logic. Thus, the bit number of the counter circuit is increased and the circuit structure for calculating .DELTA.Wij is enlarged, causing difficulty in high integration of the neural network.
In the conventional Boltzmann machine, it is necessary to perform the plus and minus phase operations separately in order to obtain the synapse load variation .DELTA.Wij. In the plus phase operation, the educator signal is supplied also to the output neurons, which are clamped in correspondence to the educator data. In the minus phase operation, on the other hand, the output neurons receive no educator data but remain in a free state. In learning of the neural network, therefore, it is necessary to make the educator signal valid or invalid in accordance with each phase, while the definition of the attribute of each neuron (hidden neuron or visible (input/output) neuron) must be easily changeable when the neural network is extended. Thus, the attribute of each neuron must be arbitrarily settable, and a structure for easily transferring the educator signal to a neuron set as visible is required in order to implement an extended arrangement of the neural network.
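The plus/minus phase procedure described above can be sketched as follows for a single synapse (i, j). The function names and the sampling interface are illustrative assumptions; only the clamped-versus-free structure and the .DELTA.Wij formula come from the text:

```python
def learning_cycle(sample_states, educator_data, eta=0.1):
    """One plus/minus learning cycle for a single synapse (i, j).
    sample_states(clamp=...) is any sampler returning (Si, Sj) pairs:
    in the plus phase the visible (output) neurons are clamped to the
    educator data; in the minus phase clamp=None leaves them free."""
    plus = sample_states(clamp=educator_data)   # educator signal valid
    minus = sample_states(clamp=None)           # output neurons free
    p_plus = sum(si * sj for si, sj in plus) / len(plus)
    p_minus = sum(si * sj for si, sj in minus) / len(minus)
    return eta * (p_plus - p_minus)             # .DELTA.Wij
```

The only difference between the two phases is whether the clamp is applied, which is why the hardware must be able to switch the educator signal on and off per phase and per neuron attribute.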