1. Field of the Invention
The present invention relates to a parallel processing semiconductor integrated circuit device having a parallel operational function and a system employing the same, and more particularly, it relates to an integrated circuit device which can simulate a neural network and a system employing the same.
2. Description of the Background Art
There have been proposed various computational techniques which are modelled on vital nerve cells (neurons). In a data processing technique which is modelled on neurons, a neural network including a plurality of neurons is employed. In such data processing employing a neural network, data are processed asynchronously and in parallel among the respective neurons upon supply of input data. Upon supply of certain input data, the neural network generates such output data that the energy of the overall neural network system takes the minimum value. A computational technique employing such a neural network, which requires no algorithm for solving problems, is well-suited to solution of non-algorithmic problems such as pattern recognition and associative storage.
FIG. 28 illustrates a general neuron model. Referring to FIG. 28, a neuron unit Yj is coupled with four neuron units Y0, Y1, Y2 and Y3. This neuron unit Yj includes synapse coupling parts SY0, SY1, SY2 and SY3 which receive output signals y0, y1, y2 and y3 from the neuron units Y0, Y1, Y2 and Y3 respectively, a conversion part CV which receives outputs from the synapse coupling parts SY0 to SY3 to carry out prescribed operational processing, and an output part OT which receives an output from the conversion part CV and carries out further conversion for generating an output signal yj.
The synapse coupling parts SY0 to SY3 have prescribed weights (synapse loads) with respect to the neuron units Y0 to Y3 respectively. Namely, the respective synapse coupling parts SY0 to SY3 weight the output signals y0, y1, y2 and y3 with weighting factors Wj0, Wj1, Wj2 and Wj3 respectively, to transmit the weighted signals to the conversion part CV. For example, the output signal y0 from the neuron unit Y0 is converted to a signal Wj0·y0 by the synapse coupling part SY0, to be then transmitted to the conversion part CV. The synapse loads Wj0 to Wj3 of the synapse coupling parts SY0 to SY3 indicate coupling strength levels between the neuron units Y0 to Y3 and the neuron unit Yj respectively. These loads Wj0 to Wj3 take positive values in the case of excitatory coupling, and negative values in the case of inhibitory coupling.
The conversion part CV takes the sum of the signals received from the synapse coupling parts SY0 to SY3. The output part OT determines whether or not the sum received from the conversion part CV satisfies a certain condition. When the certain condition is satisfied, the output part OT fires to transmit the signal yj to an output signal line. As to correspondence to a vital brain cell, the synapse coupling parts SY of this neuron unit model correspond to dendrites and synapses, the conversion part CV and the output part OT correspond to a nerve cell body, and the output signal line corresponds to an axon.
In an electronic model, the signals y0 to y3 are expressed in numerical values within a range of 0 to 1 respectively. A neuron unit is in a firing state when its value is 1 or close to 1, and in a non-firing state when its value is zero or close to zero. Each neuron unit updates its state (value of its output signal) in accordance with the input. The sum uj obtained by the conversion part CV is defined as follows:
$$u_j = \sum_i W_{ji}\, y_i + W_{jj}$$
The summation is carried out in relation to the subscript i. Wjj corresponds to the threshold value of the neuron unit Yj. This value is generally set at zero, as shown in FIG. 28.
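As a minimal sketch of the sum uj defined above, the following Python fragment computes the weighted sum for one neuron unit. The function name `weighted_sum` and all numerical values are illustrative assumptions and do not come from the model of FIG. 28.

```python
def weighted_sum(weights, inputs, w_jj=0.0):
    """Return u_j = sum_i W_ji * y_i + W_jj for one neuron unit."""
    if len(weights) != len(inputs):
        raise ValueError("one synapse load is required per input signal")
    return sum(w * y for w, y in zip(weights, inputs)) + w_jj


# Hypothetical synapse loads Wj0..Wj3 (a negative value models inhibitory coupling)
w_j = [0.5, -1.0, 0.25, 2.0]
# Hypothetical output signals y0..y3 of the coupled neuron units
y = [1.0, 1.0, 0.0, 0.5]

print(weighted_sum(w_j, y))  # 0.5
```

With Wjj left at its default of zero, the result depends only on the weighted inputs, which matches the general setting mentioned above.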
The output part OT decides its state in accordance with the sum uj. When the sum uj is in excess of a certain threshold value, the neuron unit Yj fires and its output yj reaches 1 or a value approximate to 1. When the sum uj is below the threshold value, on the other hand, the neuron unit Yj enters a non-firing state and the output signal yj reaches zero or a value approximate to zero. In order to decide this state, the output part OT executes the following operation:
$$y_j = f(u_j)$$
A monotonically increasing nonlinear conversion function is employed for the function f(uj) which is applied to the conversion from the sum uj to the output signal yj. A well-known example of such a monotonically increasing nonlinear conversion function is the sigmoid function shown in FIG. 29, which is expressed as follows:
$$y_j = \frac{1}{1 + \exp(-A(u_j - B))}$$
where B represents a threshold value and A represents a value showing the width of a transient region. When the value A is increased, the width of the transient region is reduced so that the function approaches a step function. Referring to FIG. 29, the abscissa shows the sum uj and the ordinate shows the output signal yj.
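The effect of the parameters A and B described above can be checked numerically. The following is a minimal sketch; the function name `sigmoid`, the chosen threshold B=0.5 and the sample points are assumptions made only for illustration.

```python
import math


def sigmoid(u_j, a=1.0, b=0.0):
    """y_j = 1 / (1 + exp(-A * (u_j - B))).

    No overflow guard is included, so extreme arguments are not handled.
    """
    return 1.0 / (1.0 + math.exp(-a * (u_j - b)))


# Hypothetical sweep: the same three sums evaluated with increasing A.
# As A grows, outputs on either side of B move toward 0 and 1 (step-like).
for a in (1.0, 10.0, 100.0):
    outputs = [round(sigmoid(u, a=a, b=0.5), 3) for u in (0.3, 0.5, 0.7)]
    print(a, outputs)
```

At uj equal to B the output is 0.5 regardless of A, while points on either side of B are pushed toward 0 or 1 as A increases.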
In general, a neuron is rarely employed independently. A neural network is implemented by mutually or hierarchically coupling two or more neuron units. Various proposals have been made as to a semiconductor integrated circuit (hereinafter simply referred to as a neurochip) which can express such a neural network.
FIG. 30 shows the structure of a conventional neurochip. The neurochip shown in FIG. 30 is structured on the basis of a technique which is described in IJCNN (International Joint Conference on Neural Networks), 1990, II, pp. 527 to 535 and 537 to 544, for example.
Referring to FIG. 30, the neurochip includes four neuron units 6a0, 6a1, 6a2 and 6a3. Each of the neuron units 6a0 to 6a3 includes a weight memory 1 for holding a synapse load value (value indicating strength of synapse coupling), a digital multiplier (MPY) 2 for multiplying a neuron state value received through a data bus 7 by the output value of the weight memory 1, an accumulator (ACC) 3 provided with a reset function for cumulatively adding up outputs from the multiplier 2, a nonlinear processor 4 for nonlinearly converting an output of the accumulator 3, and a bus driver 5 which is activated in response to one of control signals EN0 to EN3 to transmit an output of the nonlinear processor 4 to the data bus 7. The control signals EN0 to EN3 drive the respective bus drivers 5 of the neuron units 6a0 to 6a3. Therefore, the output of the nonlinear processor 4 is transmitted to the data bus 7 from one of the neuron units 6a0 to 6a3 at a time. The neuron units 6a0 to 6a3 are integrated on a single semiconductor chip, and a 4-bit address A<3;0> is supplied in common to the weight memories 1 of the neuron units 6a0 to 6a3. Further, a reset signal RESET is supplied in common to the accumulators 3 of the respective neuron units 6a0 to 6a3. The data bus 7 is coupled to eight data input/output pins D<7;0>.
In the structure shown in FIG. 30, each weight memory 1 has a structure of 16 words by 8 bits, while each multiplier 2 is a signed integer multiplier for carrying out multiplication of 8 by 8 bits and rounding the output to 12 bits. The output of each accumulator 3 is extended to 16 bits. Each nonlinear processor 4 compresses the 16-bit signal received from the accumulator 3 to 8 bits, in correspondence to the bus width of the data bus 7. The operation of the neurochip shown in FIG. 30 is now described with reference to simulation of the neural network shown in FIG. 31.
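Before turning to that network, the bit widths given above can be made concrete with a rough behavioral sketch of one neuron unit. The text specifies only the widths (8-bit weights and states, a 12-bit rounded product, a 16-bit accumulator and an 8-bit output), so the rounding rule, the saturating accumulator and the stand-in nonlinear conversion below are all assumptions, as are the names `clamp` and `NeuronUnit`.

```python
def clamp(value, bits):
    """Saturate an integer to the signed range of the given bit width."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(low, min(high, value))


class NeuronUnit:
    """Behavioral sketch of one neuron unit (weight memory, MPY, ACC, nonlinear)."""

    def __init__(self, weights):
        if len(weights) != 16:
            raise ValueError("weight memory 1 holds 16 words")
        self.weight_memory = [clamp(w, 8) for w in weights]  # 16 words by 8 bits
        self.accumulator = 0

    def reset(self):
        """RESET pulse: clear the accumulator."""
        self.accumulator = 0

    def multiply_accumulate(self, address, state):
        product = self.weight_memory[address] * clamp(state, 8)  # 8 by 8 bits
        # ASSUMPTION: round to nearest by dropping the 4 least significant bits.
        rounded = clamp((product + 8) >> 4, 12)
        # ASSUMPTION: the 16-bit accumulator saturates instead of wrapping.
        self.accumulator = clamp(self.accumulator + rounded, 16)

    def output(self):
        # ASSUMPTION: a hard limiter stands in for the nonlinear processor 4,
        # compressing the 16-bit accumulator value to 8 bits.
        return clamp(self.accumulator, 8)


unit = NeuronUnit([16] * 16)          # hypothetical weights
unit.multiply_accumulate(0, 127)      # 16 * 127 = 2032, rounded to 127
print(unit.output())                  # 127
```

A real implementation would replace the hard limiter with the actual conversion carried out by the nonlinear processor, which the text does not specify.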
Referring to FIG. 31, the neural network has a three-layer structure of an input layer, an intermediate layer and an output layer. The input layer includes four neurons Ya0, Ya1, Ya2 and Ya3. The intermediate layer includes eight neurons Yb0, Yb1, Yb2, Yb3, Yb4, Yb5, Yb6 and Yb7. The output layer includes two neurons Yc0 and Yc1.
This neural network has a feed forward connection structure, so that signals are transmitted from the input layer to the output layer through the intermediate layer. There are provided 32 (4 by 8) synapse couples in the intermediate layer and 16 (8 by 2) synapse couples in the output layer, or 48 synapse couples in total. The operation is executed in the following order. In the following description, the reset signal RESET and the control signals EN0 to EN3 are in inactive states of logic "0" unless otherwise stated.
(1) The reset signal RESET is generated in the form of a one-shot pulse (entering a state of "1" for a prescribed period), to reset contents of the accumulators 3 to zero in the neuron units 6a0 to 6a3.
(2) The data pins D<7;0> are supplied with a value ya0 expressing the state of the input neuron Ya0 as an 8-bit signed binary number.
(3) Address pins A<3;0> are supplied with an address (i=0) in parallel with the aforementioned input of the state signal. The weight memories 1 of the neuron units 6ak (k=0 to 3) output synapse load values Wbjai (j=k=0 to 3, i=0) respectively. Namely, synapse load values Wb0a0, Wb1a0, Wb2a0 and Wb3a0 are outputted.
(4) In the respective neuron units 6a0 to 6a3, the multipliers 2 calculate the products Wbjai·yai (j=0 to 3, i=0) of the outputs Wbjai of the weight memories 1 and yai.
Coupling between each of the neurons Yb0 to Yb3 and the neuron Ya0 is expressed by the aforementioned operations.
(5) In the respective neuron units 6a0 to 6a3, the accumulators 3 add the results of multiplication received from the multipliers 2 to the holding values (in reset states of zero) thereof, and hold the results.
(6) The aforementioned operations (2) to (5) are further repeated three times (four times in total). The number i is incremented one by one every repetition as 1, 2, 3. The address supplied to the address input terminals A<3;0> is also incremented one by one in a similar manner.
As the result, the accumulators 3 of the neuron units 6a0 to 6a3 hold the following values:
$$u_{b_j} = \sum_{i=0}^{3} W_{b_j a_i}\, y_{a_i}$$
The summation is carried out over i=0 to 3 for each of j=0 to 3, where j indexes the intermediate layer neuron units.
(7) In the respective neuron units 6a0 to 6a3, the nonlinear processors 4 nonlinearly convert the values ubj which are held in the accumulators 3. Thus, states ybj (=f(ubj)) of neurons Ybj are obtained, where j=0 to 3.
(8) The control signals EN0, EN1, EN2 and EN3 are sequentially driven to "1", for enabling the bus drivers 5 of the neuron units 6a0 to 6a3 in this order. The states ybj of the neuron units are transmitted onto the data bus 7 in the order of yb0, yb1, yb2 and yb3.
The state signals ybj on the data bus 7 are stored in a memory device (not shown) provided in the exterior of the chip through the data input/output pins D<7;0>.
The processing for the intermediate layer neurons Yb0 to Yb3 shown in FIG. 31 is completed by the aforementioned operations. Processing for the remaining intermediate layer neurons Yb4 to Yb7 is executed similarly to the above, as follows:
(9) The aforementioned operations (1) to (8) are repeated with replacement of j=k=0 to 3 by j=k+4=4, 5, 6, 7.
Thus, all output states of the intermediate layer neurons Yb0 to Yb7 are obtained. Processing as to the output neurons Yc0 and Yc1 is now described.
(10) A one-shot pulse of logic "1" is applied to the reset pin RESET, to reset the contents held in the accumulators 3 of the neuron units 6a0 to 6a3 to zero.
(11) The data input/output pins D<7;0> are supplied with the state ybi (i=0) of the intermediate layer neuron Ybi (i=0) from an external memory device (not shown).
(12) The address input pins A<3;0> are supplied with an address i+8 (i=0) in parallel with the above operation (11). The weight memories 1 of the neuron units 6ak (k=0 and 1) output synapse load values Wcjbi (j=k=0, 1, i=0) respectively. Namely, synapse load values Wc0b0 and Wc1b0 are outputted.
(13) In the neuron units 6a0 and 6a1, the multipliers 2 calculate the products Wcjbi·ybi (j=k=0, 1, i=0) of the synapse load values Wcjbi received from the weight memories 1 and the neuron state values ybi. Thus, values of coupling between the neurons Yc0 and Yc1 and the intermediate layer neuron Yb0 are obtained.
(14) In the neuron units 6a0 and 6a1, the accumulators 3 add the results of multiplication received from the multipliers 2 in the aforementioned operation (13) to the holding values thereof, and hold the results of such addition as new holding values.
(15) The aforementioned operations (11) to (14) are further repeated seven times (eight times in total). The number i is incremented one by one every repetition as 1, 2, 3, 4, 5, 6, 7.
Thus, the accumulators 3 of the neuron units 6a0 and 6a1 hold the following values:
$$u_{c_j} = \sum_{i=0}^{7} W_{c_j b_i}\, y_{b_i}, \qquad j = k = 0, 1$$
Thus, the sum of the inputs from the intermediate layer neurons Yb0 to Yb7 is obtained in each of the output layer neurons Yc0 and Yc1 shown in FIG. 31.
(16) In the neuron units 6a0 and 6a1, the nonlinear processors 4 nonlinearly convert the holding values ucj of the accumulators 3 to obtain states ycj (=f(ucj)) of the neurons Ycj, where j=k=0, 1.
The operation of the neural network shown in FIG. 31 is completed by the aforementioned operations. In order to utilize the as-obtained results, it is necessary to transmit the results of calculation to a host calculator or the like which is provided in the exterior of the neurochip. In this transmission, the following operation (17) is executed:
(17) Signals of logic "1" are successively applied to control input pins ENk (k=0, 1). Thus, the state data ycj (j=k=0, 1) are transmitted to an external device such as the host calculator from the bus drivers 5 of the neuron units 6a0 and 6a1 through the data bus 7 and the data input/output pins D<7;0>.
During the aforementioned operations (10) to (16), the neuron units 6a2 and 6a3 carry out meaningless operations with no regard to the final outputs. Such neuron units carrying out meaningless operations are hereinafter referred to as idling neuron units.
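The time-multiplexed procedure of operations (1) to (17) can be sketched in floating-point Python as follows. This is an illustration under stated assumptions, not a model of the actual circuit: the weight memory layout (addresses 0 to 3 for Yb0 to Yb3, 4 to 7 for Yb4 to Yb7, 8 to 15 for the output layer), the plain sigmoid used for f, and all function names and numerical values are hypothetical. Fixed-point bit widths are ignored.

```python
import math

NUM_UNITS = 4           # neuron units 6a0 to 6a3
WORDS_PER_MEMORY = 16   # each weight memory 1 holds 16 words


def f(u):
    """Stand-in nonlinear conversion (plain sigmoid, an assumption)."""
    return 1.0 / (1.0 + math.exp(-u))


def build_weight_memories(wb, wc):
    """Assumed layout: 0-3 -> Yb0..Yb3, 4-7 -> Yb4..Yb7, 8-15 -> Yc0, Yc1."""
    memories = [[0.0] * WORDS_PER_MEMORY for _ in range(NUM_UNITS)]
    for k in range(NUM_UNITS):
        for i in range(4):
            memories[k][i] = wb[k][i]
            memories[k][4 + i] = wb[k + 4][i]
        if k < len(wc):                      # units 6a2 and 6a3 keep zeros here
            for i in range(8):
                memories[k][8 + i] = wc[k][i]
    return memories


def run_pass(memories, base_address, states, active_units):
    """One reset, one multiply-accumulate per input state, then readout."""
    acc = [0.0] * NUM_UNITS                  # RESET pulse, operations (1)/(10)
    for i, state in enumerate(states):       # one state on the data bus per cycle
        for k in range(NUM_UNITS):           # every unit runs, even idling ones
            acc[k] += memories[k][base_address + i] * state
    # Only the first `active_units` bus drivers are enabled in turn.
    return [f(acc[k]) for k in range(active_units)]


def simulate(wb, wc, ya):
    memories = build_weight_memories(wb, wc)
    # Operations (1) to (9): two passes for the eight intermediate neurons.
    yb = run_pass(memories, 0, ya, 4) + run_pass(memories, 4, ya, 4)
    # Operations (10) to (17): one pass, units 6a2 and 6a3 idle.
    yc = run_pass(memories, 8, yb, 2)
    return yb, yc


# Hypothetical synapse loads and input states
wb = [[0.1 * (j - i) for i in range(4)] for j in range(8)]
wc = [[0.05 * (i + 1) * (-1) ** j for i in range(8)] for j in range(2)]
ya = [1.0, 0.0, 1.0, 0.5]

yb, yc = simulate(wb, wc, ya)
print(len(yb), len(yc))  # 8 2
```

In the output-layer pass all four units still perform the inner loop, but only two results are read out, which corresponds to the idling neuron units described above.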
In the aforementioned conventional neurochip, a plurality of neuron units operate in parallel with each other so that a plurality of operational processes can be executed in parallel. However, a single neuron unit can execute calculation of only a single synapse coupling at a time, such as that between the neurons Yb0 and Ya0, for example, and hence it is impossible to execute processing at a high speed.
Further, communication between the neurochip and the exterior as to data such as neuron states can be executed only through a single data bus and a single set of data input/output pins. Thus, the data transmission speed becomes a bottleneck for high-speed execution of the operational processing.
In addition, an idling neuron unit is caused when the number of subsequent stage neuron units receiving output states of common precedent stage neuron units is smaller than that of neuron units provided on the neurochip. When the number of neuron units provided on the neurochip is increased, therefore, it is difficult to reduce the processing time.
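The idling effect can be quantified with a rough cycle count. The sketch below counts only multiply-accumulate cycles and ignores reset, nonlinear conversion and readout, which is a simplifying assumption; the function name `schedule` is hypothetical.

```python
import math


def schedule(layer_sizes, num_units):
    """Return (passes, mac_cycles, utilization) for each layer transition."""
    rows = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        passes = math.ceil(n_out / num_units)     # groups of output neurons
        cycles = passes * n_in                    # one input state per cycle
        utilization = (n_in * n_out) / (cycles * num_units)
        rows.append((passes, cycles, utilization))
    return rows


# Network of FIG. 31 (4, 8 and 2 neurons) on chips with 4 and 8 neuron units
for units in (4, 8):
    rows = schedule([4, 8, 2], units)
    total = sum(cycles for _, cycles, _ in rows)
    print(units, rows, total)
# 4 [(2, 8, 1.0), (1, 8, 0.5)] 16
# 8 [(1, 4, 1.0), (1, 8, 0.25)] 12
```

Under these assumptions, doubling the number of neuron units from 4 to 8 reduces the count from 16 to 12 cycles rather than to 8, because the output layer keeps only two units busy.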
Further, in order to increase the degree of parallelism of the neurochip, it is necessary to connect this neurochip in parallel. Therefore, the internal loops of the respective processes, i.e., the respective neuron units, must be provided with equal numbers of not only multipliers and adders but also nonlinear processors, which are used rather infrequently, leading to an increase in chip size.