1. Field of the Invention
The present invention relates to a neuro-chip for use in a neuro-computer or the like, and more particularly to a digital neuro-chip which helps to build an artificial neural network.
2. Description of the Related Art
Generally, neural networks are classified into two types: the feedforward type, represented by the multilayer perceptron model, and the feedback type, represented by the Hopfield model.
The first type, i.e., the multilayer perceptron model will be described, with reference to FIG. 1. FIG. 1 shows how the neurons of the three layers of the multilayer perceptron model are connected. The first layer has m neurons; the second layer has n neurons, and the third layer has neurons. The mark "o" in FIG. 1 indicates each neuron.
The output of the i-th neuron of the second layer, for example, can be defined as follows:

$$I^2_i = \sum_{j=1}^{m} W_{ij} \cdot X^1_j \qquad (1)$$

$$X^2_i = f(I^2_i) \qquad (2)$$

where
$I^2_i$: interim output of the i-th neuron of the second layer
$W_{ij}$: synaptic weight between the j-th neuron of the first layer and the i-th neuron of the second layer
$X^1_j$: output of the j-th neuron of the first layer
$f(\,)$: a function transformation such as a sigmoid function
$X^2_i$: output of the i-th neuron of the second layer

As can be understood from equation (1), a number of sum-of-product operations must be performed to find the interim output of each neuron. Further, as is evident from equation (2), the function f must be applied to the interim output, i.e., the sum of products thus obtained, in order to determine the output of the neuron of the second layer.

In the Hopfield model, too, the output of each neuron is determined in the same way as in the multilayer perceptron model.

In recent years, various methods have been studied which may enable the neuro-chips, incorporated in a multilayer perceptron model, to learn synaptic weights (hereinafter referred to as "weights"). The best known of these methods is the backpropagation method. In this method, the neuron error made in the lowermost layer due to the output of a specified neuron is determined from the weights of all other neurons located above the specified one. Then, the weight $W_{ij}$ of the specified neuron is changed in accordance with the neuron error thus determined. The value $\Delta W_{ij}$, by which the weight $W_{ij}$ is changed, is obtained as follows:

$$\Delta W_{ij} = \eta \cdot d^2_i \cdot X^1_j + \alpha \cdot W_{ij} \qquad (3)$$

The neuron error $d^2_j$ made in the second layer is defined as follows, if this layer is a hidden layer:

$$d^2_j = \Bigl(\sum_{i} W_{ij} \cdot d^3_i\Bigr) \cdot f'(X^2_j) \qquad (4)$$

The neuron error $d^2_j$ made in the second layer is expressed as follows, if the layer is the output layer:

$$d^2_j = (y_j - X^2_j) \cdot f'(X^2_j) \qquad (5)$$

In equations (3) to (5):
$d^2_j$: neuron error of neuron j in the second layer
$d^3_j$: neuron error of neuron j in the third layer
$y_j$: teaching signal for neuron j
$f'(\,)$: differential of $f(\,)$
$\eta$, $\alpha$: coefficients
In equation (4), $W_{ij}$ denotes the weight between the j-th neuron of the second layer and the i-th neuron of the third layer.

The greater part of the neuron-processing time is spent obtaining the sums of products defined by equations (1) and (4). To achieve high-speed neuron processing, it is of vital importance to compute these sums of products at high speed. In equation (1) the index j changes, whereas in equation (4) the index i changes. Hence, when separate operation circuits are used to perform the calculations of equations (1) and (4), care must be taken to shorten the calculation time.

The neuron processing described above can be accomplished by one of the following alternative methods:
A. Simulation of a unit neuron by means of an analog circuit
B. Simulation of a unit neuron by means of a digital circuit
C. Solution of the above-mentioned equations by means of a computer, a microcomputer, a general-purpose DSP (Digital Signal Processor), or the like
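Whichever method is chosen, the arithmetic to be carried out is that of equations (1) to (5). The following is a minimal sketch of that arithmetic in plain Python, not the circuitry of any of the methods. The function names, the toy weights, and the use of a sigmoid for f are illustrative assumptions. The sketch further assumes that f' is evaluated from the neuron output as X(1 - X), which holds for a sigmoid, and it implements the second term of equation (3) literally as printed.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(W, x1):
    """Equations (1) and (2): interim outputs I2 and outputs X2 of layer 2."""
    I2 = [sum(w_ij * x_j for w_ij, x_j in zip(row, x1)) for row in W]
    X2 = [sigmoid(v) for v in I2]
    return I2, X2

def output_error(y, X2):
    """Equation (5), assuming f'(X) = X * (1 - X) for a sigmoid output X."""
    return [(y_j - x_j) * x_j * (1.0 - x_j) for y_j, x_j in zip(y, X2)]

def hidden_error(W_next, d3, X2):
    """Equation (4): W_next[i][j] links neuron j of layer 2 to neuron i of layer 3."""
    return [
        sum(W_next[i][j] * d3[i] for i in range(len(d3))) * X2[j] * (1.0 - X2[j])
        for j in range(len(X2))
    ]

def weight_change(W, d2, x1, eta, alpha):
    """Equation (3) as printed: eta * d2_i * X1_j + alpha * W_ij."""
    return [
        [eta * d2[i] * x1[j] + alpha * W[i][j] for j in range(len(x1))]
        for i in range(len(d2))
    ]

# Toy network (hypothetical values): m = 2 first-layer neurons, n = 3 second-layer neurons.
W = [[0.5, -0.5], [1.0, 1.0], [0.0, 2.0]]
x1 = [1.0, 0.5]
I2, X2 = forward(W, x1)
d2 = output_error([1.0, 0.0, 1.0], X2)   # second layer treated as the output layer
dW = weight_change(W, d2, x1, eta=0.1, alpha=0.0)
print(I2)
print(dW)
```

Note that `forward` iterates over j for each i, while `hidden_error` iterates over i for each j, which is the difference in index order between equations (1) and (4) pointed out above.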
FIG. 2 is a schematic representation of an analog circuit for carrying out method A. This circuit comprises resistors 1 to 4, an operation circuit 4, and a sigmoid function generator 5. In this circuit, the resistances of resistors 1 to 4 determine the weights of neurons 1 to 4. Because the weights are fixed by resistance values, it is not easy to change them.
FIG. 3 is a block diagram showing a digital circuit designed to perform method B. The digital circuit comprises digital multipliers 6, digital adders 7 and 8, and a sigmoid function generator 9, and is designed to produce the output of a unit neuron. Since this is a digital circuit, it is easy to change the weights of the neurons.
Analog elements simulate the unit neuron shown in FIG. 2, and digital elements simulate the unit neuron illustrated in FIG. 3. Either unit neuron must be connected to others by signal lines. An enormous number of signal lines is required to build a neural network having a large number of neurons, and in that case it is far from easy to alter the neuron connections. Neither the circuit of FIG. 2 nor the circuit of FIG. 3 can be made compact by reducing the number of components: the fewer the components, the lower the calculation accuracy of the neuron and the poorer its learning ability. Inevitably, either the neuron of FIG. 2 or the neuron of FIG. 3 occupies a relatively large area, and a neural network formed of a great number of neurons of either type will occupy a large area. In other words, it is difficult to constitute a neural network which has a sufficiently large number of neurons and which is sufficiently compact.
The circuits shown in FIGS. 2 and 3 lack the flexibility and learning ability required of them. Hence, it is method C which has been employed in most neural networks to accomplish neuron processing. Several general-purpose DSPs are used in method C. FIG. 4 is a block diagram illustrating an apparatus having four DSPs 11 to 14. The apparatus further comprises a host computer 17, a dual port memory 18, four program memories 27 to 30, four dual port memories 31 to 34, and four data memories 35 to 38.
The host computer 17 supplies initial data and the like to the dual port memory 18 and DSP programs to the program memories 27 to 30, in response to an address signal 15 and a data signal 16. The dual port memory 18 supplies initial data (e.g., weight data) to, and receives the data from, the host computer 17. The program memories 27 to 30 store the DSP programs supplied from the host computer 17, and supply these programs to the DSPs 11 to 14 in response to address signals 19 to 22 and data signals 23 to 26, all supplied from the DSPs 11 to 14. The dual port memories 31 to 34 and the data memories 35 to 38 are used to transfer data between any two adjacent DSPs. Each of the DSPs 11 to 14 reads a program from the associated program memory in response to an address signal and a data signal, and processes digital signals independently of any other DSP. Each DSP also receives data, which is required for processing the program data, from the dual port memory 18 and the associated dual port memory 31, 32, 33, or 34, designates an address signal 41, 42, 43, or 44 and a data signal 45, 46, 47, or 48, and writes data into the associated data memory 35, 36, 37, or 38.
The operation of the apparatus shown in FIG. 4 will now be explained. First, the DSPs 11 to 14 receive program data and initial data (e.g., weight data) from the host computer 17. At the same time, part of the data from the host computer 17 is stored into the dual port memory 18. Each DSP reads the data stored in the upper dual port memory and writes it into the lower dual port memory. In other words, it supplies the data to the next DSP, as if passing a baton. The data items which the DSPs 11 to 14 need are stored into the data memories 35 to 38 coupled to the DSPs 11 to 14.
Then, the DSPs 11 to 14 perform the operations defined by equations (1) to (5). In order to use the memory capacities of the data memories 35 to 38 efficiently, the same data is not stored in two or more memories. The weight data and the neuron data are divided into items, and these data items are stored into the memories 35 to 38. How the DSPs 11 to 14 perform the operation of equation (1) will now be described.
As soon as the host computer 17 supplies the dual port memory 18 with the data input to the neural network, the DSP 11 reads the input data from the dual port memory 18 and supplies it to the next DSP 12 via the dual port memory 31. The DSP 12 supplies the input data to the next DSP 13 via the dual port memory 32. The DSP 13 supplies the input data to the next DSP 14 via the dual port memory 33. The DSP 14 supplies the input data to the DSP 11 via the dual port memory 34. More specifically, the DSPs 11 to 14 obtain a sum of products, i.e., an interim neuron output, in the following manner.
The DSP 11 obtains the first quarter of the interim output of a neuron, $I^2_i/4$, and writes the data of $I^2_i/4$ into the dual port memory 31, so that this data can be transferred to the next DSP 12. The DSP 12 obtains the second quarter of the interim neuron output, reads the first quarter of the interim neuron output from the dual port memory 31, and writes the first and second quarters into the dual port memory 32, so that these data can be transferred to the next DSP 13. The DSP 13 obtains the third quarter of the interim neuron output, reads the first and second quarters of the interim neuron output from the dual port memory 32, and writes these data of the interim neuron output into the dual port memory 33, so that these data can be transferred to the next DSP 14. The DSP 14 obtains the last quarter of the interim neuron output, reads the first to third quarters of the interim neuron output from the dual port memory 33, and writes these data of the interim neuron output into the dual port memory 34.
Thus obtained is the interim output of one neuron. Since the second layer of the multilayer perceptron model has n neurons, each DSP obtains sums of products for n/4 neurons at the same time. If m=2000 and n=1000, the DSP 11 provides 250 sums of products, $I^2_1$ to $I^2_{250}$; the DSP 12 obtains 250 sums of products, $I^2_{251}$ to $I^2_{500}$; the DSP 13 produces 250 sums of products, $I^2_{501}$ to $I^2_{750}$; and the DSP 14 obtains 250 sums of products, $I^2_{751}$ to $I^2_{1000}$. The sums of products $I^2_1$ to $I^2_{250}$ are stored into the dual port memory 31; the sums of products $I^2_{251}$ to $I^2_{500}$ are stored into the dual port memory 32; the sums of products $I^2_{501}$ to $I^2_{750}$ are stored into the dual port memory 33; and the sums of products $I^2_{751}$ to $I^2_{1000}$ are stored into the dual port memory 34. The DSP 11 processes neuron outputs $X^1_1$ to $X^1_{500}$; the DSP 12 processes neuron outputs $X^1_{501}$ to $X^1_{1000}$; the DSP 13 processes neuron outputs $X^1_{1001}$ to $X^1_{1500}$; and the DSP 14 processes neuron outputs $X^1_{1501}$ to $X^1_{2000}$. Each DSP performs operations to acquire n/4 sums of products.

Next, the DSPs 11 to 14 read the interim neuron outputs from the dual port memories 31 to 34. The DSP 11 obtains, for the second time, 250 sums of products, $I^2_{751}$ to $I^2_{1000}$; the DSP 12 obtains 250 sums of products, $I^2_1$ to $I^2_{250}$; the DSP 13 produces 250 sums of products, $I^2_{251}$ to $I^2_{500}$; and the DSP 14 obtains 250 sums of products, $I^2_{501}$ to $I^2_{750}$. At this time, too, the DSPs 11, 12, 13, and 14 process neuron outputs $X^1_1$ to $X^1_{500}$, neuron outputs $X^1_{501}$ to $X^1_{1000}$, neuron outputs $X^1_{1001}$ to $X^1_{1500}$, and neuron outputs $X^1_{1501}$ to $X^1_{2000}$, respectively. Further, the DSPs 11 to 14 read the interim neuron outputs from the dual port memories 31 to 34, respectively, and obtain, for the third time, 250 sums of products each in a similar way. Finally, the DSPs 11 to 14 read the interim neuron outputs from the dual port memories 31 to 34, respectively, and obtain, for the fourth time, 250 sums of products each in a similar way. As a result, interim neuron outputs $I^2_1$ to $I^2_{250}$, interim neuron outputs $I^2_{251}$ to $I^2_{500}$, interim neuron outputs $I^2_{501}$ to $I^2_{750}$, and interim neuron outputs $I^2_{751}$ to $I^2_{1000}$ are stored into the dual port memories 34, 31, 32, and 33, respectively.
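The rotation of output blocks among the DSPs over the four passes can be tabulated with a short script. This is only an illustrative sketch: the function name `ring_schedule` and the 0-based DSP index (0 to 3 standing for the DSPs 11 to 14) are assumptions introduced here, and n is assumed to be divisible by the number of DSPs.

```python
def ring_schedule(n_outputs, n_dsps):
    """For each pass, list the range of interim outputs each DSP accumulates.

    DSP k (0-based) starts on block k and, on every later pass, takes over
    the block previously held by its predecessor in the ring.
    """
    block = n_outputs // n_dsps
    passes = []
    for p in range(n_dsps):
        row = []
        for k in range(n_dsps):
            b = (k - p) % n_dsps  # block handled by DSP k on pass p
            row.append((b * block + 1, (b + 1) * block))
        passes.append(row)
    return passes

schedule = ring_schedule(n_outputs=1000, n_dsps=4)
for p, row in enumerate(schedule, start=1):
    print("pass", p, row)
```

In the last row, the fourth entry (DSP 14, which writes into the dual port memory 34) is the range 1 to 250, and the first entry (DSP 11, which writes into the dual port memory 31) is the range 251 to 500, in agreement with the final storage locations stated above.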
Then, the function transform of equation (2) is performed on the interim neuron outputs thus obtained, thereby obtaining the neuron outputs $X^2_i$.
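That the four-pass accumulation yields the same interim outputs as a direct evaluation of equation (1) can be checked numerically. The sketch below is an assumption-laden model rather than a DSP program: the function names, the small integer test data, and the sigmoid chosen for f are all hypothetical, and both the number of inputs and the number of outputs are assumed to be divisible by the number of processors.

```python
import math

def direct_interim(W, x):
    """Equation (1) computed directly: one full sum of products per output."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def ring_interim(W, x, n_dsps):
    """Same result, accumulated in n_dsps passes around a ring of processors.

    Processor k only ever uses its own slice of the inputs; the partial sums
    of each output block move on to the next processor after every pass.
    """
    n, m = len(W), len(x)
    out_blk, in_blk = n // n_dsps, m // n_dsps
    partial = [0.0] * n
    for p in range(n_dsps):
        for k in range(n_dsps):
            b = (k - p) % n_dsps                        # output block held by processor k
            cols = range(k * in_blk, (k + 1) * in_blk)  # inputs owned by processor k
            for i in range(b * out_blk, (b + 1) * out_blk):
                partial[i] += sum(W[i][j] * x[j] for j in cols)
    return partial

# Small integer data (hypothetical) so that both routes give exactly equal sums.
n, m, P = 8, 12, 4
W = [[(i * m + j) % 5 - 2 for j in range(m)] for i in range(n)]
x = [(j % 3) - 1 for j in range(m)]

I_ring = ring_interim(W, x, P)
I_direct = direct_interim(W, x)
X2 = [1.0 / (1.0 + math.exp(-v)) for v in I_ring]  # equation (2) with a sigmoid
print(I_ring == I_direct)
```

Each output block visits every processor exactly once, so every input slice contributes to every sum exactly once.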
It has been pointed out, however, that the method C has the following drawbacks.
1. Since only one dual port memory 18 is used to transfer data between the host computer 17 and the DSPs, in order to limit the hardware, the input neuron data must be relayed to the DSPs via the dual port memories 31 to 33. Since the data reaches the DSP 14 last, it takes rather a long time to obtain neuron outputs. If the dual port memories have a small capacity, handshaking must be performed during data transfer, which further increases the time required for acquiring neuron outputs. Further, since the DSPs operate simultaneously to obtain sums of products, no DSP can start obtaining the next sum of products until the input data has been transferred to all DSPs. Hence, the time required for transferring the input data is an overhead on the neural network.
2. Although the programs for the general-purpose DSPs 11 to 14 are very similar, separate program memories 27 to 30 must be used for storing these programs. Consequently, lines are needed for supplying address signals and data signals from the DSPs to the program memories, inevitably increasing the hardware. Further, each program memory must have a circuit which distinguishes address signals and data signals supplied from the DSP from those supplied from the host computer 17.
3. The human brain has an enormously large number of neurons. A neural network having a comparable number of artificial neurons must have far more than the four DSPs of the apparatus of FIG. 4 in order to increase the processing speed. The greater the number of DSPs, the larger the number of program memories and dual port memories required. Also, as has been pointed out in paragraph 1, much time is required to transfer input data. Hence, it is practically difficult to build a large-scale neural network having a large number of DSPs.
4. Although the DSPs 11 to 14 perform almost the same operation to obtain a sum of products, they carry out operations different from one another while the input data is being transferred. Hence, the DSPs need to execute different programs, and each DSP must receive and output data at the same timing as every other DSP. Since the programs (i.e., parallel processing programs) for the DSPs 11 to 14 are different, it takes much time and labor to prepare these programs. The high cost of preparing DSP programs makes it difficult to incorporate more DSPs in the neural network.
5. It is desirable that a neural network, in particular a high-speed one, be made in the form of an LSI chip. Since the neural network of the type shown in FIG. 4 requires much hardware, it can hardly be manufactured in the form of an LSI chip. The fact that each general-purpose DSP has circuits not necessary for the neural network, for example an interrupt circuit, makes it even more difficult to provide a neural-network LSI chip.
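Drawbacks 1 and 3 above can be illustrated with a toy cost model. It is a hedged sketch only: the function `pass_time` and the time units `t_hop` and `t_compute` are hypothetical and are not measurements of the apparatus of FIG. 4. The model assumes that the input data reaches the last DSP after one relay step per intermediate dual port memory, and that the sum-of-products work is divided evenly among the DSPs.

```python
def pass_time(n_dsps, t_hop, t_compute):
    """Toy model: relay the input along the chain, then compute in parallel.

    t_hop and t_compute are hypothetical time units, not measured values.
    """
    transfer = (n_dsps - 1) * t_hop  # the last DSP is reached after n_dsps - 1 relays
    compute = t_compute / n_dsps     # sum-of-products work split evenly
    return transfer + compute

times = {p: pass_time(p, t_hop=1.0, t_compute=64.0) for p in (1, 2, 4, 8, 16, 32)}
best = min(times, key=times.get)
for p, t in times.items():
    print(p, t)
print("fastest configuration in this model:", best, "DSPs")
```

Under these assumed numbers the transfer term grows linearly with the number of DSPs while the compute term shrinks, so adding DSPs beyond a certain point lengthens the total time instead of shortening it.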
As has been described, the conventional method C has too many drawbacks to be used in building an economical, high-speed, large-scale neural network. Hence, there is a great demand for a new neuro-chip of the type which helps to manufacture a high-speed, large-scale neural network at sufficiently low cost.