The present invention relates to a neural network which can be used, for example, for character recognition and, in particular, to a method of assigning initial values for learning the connection parameters of a neural network by utilizing statistical information on input data.
For example, as shown in FIG. 1, in a back propagation type neural network, the synapse elements of an input layer, an intermediate layer and an output layer are generally connected in a network form. The connection between synapse elements is represented by synapse parameters (a weight coefficient $W$ and a bias $\theta$). Adjusting these parameters to suitable values is called learning. If the network is used for character recognition, learning is achieved using learning data for character recognition; if it is used for picture image processing, learning is achieved using learning data for picture image processing. If such learning is sufficient to determine the parameters appropriately and the synapse parameters are set to the determined values, the neural network thereafter performs character recognition, picture image processing and the like in accordance with the learning.
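The layered structure described above can be sketched as a forward pass. This is only an illustrative sketch: the sigmoid output function, the convention that the bias is added to the weighted sum, and the names `sigmoid`, `layer_forward` and `network_forward` are assumptions made for the example and are not taken from the specification.

```python
import math


def sigmoid(u):
    # Assumed output function of a synapse element (not stated in the text).
    return 1.0 / (1.0 + math.exp(-u))


def layer_forward(inputs, weights, biases):
    # weights[i][j] connects input-side element i to output-side element j;
    # biases[j] is the bias of output-side element j (assumed to be added).
    outputs = []
    for j in range(len(biases)):
        net = sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        outputs.append(sigmoid(net + biases[j]))
    return outputs


def network_forward(x, w_hidden, theta_hidden, w_out, theta_out):
    # Input layer -> intermediate layer -> output layer.
    hidden = layer_forward(x, w_hidden, theta_hidden)
    return layer_forward(hidden, w_out, theta_out)
```

Under these assumptions, learning amounts to choosing `w_hidden`, `theta_hidden`, `w_out` and `theta_out` so that `network_forward` returns the desired outputs for the learning data.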
In a neural network, therefore, it is very important to set the above-mentioned synapse parameters both suitably and efficiently.
A neural network is generally trained by a back propagation method. In this method, the weight coefficient $W$ and the bias $\theta$ of the synapse elements are corrected step by step in the learning process, giving $W(n)$ and $\theta(n)$ at the n-th step, beginning with initial values $W(0)$ and $\theta(0)$. However, there are at present no theoretical grounds for preferring any particular set of initial values as the starting point for the learning of the weight coefficient $W$ and the bias $\theta$. Under the existing circumstances, a small random number is generally used for each element as the initial value.
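The conventional practice of starting from small random numbers might look like the following sketch. The uniform distribution, the range of plus or minus 0.1, and the name `init_small_random` are assumptions chosen for illustration; the text only says that a small random number is used for each element.

```python
import random


def init_small_random(n_in, n_out, scale=0.1, seed=None):
    # Returns (W0, theta0): initial weights and biases for one layer.
    # Each value is drawn uniformly from [-scale, scale]; both the
    # distribution and the default scale are illustrative assumptions.
    rng = random.Random(seed)
    w0 = [[rng.uniform(-scale, scale) for _ in range(n_out)]
          for _ in range(n_in)]
    theta0 = [rng.uniform(-scale, scale) for _ in range(n_out)]
    return w0, theta0
```

Because nothing in this procedure looks at the input data, two runs with different seeds start learning from unrelated points.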
The problems regarding a weight coefficient that arise in a conventional learning method will be explained in detail.
In a back propagation algorithm, a weight coefficient $w_{ij}$ between two synapse elements $i$, $j$ is successively corrected by using a correction amount $\Delta w_{ij}$. That is,
$$w_{ij}(n+1) = w_{ij}(n) + \Delta w_{ij} \qquad (1)$$
In the above equation, $i$ denotes an element on the input side and $j$ denotes an element on the output side. The correction amount $\Delta w_{ij}$ is calculated from the following equation.
$$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} \qquad (2)$$
where $\eta$ is a positive constant and $E$ is the error of the entire network given by the following equation.
$$E = \frac{1}{2} \sum_{j} (t_j - y_j)^2 \qquad (3)$$
where $t_j$ is a teacher signal and $y_j$ is the output signal of an output-side synapse element. $\partial E / \partial w_{ij}$ is calculated using the outputs of each layer, but the explanation of a concrete calculation method is omitted.
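One correction step can be sketched for a single layer of output elements. The sketch assumes the common gradient-descent form of the correction (the correction equals minus $\eta$ times the derivative of $E$ with respect to the weight), a squared error $E$ with a factor of one half, and a sigmoid output function; the function names and the value of `eta` are hypothetical. For these assumptions the derivative of $E$ with respect to $w_{ij}$ is $-(t_j - y_j)\, y_j (1 - y_j)\, x_i$.

```python
import math


def sigmoid(u):
    # Assumed output function of a synapse element.
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, w, theta):
    # y_j = sigmoid(sum_i x_i * w_ij + theta_j)
    return [sigmoid(sum(x[i] * w[i][j] for i in range(len(x))) + theta[j])
            for j in range(len(theta))]


def total_error(t, y):
    # Assumed error: E = 1/2 * sum_j (t_j - y_j)^2
    return 0.5 * sum((tj - yj) ** 2 for tj, yj in zip(t, y))


def update_weights(x, t, w, theta, eta=0.5):
    # Returns w(n+1) = w(n) + delta_w, with delta_w = -eta * dE/dw.
    y = forward(x, w, theta)
    new_w = [row[:] for row in w]
    for j in range(len(theta)):
        # -dE/d(net_j) for a sigmoid element with squared error
        delta_j = (t[j] - y[j]) * y[j] * (1.0 - y[j])
        for i in range(len(x)):
            new_w[i][j] = w[i][j] + eta * delta_j * x[i]
    return new_w
```

Repeating `update_weights` over the learning data corresponds to the repeated learning steps; only the weights are corrected here, and the bias is left fixed to keep the sketch short.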
The correction of the weight coefficient $w_{ij}$ (this step is called learning) is performed as described above, and the weight coefficient is expected to converge on a suitable value through repetition of the learning steps. To make the value converge quickly, it is important to start with a suitable initial value $w_{ij}(0)$.
The bias $\theta_j$ of an element $j$ is learned in the same way as $w_{ij}$, by regarding $\theta_j$ as a weight coefficient for an input whose value is always 1.
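The treatment of the bias as a weight on a constant input can be checked with a short sketch. The helper names `net_input`, `augment` and `net_input_augmented` are hypothetical, and the bias is assumed to be added to the weighted sum.

```python
def net_input(x, w, theta):
    # Weighted sum with an explicit bias: sum_i x_i * w_ij + theta_j
    return [sum(x[i] * w[i][j] for i in range(len(x))) + theta[j]
            for j in range(len(theta))]


def augment(x, w, theta):
    # Append a constant input of 1 and store theta as its weight row.
    x_aug = list(x) + [1.0]
    w_aug = [row[:] for row in w] + [list(theta)]
    return x_aug, w_aug


def net_input_augmented(x_aug, w_aug):
    # Weighted sum with no separate bias term.
    n_out = len(w_aug[0])
    return [sum(x_aug[i] * w_aug[i][j] for i in range(len(x_aug)))
            for j in range(n_out)]
```

Both forms give the same weighted sum, so a correction rule written for the weights applies unchanged to the extra row that holds the biases.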
As described above, the learning process of a back propagation type neural network depends on the initial values of the weight coefficient and the bias. The above-mentioned conventional technology, in which random numbers are used as initial values for the weight coefficient $W = \{w_{ij}\}$ and the bias $\theta = \{\theta_j\}$, therefore entails the possibility that (1) the learning time will be inordinately long and (2) a parameter will fall, midway through learning, into a local minimum which is not optimum, and this minimum will be erroneously taken as a suitable value.