1. Field of the Invention
The present invention relates to a neural network apparatus and a learning method thereof.
2. Description of the Related Art
With the advance of microelectronics, LSIs capable of performing advanced signal processing at high speed have been realized. Such LSIs are used in apparatuses in a variety of industrial fields.
In fields such as speech and character recognition, and in the control of systems whose dynamics are difficult to describe or which depend mainly on human experience, conventional sequential processing methods do not have sufficient capability in practice.
In recent years, signal processing using a neural network, which simulates the parallel distributed processing of a living body, has been examined as a replacement for (or complement to) the conventional sequential information processing method. A neural network LSI is desirable in view of apparatus scale and processing speed, and can be used not only in existing systems but also in a variety of new application fields.
The capacity of a neural network depends on its scale. For this reason, when a network is to be used in a variety of fields, a large-scale network must be realized on a small LSI chip area. To create a large-scale network on a limited chip size, the number of elements constituting each functional block of the network must be reduced. Reducing the number of gradation levels of the weight values of synapse connections is one method of reducing the number of elements.
In the back propagation method and learning methods derived from it, however, the number of gradation levels (the accuracy) of synapse connections required during learning is larger than the number required to obtain a desired output after learning, owing to rounding errors in the calculations (J. L. Holt, J. N. Hwang: "Finite Precision Error Analysis of Neural Network Electronic Hardware Implementation", Proceedings of the International Joint Conference on Neural Networks, Seattle, 1, pp. 519-525, IEEE (1991)). Therefore, even if a neural network learns with a number of gradation levels sufficient to obtain a desired output, satisfactory output characteristics cannot be obtained.
As described above, the learning method of a conventional neural network apparatus requires more gradation levels during learning than are needed at the end of learning. A large number of elements is therefore required, and it is difficult to create a large-scale network with a limited amount of hardware.
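The rounding-error effect described above can be illustrated with a minimal sketch (the grid step and update values here are hypothetical, chosen only for illustration): when weights are stored on a coarse grid of gradation levels, a learning update smaller than half a grid step is rounded away entirely, so the weight never moves even though the final weight value itself is representable at that precision.

```python
def quantize(w, step):
    """Round a weight to the nearest representable gradation level."""
    return round(w / step) * step

# Hypothetical values: a coarse grid (few gradation levels) versus a
# fine grid (many gradation levels), and one small learning update.
w = 0.5
update = 0.03                            # learning-rate * gradient

coarse = quantize(w + update, step=0.1)  # update < step/2: rounded away
fine = quantize(w + update, step=0.01)   # update survives rounding

assert coarse == w    # the coarse grid loses the update; learning stalls
assert fine > w       # the fine grid retains it
```

This is why learning demands more gradation levels than inference: the small corrections accumulated during learning vanish below the rounding threshold of a coarse representation.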
Another method of reducing the number of elements without increasing the complexity of the circuit arrangement is to fix the output values of some previous-stage neurons to a predetermined value and to use the synapse connection weight values between these neurons and the present stage as threshold values, without adding a dedicated circuit for the neuron threshold. This method is advantageous in that the synapse connection weight values, and hence the threshold values, can be updated to optimal values by learning.
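As a sketch of this technique (in Python with hypothetical values; the apparatus itself is a hardware circuit), fixing one previous-stage output to a constant makes its learnable connection weight play the role of the neuron threshold:

```python
def net_input(weights, prev_outputs):
    """Summation value 'net' of a neuron: the weighted sum of the
    previous-stage neuron outputs."""
    return sum(w * x for w, x in zip(weights, prev_outputs))

# prev[0] is a previous-stage neuron whose output is fixed to 1.0;
# weights[0] is then an ordinary, learnable synapse weight that acts
# as the neuron's threshold, with no dedicated threshold circuit.
prev = [1.0, 0.2, -0.4]
weights = [-0.3, 0.5, 0.8]

net = net_input(weights, prev)  # -0.3*1.0 + 0.5*0.2 + 0.8*(-0.4)
```

Because weights[0] is updated by the same learning rule as every other synapse weight, the threshold is tuned toward its optimal value by learning, as the text notes.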
This method of obtaining a threshold value from some previous-stage neurons and their synapse connection weight values has a drawback in hardware with a constraint on the maximum output amplitude. When the range of the summation value net of the inputs to a neuron widens as the network scale increases, the entire range cannot be covered by a threshold value obtained from a small number of fixed neurons and their synapse connection weight values, and an optimal threshold value cannot be set. This is one factor that degrades learning convergence. To prevent this degradation, a large number of neurons must be devoted to setting the threshold value; but then many of the neurons integrated on the limited chip area are wasted, and it is again difficult to realize a large-scale network.
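The range limitation can be made concrete with a small numeric sketch (the bound and the counts below are hypothetical, assuming both outputs and weights are limited in magnitude by the hardware amplitude constraint): k fixed neurons can offset the summation value net by at most k times the maximum weight, while net itself grows in proportion to the number n of variable inputs.

```python
w_max = 1.0   # maximum weight magnitude allowed by the hardware
n = 100       # variable previous-stage inputs; grows with network scale
k = 1         # previous-stage neurons fixed to 1.0 for the threshold

# With each input magnitude also bounded by 1.0, the extremes are:
net_extreme = n * w_max      # net can reach +/- 100.0
thresh_extreme = k * w_max   # one fixed neuron covers only +/- 1.0

# Covering the full range would require k to grow to about n,
# wasting neurons integrated on the chip for ordinary processing.
k_needed = net_extreme / w_max
```

This is the trade-off the text describes: either the threshold cannot reach its optimal value (degrading learning convergence), or a large fraction of the integrated neurons is consumed merely to set thresholds.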
In the conventional neural network apparatus described above, the outputs of some previous-stage neurons are fixed to a predetermined value, and the synapse connection weight values between these neurons and the present stage are used as threshold values so as to avoid increasing the complexity of the circuit arrangement. As the network size increases, however, learning convergence degrades. Overcoming this requires a large number of neurons, so the neurons on a limited chip area cannot be used effectively, and, compared with a network of the same scale, the amount of hardware of the entire system increases. As a result, it is difficult to create a large-scale network.