The present invention generally relates to a neural network structure and a neural network learning method. The present invention is applicable to any information processing device, such as a device for voice recognition, image processing, automatic translation, or associative storage.
An attempt to model the nervous system of a living organism and analyze its information processing mechanism was started by McCulloch, Pitts, et al. in 1943. Their neuron model is a digital model which outputs an impulse. After that, an analog model was studied in light of the fact that, in a sensory system, the intensity of a stimulus and other information are transmitted by the frequency of occurrence of impulses and by a relaxing potential (an analog quantity). The basic characteristics of an analog neuron are expressed by a spatial sum, a non-linear output function, and a threshold value, and sometimes additionally by a time integral. Rosenblatt took a hint from the visual system and, in 1958, proposed a layered neural network having states of 1/0, called the "perceptron". The layered structure is a scheme for expressing a basic structure of the nervous system, as studied in connection with the nervous system of the cerebellum (Marr, 1969). In particular, the layered structure is considered to express peripheral systems (a sensory system, a motor system) well. After that, emphasis was put on analyzing the capability of such networks from the viewpoint of mathematical engineering rather than on their relationship to physiology. Recently, the learning ability of neural networks has been attracting attention, and various attempts are being made to apply neural networks to recognition.
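For illustration only, the digital and analog neuron characteristics described above (spatial sum, threshold value, non-linear output function) can be sketched as follows. The function names and the choice of a sigmoid as the non-linear output function are assumptions made for this sketch and are not taken from the cited works; the optional time integral is omitted.

```python
import math

def digital_neuron(inputs, weights, threshold):
    """Digital (1/0) model: fire when the weighted spatial sum reaches the threshold."""
    spatial_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if spatial_sum >= threshold else 0

def analog_neuron(inputs, weights, threshold):
    """Analog model: spatial sum minus threshold, passed through a sigmoid.

    The sigmoid is an assumed example of a non-linear output function.
    """
    spatial_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-(spatial_sum - threshold)))
```

With both weights set to 1 and a threshold of 2, the digital model behaves as a logical AND of two 1/0 inputs, while the analog model returns a graded value between 0 and 1.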
Generally, a rule-based synthesis method is employed as a method of generating voice. In this method, voices uttered by persons are analyzed to find rules of utterance, and voices are synthesized on the basis of those rules. The rule-based synthesis method has an advantage in that synthesized voices forming an arbitrary document (sentence) can be generated even by a small-scale system. It has a disadvantage, however, in that complex rules are required to generate natural synthesized voices, and it is very difficult to extract general rules from utterances. The use of neural networks, on the other hand, makes it possible for a system to learn simultaneously both acoustic parameters which precisely represent features of voices actually uttered and the environment within a sequence of input codes in which the actual voices are placed. After learning, a sequence of codes is input to the system, and voices are synthesized from the input codes.
Presently, a back propagation method is frequently used as a learning process for neural networks. The back propagation method has an advantage in that it makes it possible to learn weighting factors for neuron elements in a layer to which a target quantity is supplied, by using an amount of back-propagated error (see Rumelhart, et al., "Parallel Distributed Processing", MIT Press, 1986).
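As a minimal sketch of the back propagation method referred to above, the following shows one gradient-descent update for a small network with one hidden layer and a single output. The network size, sigmoid units, squared-error measure, learning rate, and all names are assumptions chosen for illustration; bias terms are omitted for brevity.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Return the hidden activations and the scalar network output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def backprop_step(x, target, w_hidden, w_out, rate=0.5):
    """Update the weights in place; return the squared error before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output unit, where the target quantity is supplied.
    delta_out = (output - target) * output * (1.0 - output)
    # Error propagated back to each hidden unit through its outgoing weight.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    for j in range(len(hidden)):
        w_out[j] -= rate * delta_out * hidden[j]
        for i in range(len(x)):
            w_hidden[j][i] -= rate * delta_hidden[j] * x[i]
    return 0.5 * (output - target) ** 2

# Hypothetical initial weights and a single training pair.
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
w_out = [0.5, -0.5]
errors = [backprop_step([1.0, 0.0], 1.0, w_hidden, w_out) for _ in range(200)]
```

In this sketch the recorded error shrinks over the repeated updates, because each step moves the weighting factors against the gradient of the error.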
Recently, it has become clear that the back propagation method is very effective as a learning method for a multilayer perceptron. However, the back propagation method has the following disadvantages. First, it can only find a point at which the error is locally minimized; once learning falls into a local minimum, it cannot advance further. Second, the number of output layers increases with an increase in the number of links connecting adjacent layers, and thus the network structure becomes complex. As a result, the learning ability deteriorates.
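The first disadvantage can be illustrated with a simple one-dimensional example. The error function below is hypothetical and chosen only because it has one local and one global minimum; it is not an error surface of any actual network. Plain gradient descent, on which back propagation is based, settles in whichever minimum lies downhill from the starting point.

```python
def error(x):
    """Hypothetical error curve with a local minimum near x = 1.13
    and a lower (global) minimum near x = -1.30."""
    return x ** 4 - 3.0 * x ** 2 + x

def error_gradient(x):
    return 4.0 * x ** 3 - 6.0 * x + 1.0

def descend(x, rate=0.01, steps=1000):
    """Plain gradient descent from the starting point x."""
    for _ in range(steps):
        x -= rate * error_gradient(x)
    return x

stuck = descend(2.0)    # starts on the side of the local minimum
best = descend(-2.0)    # starts on the side of the global minimum
```

Here `stuck` ends at the local minimum with a higher error than `best`, and further iterations do not move it, since the gradient there is essentially zero.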
An improvement of the back propagation method has been proposed which is directed to compensating for the aforementioned first disadvantage (see "COMPUTER TODAY", 1988/9, No. 27, pp. 54-59). However, there remains room for improvement.