The present invention is directed to a neural network system for effecting storage, inference, pattern recognition, control, model estimation and approximation of functions by simulating neurons and connections therebetween.
FIG. 11 is an explanatory diagram illustrating the architecture of a conventional multilayer feedforward type neural network system reported, e.g., by David E. Rumelhart, Geoffrey E. Hinton & Ronald J. Williams, in "Learning representations by back-propagating errors", Nature, Vol. 323, No. 9, pp. 533-536, October, 1986. In this figure, by way of example, each of the input layer, the intermediate layer and the output layer is a single layer. The input layer consists of three neural elements. The intermediate layer consists of four neural elements. The output layer consists of one neural element. Referring to FIG. 11, the numeral (11) represents an element (hereinafter referred to as a neural element) which simulates a neuron. The neural elements (11) are arranged in an input layer (11a), an intermediate layer (11b), and an output layer (11c). Designated at (12) is an element (hereinafter referred to as a connecting element) which simulates a synapse by making an inter-layer connection between the neural elements (11). The strength of this connection is known as a connecting weight.
In the thus configured neural network system, the neural elements (11) are connected layer by layer. In operation, an input signal entering at the input layer (11a) is propagated via the intermediate layer (11b) to the output layer (11c), as indicated by an arrowhead A.
The following is a quantitative representation. Let $V^{p}_{1i}$ be the i-th value of the p-th learning data in the input layer (11a), let $d_{kp}$ be the k-th value of the p-th learning data in the output layer (11c), let $U_{hj}$ and $V_{hj}$ be the internal state and the output value of the j-th neural element of the h-th layer, and let $W_{hji}$ be the connecting weight between the i-th neural element in the h-th layer and the j-th neural element in the (h+1)th layer. In this example, h=1 in the input layer (11a), h=2 in the intermediate layer (11b), and h=3 in the output layer (11c). The relations between the respective variables are then expressed by formulae (1) and (2): ##EQU1## where the function g(*) may be any differentiable, non-decreasing function. The formula (3) shows one example thereof. This function is shown in FIG. 12, with u being given on the axis of abscissa and g(u) being given on the axis of ordinate. ##EQU2##
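By way of illustration only, the forward propagation described above may be sketched as follows. Since formulae (1) to (3) are not reproduced in this text, the sketch assumes that the internal state is the weighted sum of the outputs of the preceding layer, that the input layer passes the learning data through unchanged, that no bias term is used, and that g is the logistic sigmoid. The names `g`, `forward`, `W` and `V` are hypothetical.

```python
import math


def g(u):
    """Logistic sigmoid: one differentiable, non-decreasing choice (assumed)."""
    return 1.0 / (1.0 + math.exp(-u))


def forward(weights, inputs):
    """Propagate an input vector through the network (arrowhead A).

    weights[h][j][i] is the connecting weight between element i of layer h
    and element j of layer h+1 (layers are indexed from 0 here, not from 1).
    Returns the output values V of every layer, input layer first.
    """
    values = [list(inputs)]  # assumed: the input layer outputs the data as-is
    for layer_weights in weights:
        prev = values[-1]
        # internal state U of each element: weighted sum over the previous layer
        internal = [sum(w * v for w, v in zip(row, prev)) for row in layer_weights]
        # output value V of each element: g applied to the internal state
        values.append([g(u) for u in internal])
    return values


# 3-4-1 network as in FIG. 11; all weights set to 0.5 purely for illustration
W = [[[0.5] * 3 for _ in range(4)], [[0.5] * 4 for _ in range(1)]]
V = forward(W, [1.0, 0.0, -1.0])
print(V[-1])  # output value of the single output-layer element
```

Other choices of g satisfying the stated conditions would leave the structure of the sketch unchanged.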
Furthermore, the connecting weights W are sequentially determined according to a learning rule based on the formula (4). More specifically, the weights W are sequentially determined by the steepest descent method applied to a squared error defined by the learning data $d_{kp}$ (a desired signal) in the output layer and the value actually obtained by the neural network. The squared error is expressed by the following formula (4): ##EQU3## where H (=3) is the number of layers of the neural network.
In addition, the sequential variation of the connecting weight W can be carried out in conformity with the formula (5) by using the momentum method: ##EQU4## where $\alpha$ and $\beta$ are appropriate parameters.
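A momentum-type update of this kind may be sketched as follows. Formula (5) is not reproduced in this text, so the sketch assumes the common form in which the change in W equals the negative gradient scaled by a step size, plus the previous change scaled by a momentum coefficient. The function `momentum_step`, the parameter values and the one-dimensional test error E(w) = (w - 3)^2 / 2 are hypothetical and serve only to show the mechanics.

```python
def momentum_step(w, grad, prev_delta, alpha=0.1, beta=0.5):
    """One weight update with a momentum term (assumed form of the rule).

    alpha scales the steepest-descent term; beta scales the previous change.
    Returns the new weight and the change just applied.
    """
    delta = -alpha * grad + beta * prev_delta
    return w + delta, delta


# Toy error E(w) = (w - 3)^2 / 2, whose gradient is (w - 3)
w, delta = 0.0, 0.0
for _ in range(100):
    grad = w - 3.0
    w, delta = momentum_step(w, grad, delta)

print(w)  # approaches the minimizer w = 3
```

Setting the momentum coefficient to zero reduces the sketch to plain steepest descent.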
In actual use, the derivative appearing on the right side of the formula (5) is written out in detail, and learning is performed by adjusting the connecting weights W while the output error is propagated from the output side to the input side, as indicated by an arrowhead B. This is known as back-propagation.
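The back-propagation of the output error may be sketched, under stated assumptions, as follows. The sketch assumes a logistic sigmoid for g, no bias terms, a squared error summed over all learning data with a factor of one half, and plain steepest descent without the momentum term. The names `backprop_step` and `squared_error`, the step size and the three-item learning data set are hypothetical.

```python
import math
import random


def g(u):
    return 1.0 / (1.0 + math.exp(-u))  # assumed activation (logistic sigmoid)


def forward(W, x):
    V = [list(x)]  # assumed: input layer outputs the data unchanged
    for layer in W:
        V.append([g(sum(w * v for w, v in zip(row, V[-1]))) for row in layer])
    return V


def squared_error(W, data):
    """Half the sum of squared output errors over all learning data."""
    return 0.5 * sum(
        (out - t) ** 2 for x, d in data for out, t in zip(forward(W, x)[-1], d)
    )


def backprop_step(W, data, alpha=0.5):
    """One steepest-descent step; the error travels from output to input."""
    grads = [[[0.0] * len(row) for row in layer] for layer in W]
    for x, d in data:
        V = forward(W, x)
        # error signal at the output layer: (V - d) * g'(U), with g' = V(1 - V)
        delta = [(v - t) * v * (1.0 - v) for v, t in zip(V[-1], d)]
        for h in range(len(W) - 1, -1, -1):
            for j, dj in enumerate(delta):
                for i, vi in enumerate(V[h]):
                    grads[h][j][i] += dj * vi
            if h > 0:
                # pass the error signal back one layer (arrowhead B)
                delta = [
                    sum(W[h][j][i] * delta[j] for j in range(len(delta)))
                    * V[h][i] * (1.0 - V[h][i])
                    for i in range(len(V[h]))
                ]
    return [
        [[w - alpha * gw for w, gw in zip(row, grow)] for row, grow in zip(layer, glayer)]
        for layer, glayer in zip(W, grads)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
W = [[[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)],
     [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(1)]]
data = [([1.0, 0.0, 0.0], [0.9]), ([0.0, 1.0, 0.0], [0.1]), ([0.0, 0.0, 1.0], [0.9])]
before = squared_error(W, data)
for _ in range(500):
    W = backprop_step(W, data)
after = squared_error(W, data)
print(before, after)
```

The need to repeat the step many times before the error becomes small is the repetitive arithmetic operation referred to in this description.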
The initial values of the connecting weights usually follow a uniform distribution, expressed by a formula (6): ##EQU5## where $W_{hjio}$ is the initial value of the connecting weight, and $p(W_{hjio})$ is the probability thereof.
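A uniform initialization of this kind may be sketched as follows. The range of the distribution in formula (6) is not reproduced in this text, so the symmetric interval from `-w_max` to `w_max`, the default value 0.5 and the function name `init_weights` are assumptions made only for illustration.

```python
import random


def init_weights(layer_sizes, w_max=0.5, seed=None):
    """Draw every connecting weight independently from a uniform distribution.

    layer_sizes lists the number of neural elements per layer, e.g. [3, 4, 1].
    The interval [-w_max, w_max] is an assumed range, not taken from the source.
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(-w_max, w_max) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


# Initial weights for the 3-4-1 network of FIG. 11
W0 = init_weights([3, 4, 1], w_max=0.5, seed=0)
print([(len(layer), len(layer[0])) for layer in W0])
```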
That is, uniformly distributed initial values are set as the connecting weights, and repetitive arithmetic operations are performed, fundamentally in accordance with the learning equation (4) defined by the steepest descent method, so as to minimize the squared error in the output layer (11c).
The prior art neural network system is configured in the manner discussed above. Hence, as a learning operation, the inter-layer connecting weights are obtained by repetitive arithmetic operations that minimize the squared error in the output layer. This presents the problem that learning requires a great deal of time; namely, a large amount of time is needed for the repetitive arithmetic operations of the learning algorithm to converge. In addition, the necessary number of intermediate layers and the necessary number of neural elements in each layer are not known beforehand.
It is a primary object of the present invention, which has been devised to obviate the foregoing problems, to provide a neural network system capable of fast learning by speeding up convergence of the repetitive arithmetic operations for learning and by determining in advance the number of neural elements of an intermediate layer.