1. Field of the Invention
The present invention relates to a neural network device widely used in fields such as recognition (e.g., character recognition or speech recognition), motion control (e.g., robot control), general process control, and neurocomputers.
2. Description of the Related Art
One general paper concerning neural networks is Teuvo Kohonen, "Representation of sensory information in self-organizing feature maps, and relation of these maps to distributed memory networks", SPIE Vol. 634, Optical and Hybrid Computing, pp. 248-259 (1986).
According to the above reference, one of the systems called neural networks is defined by the following three ordinary differential equations:
$$\frac{dy}{dt} = f(x, y, M, N) \tag{1}$$
$$\frac{dM}{dt} = g(x, y, M) \tag{2}$$
$$\frac{dN}{dt} = h(y, N) \tag{3}$$
where x is a vector representing an input, y is a vector representing an output, M and N are parameter matrices, and f, g, and h are nonlinear functions.
Consider first an electronic circuit used in signal processing. In such a circuit, the system parameter matrices M and N are represented by resistances, capacitances, and the like, and are, in most cases, constants. More specifically, since the respective elements of the matrices M and N in equations (2) and (3) are constants, the right-hand sides of equations (2) and (3) become zero.
In a system called a neural network, by contrast, the system parameter matrices M and N are values which change as a function of time. When the system parameter matrices M and N change from moment to moment in accordance with equations (2) and (3), a normal storage function and a learning function based on the normal storage are realized. In addition, although storage and learning take a long period of time, an output can be obtained in response to an input within a very short period of time. That is, in the above three equations, the matrices M and N change slowly as compared with the vectors x and y.
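The difference in time scale between the output and the parameters can be illustrated with a minimal numerical sketch. It integrates scalar versions of equations (1) to (3) by the forward Euler method. The particular forms chosen for f, g, and h, the step size `dt`, the rate `eta`, the initial values, and the function name `simulate` are all assumptions made for illustration; none of them is taken from the cited reference.

```python
import math


def simulate(x, steps=2000, dt=0.01, eta=0.01):
    """Forward-Euler integration of scalar versions of equations (1)-(3).

    f, g, and h below are hypothetical example functions, chosen only so
    that y responds quickly while M and N change slowly (eta << 1).
    """
    y, M, N = 0.0, 0.5, 0.1  # assumed initial output and parameters
    history = []
    for _ in range(steps):
        dy = -y + math.tanh(M * x + N * y)  # f(x, y, M, N): fast output dynamics
        dM = eta * (x * y - M)              # g(x, y, M): slow correlation-type rule
        dN = -eta * N                       # h(y, N): slow decay (this example ignores y)
        y, M, N = y + dt * dy, M + dt * dM, N + dt * dN
        history.append((y, M, N))
    return history


if __name__ == "__main__":
    y, M, N = simulate(1.0)[99]  # state after 100 steps (one time unit)
    print(f"y={y:.3f}  M={M:.4f}  N={N:.4f}")
```

With `eta` much smaller than 1, the output y moves most of the way to its steady value within a few time units, while M and N drift only slightly over the same interval.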
Another conventional neural network is defined by the following two ordinary differential equations (Reference: F. Rosenblatt, "The perceptron: a probabilistic model for information storage and organization in the brain", Psychological Review 65, pp. 386-408 (1958)):
$$\frac{dy}{dt} = f(x, y, M) \tag{4}$$
$$\frac{dM}{dt} = g(x, T, M) \tag{5}$$
where x is a vector representing an input, y is a vector representing an output, T is a vector representing a teacher signal, M is a system parameter matrix, and f and g are nonlinear functions. In the system described by equations (4) and (5), a teacher signal is additionally used in order to obtain a desired output in response to an input.
As described above, various modifications may be considered, but a neural network system is characterized in that it has a storage function and a learning function based on storage.
One of the systems described by equations (4) and (5) will be described below in detail.
The system described by equations (4) and (5) is the so-called perceptron, a neural network system having a layer structure proposed by Rosenblatt et al. in 1958 (see the above reference). A perceptron having a large number of layers is called a multilayered perceptron. The structure of the multilayered perceptron is known to resemble the synaptic connections of neurons in the cerebellum or cerebral cortex. A mathematical analysis of the information processing capacity of the multilayered perceptron has been developed. For example, according to the theorem of Kolmogorov, a perceptron having n(2n+1) neurons can express an arbitrary continuous function of n variables (Reference: Branko Soucek, "Neural and Concurrent Real-Time Systems", JOHN WILEY & SONS (The Sixth Generation), pp. 77-79).
A parameter of a multilayered perceptron is the value of a connection weight of a synapse between neurons. By updating this parameter in accordance with a differential equation called a learning equation, a nonlinear adaptive network can be arranged.
In recent years, an error backward propagation learning algorithm has been developed by Rumelhart et al., and the above parameter can be obtained in accordance with a steepest descent method (Reference: D. E. Rumelhart et al., "PARALLEL DISTRIBUTED PROCESSING; Exploration in the Microstructure of Cognition (Vol. 1: Foundations)", The MIT Press (1988), pp. 322-330).
The above multilayered perceptron will be described with reference to FIG. 1.
FIG. 1 shows a three-layered perceptron. The first layer, called the input layer, consists of neurons $S_i$ ($i = 1, 2, \ldots, h$). The second layer, called the intermediate layer, consists of neurons $A_i$ ($i = 1, 2, \ldots, p$). The third layer, called the output layer, consists of neurons $R_i$ ($i = 1, 2, \ldots, m$). The connection weights of the synapses connecting the neurons are represented by $RS_{ji}$ and $RA_{ji}$ ($i = 1, \ldots, h$; $j = 1, \ldots, l$). Assume that a signal propagates from the left to the right in FIG. 1. In addition, in every neuron, the relationship between the input and output signals is assumed to be a mapping by a monotonic nonlinear function. It is also assumed that the inputs to the neurons of the input layer are externally supplied. The inputs to all neurons other than those of the input layer, i.e., to the neurons of the intermediate and output layers, are subjected to the following weighted sum operations: ##EQU1##
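The signal flow through the three layers can be sketched as follows. Because the weighted sum equations are not reproduced in this text, the sketch assumes the conventional form in which the input to a receiving neuron j is the sum over sending neurons i of (weight from i to j) times (output of i). The logistic function used as the monotonic nonlinear function, and the names `sigmoid`, `layer_forward`, and `forward`, are likewise assumptions made for illustration.

```python
import math


def sigmoid(u):
    """Assumed monotonic saturating nonlinearity F (logistic function)."""
    return 1.0 / (1.0 + math.exp(-u))


def layer_forward(weights, inputs):
    """Weighted sum followed by F; weights[j][i] links sender i to receiver j."""
    return [
        sigmoid(sum(w_ji * o_i for w_ji, o_i in zip(row, inputs)))
        for row in weights
    ]


def forward(w_hidden, w_output, x):
    """Propagate an externally supplied input x from left to right."""
    a = layer_forward(w_hidden, x)  # intermediate layer outputs
    r = layer_forward(w_output, a)  # output layer outputs
    return a, r


if __name__ == "__main__":
    # Example: 2 input neurons, 2 intermediate neurons, 1 output neuron.
    a, r = forward([[0.1, -0.2], [0.3, 0.4]], [[0.2, -0.1]], [1.0, 0.5])
    print(a, r)
```

The input layer applies no weighted sum in this sketch, since its inputs are supplied externally.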
The above error backward propagation learning algorithm will be described below. The error backward propagation learning algorithm is a learning algorithm used when a teacher signal is supplied to only the last layer. When the result of mapping an input signal $x_1$ (the total sum of the signals input from other neurons) of a neuron of interest in an arbitrary layer to the output signal of that neuron is represented by O, the connection weight $R_{ji}^{n+1}$ at learning count n+1 is generally defined as follows:
$$R_{ji}^{n+1} = R_{ji}^{n} + \rho \cdot \varepsilon_j \cdot O_j \tag{8}$$
where $\rho$ is the relaxation coefficient. When the neuron is located in the output layer, $\varepsilon_j$ is defined as follows:
$$\varepsilon_j = (t_j - O_j) \cdot F_j' \tag{9}$$
where $t_j$ is the jth element of the teacher signal T.
When the neuron is not located in the output layer, $\varepsilon_j$ is given as follows: ##EQU2## where $F'$ is the first derivative of F with respect to x. This algorithm is applied to the processing network.
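One learning step of this kind can be sketched as below for a three-layered perceptron. Several points are assumptions of the sketch rather than statements of this text: the nonlinear function F is taken to be the logistic function, so that F' = O(1 - O); the error term of a neuron outside the output layer is taken in the usual steepest descent form, F' times the weighted sum of the error terms of the next layer, because that equation is not reproduced here; and the weight change is scaled by the output of the sending neuron, as in the formulation of Rumelhart et al. The names `train_step`, `w_hidden`, and `w_output` are hypothetical.

```python
import math


def sigmoid(u):
    """Assumed nonlinearity F (logistic); its derivative is O * (1 - O)."""
    return 1.0 / (1.0 + math.exp(-u))


def train_step(w_hidden, w_output, x, t, rho=0.5):
    """One steepest-descent update; returns new weights and the prior error."""
    # Forward pass: intermediate outputs a, output-layer outputs o.
    a = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    o = [sigmoid(sum(w * ai for w, ai in zip(row, a))) for row in w_output]

    # Output layer error term, as in equation (9): (t_j - O_j) * F_j'.
    eps_out = [(tj - oj) * oj * (1.0 - oj) for tj, oj in zip(t, o)]

    # Intermediate layer error term (assumed standard back-propagated form).
    eps_hid = [
        aj * (1.0 - aj) * sum(eps_out[k] * w_output[k][j] for k in range(len(o)))
        for j, aj in enumerate(a)
    ]

    # Weight updates: old weight + rho * eps_j * (output of sending neuron i).
    new_w_output = [
        [w + rho * eps_out[j] * a[i] for i, w in enumerate(row)]
        for j, row in enumerate(w_output)
    ]
    new_w_hidden = [
        [w + rho * eps_hid[j] * x[i] for i, w in enumerate(row)]
        for j, row in enumerate(w_hidden)
    ]

    # Squared error measured before this update was applied.
    error = 0.5 * sum((tj - oj) ** 2 for tj, oj in zip(t, o))
    return new_w_hidden, new_w_output, error


if __name__ == "__main__":
    wh, wo = [[0.1, -0.2], [0.3, 0.4]], [[0.2, -0.1]]
    for step in range(200):
        wh, wo, err = train_step(wh, wo, [1.0, 0.5], [0.9])
    print(f"error before final update: {err:.5f}")
```

Note that the teacher signal `t` enters only through the output layer error term; the intermediate layer receives its correction indirectly through the back-propagated sum.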
In the above description, the neural network is exemplified by the multilayered perceptron. Problems of the conventional techniques will be described below.
When a neural network device incorporating hardware which realizes equations (1) to (10) is to learn in an environment inaccessible to humans, e.g., inside a living body, there is no way to input the teacher signal $t_j$ of equation (9) to the neural network device, and its application fields are therefore limited.
In addition to the four basic arithmetic operations, equations (1) to (10) include synapse function operations that must be flexible and capable of long-term storage. These equations also include a nonlinear function whose input-output relationship has a monotonically increasing saturation characteristic, and the derivative of that nonlinear function. For these reasons, extensive study is required to realize the arithmetic operations of equations (1) to (10) by a device or an electronic circuit.
The four basic arithmetic operations, the nonlinear function whose input-output relationship has a monotonically increasing saturation characteristic, and the derivative of the nonlinear function can be realized by conventional analog/digital electronic circuit techniques. The long-term programmable synapse function operations can be realized by a conventional EEPROM and thin-film devices. It is, however, difficult to manufacture such a thin-film device because the thin-film device is incompatible with an LSI fabrication process.
In a conventional device, the $\varepsilon_j$ value and the connection weight $R_{ji}^{n+1}$ are obtained by an analog circuit. An accurate $\varepsilon_j$ value and an accurate connection weight $R_{ji}^{n+1}$ cannot be obtained because of noise superposed on circuit signals, offsets, and gain errors, and the learning function is thereby degraded.
As described above, the conventional neural network device has a narrow range of application fields, and a learning function cannot be effected as desired.
Conventional device techniques are reported in IEEE SPECTRUM Jan. (1991), pp. 52-55.