This application is based upon Japanese Patent Application Nos. Hei. 11-148067 filed on May 27, 1999, and Hei. 11-328312 filed on Nov. 18, 1999, the contents of which are incorporated herein by reference.
1. Field of the Invention
This invention relates to signal processing apparatuses, and particularly to a signal processing apparatus such as a neural computer applied to character and graphic recognition, associative storage, multi-input/output non-linear mapping and the like.
2. Related Art
Hitherto, there has been known a neural network (neural cell circuit network) modeled after the information processing carried out in a living body. In the neural network, a neuron is the functional unit, and information processing is carried out by arranging a plurality of neurons in a network. Such a neural network is suitable for information processing such as character and graphic recognition, associative storage, multi-input/output non-linear mapping and the like, which is difficult to achieve with conventional von Neumann type computers.
Next, the neural network will be explained to facilitate understanding of the present invention.
At first, the schematic structure of the neural network will be explained.
As described above, the neural network is configured by arranging neurons in a network, as shown in FIG. 15 for example.
The neural network shown in FIG. 15 is called a three-layered hierarchical neural network and comprises an input layer, an intermediate layer (hidden layer) and an output layer.
It is noted that a signal is inputted to the input layer and, after propagating sequentially through the intermediate layer and the output layer, is outputted from the output layer. As is known in this technical field, the input layer only propagates the input signal to the intermediate layer and, unlike the intermediate layer and the output layer, carries out no arithmetic operation. Therefore, only the functional units composing the intermediate and output layers are referred to as neurons. The intermediate and output layers each contain at least one neuron.
The input layer is coupled with the respective neurons of the intermediate layer, and the respective neurons of the intermediate layer are coupled with the respective neurons of the output layer, as shown in FIG. 15. The signal inputted to the input layer of the neural network is propagated to the intermediate layer, where it undergoes predetermined arithmetic operations within the neurons contained in the intermediate layer as described later. The resulting output values are propagated further to the output layer. Similar arithmetic operations are also carried out in the neurons contained in the output layer, and their output values become the final output of the network.
This series of operations is the information processing of the neural network called sequential propagation (forward processing), and it allows an arbitrary input/output relationship to be realized when a sufficient number of neurons are contained in the intermediate layer.
It is noted that although the neural network shown in FIG. 15 is a three-layered network having one intermediate layer, networks having two or more intermediate layers have also been proposed.
The neuron which is the structural unit of the neural network will be explained next.
FIG. 16 is a diagrammatic view of the j-th neuron denoted by a symbol j in FIG. 15. The neuron is composed of an input section for inputting input values from the outside, a computing section for computing those input values and an output section for outputting the result of computation.
When each input value from the outside is expressed as xi (i=1, 2, 3, . . . , n), the computing section multiplies each input value xi by the corresponding link weight wji (i=1, 2, 3, . . . , n) and calculates their sum yj as shown by the following expression (1):
$$y_j = \sum_i w_{ji}\, x_i \qquad (1)$$
It is noted that the symbol Σ denotes the sum over i. The link weight wji indicates the strength of coupling between the neurons. Specifically, wji indicates the link weight between the j-th neuron and the i-th neuron.
The computing section then applies a non-linear function f to the sum yj found as described above and outputs an output value zj, as expressed by the following expression (2):
$$z_j = f(y_j) \qquad (2)$$
A sigmoid function is often used as the non-linear function f because its derivative f′, which is required in realizing the learning function described next, can be expressed by the function f itself as f′=f·(1−f), which allows the amount of computation to be reduced. A step function is used as the non-linear function f in some cases. However, the non-linear function is not limited to these functions and may be any monotonically increasing function having a saturating characteristic.
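The neuron computation of the expressions (1) and (2) may be sketched as follows, purely for illustration. The logistic form of the sigmoid, the function names and the numerical values of the link weights and inputs are assumptions chosen for this sketch and do not appear in the description above. The sketch also checks the relation f′=f·(1−f) numerically against a finite difference.

```python
import math

def sigmoid(y):
    """Logistic sigmoid, one common choice for the non-linear function f."""
    return 1.0 / (1.0 + math.exp(-y))

def neuron_output(weights, inputs):
    """Expression (1): weighted sum y_j; expression (2): z_j = f(y_j)."""
    y = sum(w * x for w, x in zip(weights, inputs))
    return y, sigmoid(y)

# Hypothetical link weights w_ji and input values x_i, for illustration only.
y, z = neuron_output([0.5, -0.25, 1.0], [1.0, 2.0, 0.5])

# Compare f' = f * (1 - f) with a central finite difference of f at y.
h = 1e-6
numeric_derivative = (sigmoid(y + h) - sigmoid(y - h)) / (2 * h)
analytic_derivative = z * (1.0 - z)
```

Because the derivative is obtained from the already computed output z by one multiplication, no separate evaluation of the exponential is needed during learning.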
The neural network having such neurons as the structural unit is characterized in that it has the learning function. This learning function will be explained next.
The learning in the neural network is realized by updating the link weights of each neuron described above. That is, when the value of each link weight w is updated sequentially to an adequate value, a desired output signal z can be obtained from the output layer when a set of input signals (input pattern) p is given to the input layer.
In executing learning, a desired output signal t corresponding to the input signal p is given together with the input signal p. This output signal t is referred to as a teacher signal. A learning method called back propagation (BP) algorithm is used in the hierarchical neural network shown in FIG. 15.
The back propagation (BP) will be explained concretely.
When a certain input signal p is given, the square error between the output value zk of the k-th neuron of the output layer and the teacher signal value tk is defined by the following expression:
$$E_k = \frac{(t_k - z_k)^2}{2} \qquad (3)$$
In the learning, all the link weights are updated so as to reduce this square error Ek.
When the link weight wkj between the j-th neuron in the intermediate layer and the k-th neuron in the output layer is updated, the square error Ek varies as follows:
$$\frac{\partial E_k}{\partial w_{kj}} = -(t_k - z_k)\cdot f'(y_k)\cdot z_j \qquad (4)$$
Here, zj is the output of the j-th neuron of the intermediate layer and f′ is the derivative of the non-linear function of the neuron. yk is the input sum, expressed by the expression (1) described above, of the k-th neuron of the output layer.
When the link weight wji between the i-th neuron in the input layer and the j-th neuron in the intermediate layer is updated, the square error Ek varies as follows:
$$\frac{\partial E_k}{\partial w_{ji}} = -\Bigl\{\sum_k (t_k - z_k)\cdot f'(y_k)\cdot w_{kj}\Bigr\}\cdot f'(y_j)\cdot z_i \qquad (5)$$
Here, the symbol Σ denotes the sum over k. zi is the output of the i-th neuron of the input layer and yj is the input sum, expressed by the expression (1) described above, of the j-th neuron of the intermediate layer.
Accordingly, the amounts of update Δwkj and Δwji of the link weights for reducing the square error may be expressed by the following expressions:
$$\Delta w_{kj} = w_{kj}(t+1) - w_{kj}(t) = -\eta\cdot\frac{\partial E_k}{\partial w_{kj}} \qquad (6)$$
$$\Delta w_{ji} = w_{ji}(t+1) - w_{ji}(t) = -\eta\cdot\frac{\partial E_k}{\partial w_{ji}} \qquad (7)$$
It is noted that t indicates time here. η is a positive number called an update weight and is normally determined experimentally in a range of 0.1 to 0.5.
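One update of the link weights according to the expressions (3) through (7) may be sketched as follows for a three-layered network. This is a minimal sketch under stated assumptions: the sigmoid is taken as the non-linear function f, the network size, initial link weights, input pattern, teacher signal and the value of η are hypothetical, and the sum over k in the expression (5) is read as the derivative of the square error summed over all neurons of the output layer.

```python
import math

def f(y):
    """Sigmoid, assumed here as the non-linear function f."""
    return 1.0 / (1.0 + math.exp(-y))

def forward(w_hid, w_out, x):
    """Sequential propagation: expressions (1) and (2) in each layer."""
    z_hid = [f(sum(w * xi for w, xi in zip(row, x))) for row in w_hid]
    z_out = [f(sum(w * zj for w, zj in zip(row, z_hid))) for row in w_out]
    return z_hid, z_out

def total_error(z_out, t):
    """Sum over k of expression (3)."""
    return sum((tk - zk) ** 2 / 2.0 for tk, zk in zip(t, z_out))

def bp_step(w_hid, w_out, x, t, eta):
    """Return link weights updated once by expressions (4) to (7)."""
    z_hid, z_out = forward(w_hid, w_out, x)
    # (t_k - z_k) * f'(y_k), using f' = f * (1 - f)
    d_out = [(tk - zk) * zk * (1.0 - zk) for tk, zk in zip(t, z_out)]
    # {sum_k (t_k - z_k) f'(y_k) w_kj} * f'(y_j), as in expression (5)
    d_hid = [
        sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
        * z_hid[j] * (1.0 - z_hid[j])
        for j in range(len(w_hid))
    ]
    # Expressions (6) and (7): delta_w = -eta * dE/dw
    new_out = [[w_out[k][j] + eta * d_out[k] * z_hid[j]
                for j in range(len(z_hid))] for k in range(len(w_out))]
    new_hid = [[w_hid[j][i] + eta * d_hid[j] * x[i]
                for i in range(len(x))] for j in range(len(w_hid))]
    return new_hid, new_out

# Hypothetical input pattern p, teacher signal t and initial link weights.
x = [1.0, 0.5]
t = [1.0]
w_hid = [[0.1, -0.2], [0.3, 0.4]]
w_out = [[0.2, -0.1]]

error_before = total_error(forward(w_hid, w_out, x)[1], t)
w_hid_new, w_out_new = bp_step(w_hid, w_out, x, t, eta=0.3)
error_after = total_error(forward(w_hid_new, w_out_new, x)[1], t)
```

In this sketch the square error after the update is smaller than before it, which is the behavior the update rule is intended to produce for a suitably small η.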
While the outline of the neural network has been explained above, the question in configuring the neural network is how to realize the function of the neuron described above.
Hitherto, the function of the neuron has often been realized by software processing on a von Neumann type computer. In this case, however, the inherently parallel information processing is not carried out because a CPU executes the processes of the plurality of neurons in a time division manner.
Therefore, there has been proposed a technology for configuring neurons by using hardware. The hardware methods are roughly divided into two: those using an analog circuit and those using a digital circuit.
The use of the analog circuit is advantageous in that, because the area of the neuron may be reduced, the integration of the neural network and the signal processing speed may be increased. However, because the value of each signal is expressed by an analog quantity such as potential or current and each arithmetic operation is executed by analog elements such as amplifiers, variation due to temperature characteristics and variation of the processes in forming the elements pose problems. As a result, it has had the disadvantages that the response characteristics of the elements cannot be made uniform and that a desired output value cannot be obtained.
Meanwhile, when the neuron is constructed by using the digital circuit, it has had the disadvantage that the integration of the neural network cannot be increased because the area of the neuron is large as compared to that of the analog circuit. However, it has had the advantages that its reliability is high, because it is not affected by the variation of the temperature characteristics or of the processes in forming the elements, and that the circuit may be formed relatively readily.
As the technology for constructing the neuron by using the digital circuit, there has been one disclosed in Japanese Patent Laid-Open No. Hei. 7-114524. In this technology, a concept of pulse density has been adopted in realizing the function of the neuron by the digital circuit. However, there has been the following problem when the pulse density is used.
That is, the couplings between the neurons of the neural network include excitative and suppressive couplings, which are expressed mathematically by the positive and negative signs of the link weight, but they cannot be distinguished when the pulse density is used. That is, although the technology in Japanese Patent Laid-Open No. Hei. 7-114524 is arranged so as to be able to express "0 to 1" by the pulse density, it is necessary to express signals corresponding to "−1 to 1" in order to express the coupling between the neurons of the neural network. Therefore, this technology divides the respective couplings into two groups, excitative coupling and suppressive coupling, according to the plus or minus sign of the link weight. As a result, two systems of signal lines are required from the synapse circuit to the nerve cell circuit defined in Japanese Patent Laid-Open No. Hei. 7-114524.
The present invention has been devised to solve the above-mentioned problems and its object is to reduce a circuit area of a neural network.
Its second object is to contribute to the reduction of circuit area of a neural network by enabling the excitative and suppressive couplings to be expressed by one signal by adopting a concept of pulse delay time, not the pulse density, for a signal processed by the neurons in digitally configuring the neural network.
According to the present invention, a neuron is a structural unit of a hierarchical neural network realized as a digital electronic circuit. This neuron is modeled after a neuron of a living body.
A pulse train composed of a predetermined number of pulses is inputted to the neuron from the outside as an input signal. The delay time of each pulse within this pulse train follows a normal distribution of average x. The delay time of each pulse is a delay time from a corresponding reference pulse within a reference pulse train.
FIG. 2 shows a pulse train x1 as an input signal (input 1) for example. The pulse train x1 is composed of m (predetermined number) pulses s11, s12, . . . , s1m. The delay times d11, d12, . . . , d1m of the respective pulses from the reference pulses T1, T2, . . . , Tm within the reference pulse train follow the normal distribution of average x1.
It is noted that the "delay time" here may take a negative value. For instance, d1k (k=1, 2, 3, . . . , m) takes a negative value when a pulse s1k of the pulse train x1 precedes the corresponding reference pulse Tk.
Although the reference pulses are considered here to occur at fixed time intervals, they need not do so. Although it is possible to generate the reference pulses outside the neuron and input them to the neuron, the neuron may instead be provided with reference pulse generating means that generates the reference pulses within the neuron.
It is noted that the "pulse delay time" simply mentioned in the following explanation refers to the delay time from the corresponding reference pulse as described above.
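An input pulse train of the kind described above may be modeled in software as follows, by way of a hedged illustration only. The standard deviation sigma, the fixed reference interval of 10.0 time units, the number of pulses and the function names are assumptions introduced for this sketch; the description above specifies only the average of the normal distribution.

```python
import random

def make_pulse_train(reference_times, mean_delay, sigma, rng):
    """Return pulse times whose delays from the reference pulses are drawn
    from a normal distribution with the given average.

    sigma is an assumed parameter; the text fixes only the average.
    """
    return [t_ref + rng.gauss(mean_delay, sigma) for t_ref in reference_times]

def delays(pulse_times, reference_times):
    """Delay of each pulse from its corresponding reference pulse.

    A negative delay means the pulse precedes its reference pulse.
    """
    return [p - t_ref for p, t_ref in zip(pulse_times, reference_times)]

rng = random.Random(0)                     # fixed seed for repeatability
m = 1000                                   # predetermined number of pulses
T = [10.0 * k for k in range(1, m + 1)]    # reference pulses T1..Tm (fixed interval assumed)

x1 = make_pulse_train(T, mean_delay=0.5, sigma=1.0, rng=rng)
d1 = delays(x1, T)

mean_d1 = sum(d1) / m                      # sample average of the delay times
n_negative = sum(1 for d in d1 if d < 0)   # pulses preceding their reference pulse
```

With these assumed values, the sample average of the delay times lies near 0.5, and a portion of the pulses have negative delay times because they precede their reference pulses.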
When a pulse train as the input signal having such characteristics is inputted, the neuron of the present invention operates as follows.
First, multiplication corresponding value calculating means finds, for each pulse train serving as an input signal, multiplication corresponding values following a normal distribution of average wx by using the link weight w corresponding to that pulse train. Then, adding means adds the multiplication corresponding values found by the multiplication corresponding value calculating means across the pulse trains. Here, the adding means takes one multiplication corresponding value at a time out of the set for each pulse train, so that no value is used more than once, and adds the values taken out. For instance, if the multiplication corresponding values are found in a time series manner by the multiplication corresponding value calculating means, the adding means adds the values at the same position in the respective time series of multiplication corresponding values.
Accordingly, when m pulse trains are inputted as input signals and the delay time of each pulse in the i-th pulse train follows the normal distribution of the average xi (i=1, 2, 3, . . . , m: the same applies hereinafter), the added values of the adding means follow a normal distribution of the average Σwixi, where wi is the link weight corresponding to each pulse train.
The arithmetic operation of the multiplication corresponding value calculating means and the adding means corresponds to the arithmetic operation shown in the expression (1).
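The statistical behavior of the two means described above may be checked with the following sketch. It assumes, only for illustration, that the multiplication corresponding value is obtained by scaling each delay time by the link weight; the description above states only the resulting distribution and not how the circuit obtains it. The number of pulses, the standard deviation, the averages and the link weights are likewise hypothetical.

```python
import random

def multiplication_values(delay_times, w):
    """Assumed realization: scale each delay time by the link weight w, so that
    delays following N(x, sigma^2) give values following N(w*x, (w*sigma)^2)."""
    return [w * d for d in delay_times]

def add_position_wise(value_strings):
    """Add the k-th value of every string, so that no value is used twice."""
    return [sum(vals) for vals in zip(*value_strings)]

rng = random.Random(1)          # fixed seed for repeatability
n_pulses = 2000                 # pulses per train (hypothetical)
sigma = 0.1                     # assumed standard deviation of the delay times
averages = [0.5, -0.2, 0.3]     # averages x_i of three input pulse trains
weights = [1.0, 2.0, -0.5]      # link weights w_i, one negative (suppressive)

delay_trains = [[rng.gauss(x, sigma) for _ in range(n_pulses)] for x in averages]
products = [multiplication_values(d, w) for d, w in zip(delay_trains, weights)]
added = add_position_wise(products)

expected_average = sum(w * x for w, x in zip(weights, averages))
observed_average = sum(added) / n_pulses
```

The observed average of the added values lies close to Σwixi, and a suppressive coupling is expressed simply by a negative link weight on the same single signal.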
Non-linear operating means counts the number of positive values among the added values obtained by the adding means. The positive values here may or may not include "0". That is, whether the boundary value "0" is included has little influence on the counted value as a whole. This number corresponds to the integral of the probability density of the normal distribution of the average Σwixi over the region in which the added value (the computed value of the delay time of each pulse) is positive. Accordingly, it gives rise to non-linearity. The arithmetic operation of this non-linear operating means corresponds to the operation shown in the expression (2).
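The non-linearity produced by counting positive values may be illustrated by the following sketch, which compares the counted fraction with the integral of a normal probability density over the positive region. The average mu, the standard deviation s of the added values, the sample size and the function names are assumptions made for this sketch.

```python
import math
import random

def count_positive(values, include_zero=False):
    """Count the positive values; the boundary value 0 may or may not be included."""
    if include_zero:
        return sum(1 for v in values if v >= 0)
    return sum(1 for v in values if v > 0)

def normal_cdf(u):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

rng = random.Random(2)   # fixed seed for repeatability
n = 5000                 # number of added values (hypothetical)
mu, s = 0.3, 1.0         # assumed average and standard deviation of the added values

added_values = [rng.gauss(mu, s) for _ in range(n)]
fraction = count_positive(added_values) / n

# Probability that a value following N(mu, s^2) is positive.
predicted = normal_cdf(mu / s)
```

Because the predicted fraction is a saturating, monotonically increasing function of the average mu, the count behaves like a sigmoid-shaped non-linear function of the weighted sum.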
Then, pulse train generating means generates a pulse train composed of a predetermined number of pulses. The delay time of each pulse of this pulse train follows a normal distribution whose average is a delay time determined based on the number of positive values counted by the non-linear operating means.
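The generation of the output pulse train may be sketched as follows. The description above does not state how the counted number is converted into an average delay time, so the linear mapping onto the range from −d_max to d_max used here is purely hypothetical, as are the parameter names, the standard deviation and the reference pulse times.

```python
import random

def count_to_mean_delay(count, n_pulses, d_max):
    """Hypothetical linear map from a count in [0, n_pulses] to an average
    delay time in [-d_max, d_max]; not taken from the description."""
    return (2.0 * count / n_pulses - 1.0) * d_max

def generate_output_train(count, reference_times, d_max, sigma, rng):
    """Generate pulses whose delays from the reference pulses follow a normal
    distribution whose average is determined by the count."""
    mean = count_to_mean_delay(count, len(reference_times), d_max)
    return [t_ref + rng.gauss(mean, sigma) for t_ref in reference_times]

rng = random.Random(3)                            # fixed seed for repeatability
reference = [10.0 * k for k in range(1, 1001)]    # assumed reference pulse times
output_train = generate_output_train(750, reference, d_max=1.0, sigma=0.2, rng=rng)

output_delays = [p - t_ref for p, t_ref in zip(output_train, reference)]
mean_output_delay = sum(output_delays) / len(output_delays)
```

Under this assumed mapping, a count of 750 out of 1000 gives an average output delay time near 0.5, and the output pulse train has the same form as the input pulse trains, so that it can serve as an input to a neuron of the next layer.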
According to the invention, needless to say, the use of the digital circuit is advantageous in that it allows completely parallel processing to be achieved, allows the circuit to be formed relatively easily without being influenced by temperature characteristics or by the variation of processes in forming devices, and allows high reliability to be obtained.