1. Field of the Invention
The present invention relates to a nonlinear function generator for expressing nonlinear functions by an electronic circuit, and more particularly, it relates to a semiconductor nonlinear function generator for expressing nonlinear functions which are employed for pattern recognition/sorting and signal analysis by an electronic circuit. More specifically, it relates to a nonlinear function generator for implementing nonlinear threshold processing of neurons in a neural network by an electronic circuit.
2. Description of the Background Art
As hereinafter described, nonlinear functions are utilized in various fields of information processing such as pattern recognition and signal analysis. One information processing technique utilizing such nonlinear functions employs a neural network. The neural network is modelled on biological nerve cells (neurons), and various computational techniques have been proposed for such a neural network. The neuron model employed in common by these computational techniques is characterized by a threshold processing function. Briefly stated, the threshold processing function is a nonlinear conversion from an input to an output, and is known as an important function in the information processing of a neural network. This threshold processing function is now described.
First, terms required for the following description are defined as follows:
Wij: Connection load value (synapse load value) indicating strength of connection (synapse connection) between j-th and i-th neurons.
Si: Output of the i-th neuron.
Uj: Total sum of weighted inputs in the j-th neuron, also called a film potential.
f(): Nonlinear conversion function employed for producing an output from the film potential in each neuron.
FIG. 34 illustrates the structure and the operation principle of a general neuron model. Referring to FIG. 34, a neuron unit i includes an input part A receiving outputs (state signals) Sj, Sk, . . . , Sm from other neuron units j, k, . . . , m, a conversion part B (corresponding to a threshold function) for converting signals received from the input part A in accordance with a predetermined rule, and an output part C for outputting a signal received from the conversion part B.
The input part A has prescribed synapse loads W (synapse loads are hereinafter generically denoted by symbol W) with respect to the neuron units j, k, . . . , m, for weighting the output signals of the neuron units with the corresponding synapse loads and transmitting the same to the conversion part B. For example, the output signal Sk from the neuron unit k is multiplied by a synapse load value Wik in the input part A and converted to Wik·Sk, and thereafter transmitted to the conversion part B.
The conversion part B obtains the total sum of the signals received from the input part A, and fires when the total sum satisfies certain conditions, to transmit a signal to the output part C. In this neuron unit model, the input part A, the conversion part B and the output part C correspond to a dendrite, a cell body and an axon of a biological nerve cell respectively.
In this neuron model, output states which can be implemented by each neuron vary with the model as applied. In a Hopfield model or a Boltzmann machine, it is assumed that each neuron unit enters two states, i.e., Si=0 (non-firing state) and Si=1 (firing state). In a perceptron model or the like, it is assumed that outputs of neuron units generally take continuous values in a range of 0 to 1.
Signal input/output relations in a neuron unit are generally expressed as follows:
$$U_i = \sum_j W_{ij} \cdot S_j + \theta_i$$
$$S_i = f(U_i)$$
where $\theta_i$ represents a self connection factor, and the total sum $\sum$ is calculated over all inputs j to the i-th neuron unit. Therefore, it is interpreted that each neuron unit takes a sum of products of all inputs received therein with corresponding synapse loads and carries out threshold processing, i.e., nonlinear conversion processing of a film potential Ui obtained as the result, to generate an output. In other words, a neuron unit is an arithmetic unit having a multi-input, one-output threshold processing function. It is considered that the nonlinear threshold processing in this neuron unit is one of the factors by which a neural network implements a flexible information processing function of high quality.
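The input/output relation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the numeric values, and the choice of a two-state step function for f() (as in the two-state models mentioned above) are assumptions and do not come from the cited art.

```python
def step(u):
    """Hypothetical two-state threshold: 1 (firing) if the film potential is positive, else 0."""
    return 1 if u > 0 else 0

def neuron_output(weights, inputs, theta, f=step):
    """Compute Si = f(Ui), where Ui = sum_j Wij * Sj + theta_i."""
    u = sum(w * s for w, s in zip(weights, inputs)) + theta
    return f(u)

# Illustrative values for one neuron unit with three inputs (assumed, not from the source).
W_i = [0.5, -1.0, 2.0]   # synapse load values Wij
S = [1.0, 0.0, 1.0]      # outputs Sj of the connected neuron units
theta_i = -0.5           # self connection factor
S_i = neuron_output(W_i, S, theta_i)   # film potential is 0.5 + 0.0 + 2.0 - 0.5 = 2.0
print(S_i)
```

Any other nonlinear conversion function can be passed as `f` without changing the sum-of-products part, which mirrors the separation between the input part A and the conversion part B.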
While various types of nonlinear conversion are employed in conventional neural networks, saturated nonlinear functions are employed in most of them. A saturated nonlinear function implements such conversion that the output f(x) with respect to an input x satisfies $a < f(x) < b$, where a and b represent constant values, over the entire range of the input x. Monotonic non-decreasing functions are employed as nonlinear functions for a neural network, and a typical example of such a function is a sigmoid function, which is expressed as follows:
$$f(x) = \frac{1}{1 + \exp(-x/T)}$$
where T represents the temperature of the network, and 1/T is also called a slope of the function.
FIG. 35 illustrates the shape of a sigmoid function with a temperature T at 1. The sigmoid function rises more gently as the temperature T is increased. As clearly understood from the aforementioned expression of the sigmoid function and the shape shown in FIG. 35, the sigmoid function (conversion) f(x) is a saturated nonlinear function (conversion) having a value range of $0.0 < f(x) < 1.0$. The value of this sigmoid function changes sharply only within a considerably limited range of the input variable.
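The effect of the temperature T on the slope and on saturation can be checked numerically. The sketch below assumes the common logistic form f(x) = 1/(1 + exp(-x/T)); the sample points and temperatures are chosen only for illustration.

```python
import math

def sigmoid(x, T=1.0):
    """Logistic sigmoid with temperature T (assumed form); 1/T acts as the slope."""
    return 1.0 / (1.0 + math.exp(-x / T))

# A larger T gives a gentler rise; every output stays strictly between 0 and 1.
for T in (0.5, 1.0, 2.0):
    print(T, [round(sigmoid(x, T), 4) for x in (-8, -1, 0, 1, 8)])
```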
Various methods have heretofore been proposed as to implementation of a threshold processing function with an electronic circuit in the aforementioned neuron unit. Such conventional methods are roughly classified into those employing analog electronic circuits and those employing digital electronic circuits. Description is now made on methods of implementing threshold processing functions with digital electronic circuits, which are related to the present invention.
The methods of implementing nonlinear conversion with digital electronic circuits are roughly classified into two types. One is a technique of approximating nonlinear functions through series (function) expansion, and the other one is a function table reference method (lookup table method) of storing function values in a memory and employing the memory as a lookup table. The approximation technique employing series expansion is first described.
The approximation technique employing series expansion is a most general and accurate operating technique among the methods approximating series-expandable nonlinear functions. The aforementioned sigmoid function f(x) is approximated in proximity (h) to zero as follows, for example: ##EQU2##
The advantages of this technique are that approximation of extremely high accuracy can be implemented by increasing the number of approximation terms, and that the operations can be carried out with only a sum-of-products function regardless of the input data form, such as integer representation, fixed-point representation or floating-point representation.
However, the approximation technique employing series expansion has such a disadvantage that a large number of processing steps are required for the operations. For example, approximation employing N terms requires (2N+1) operation steps at the minimum, due to requirement for operation steps for calculating values of the respective terms and total sums of these values with repetition of addition every two terms. When a nonlinear conversion circuit is formed in accordance with this approximation technique, therefore, a large number of clock cycles are required for nonlinear conversion processing and hence it is impossible to execute conversion processing (threshold conversion processing) at a high speed.
A structure for implementing sigmoid conversion (calculation of output values employing sigmoid functions) by an electronic circuit with series expansion is disclosed in "A VLSI Processor Architecture for a Back-Propagation Accelerator" by Hirose et al., IEICE Transactions on Electronics, Vol. E75-C, No. 10, October 1992, pp. 1223-1230, for example. A sigmoid function f(Ui) employed in this literature is expressed as follows: ##EQU3##
This sigmoid function f(Ui) is approximated through the following expressions: ##EQU4## where the coefficient Bn is a Bernoulli number. In the aforementioned approximation expressions, the output f(Ui) is decided depending on the sign of the film potential Ui:
$$f(U_i) = g(U_i), \quad U_i > 0$$
$$f(U_i) = 1 - g(U_i), \quad U_i \le 0$$
In series expansion, a variable area (range of values of input variables) effectuating approximation is extremely limited. Therefore, the approximation expressions as employed are varied with the variable areas, as described above. Thus, a plurality of approximation expressions are required for approximating a single sigmoid function to execute separate calculations in accordance with the variable areas, leading to increase in calculation cost.
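The limited range of validity of a series expansion can be illustrated with a plain Maclaurin expansion of the logistic sigmoid (T = 1). This is a swapped-in illustration, not the Bernoulli-number expressions of the cited literature; the coefficients follow from f(x) = 1/2 + (1/2)·tanh(x/2), and the function names are assumptions.

```python
import math

# Maclaurin terms of 1/(1 + exp(-x)) as (power, coefficient) pairs.
COEFFS = [(0, 1 / 2), (1, 1 / 4), (3, -1 / 48), (5, 1 / 480), (7, -17 / 80640)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_series(x, n_terms=5):
    """Sum of the first n_terms series terms (a sum-of-products operation only)."""
    return sum(c * x ** p for p, c in COEFFS[:n_terms])

# The error is tiny near zero and grows rapidly once |x| approaches pi,
# the radius of convergence of this expansion.
for x in (0.5, 1.0, 2.0, 4.0):
    print(x, abs(sigmoid_series(x) - sigmoid(x)))
```

Outside the narrow region around zero a different approximation expression is needed, which is the source of the added calculation cost described above.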
The lookup table method is now described.
FIG. 36 schematically illustrates a structure for implementing the lookup table method. Referring to FIG. 36, a lookup table 982 which is formed by a memory stores all data to be referred to. An address decoder 981 performs necessary working on input values and generates addresses for the lookup table 982, for supplying the same to the lookup table 982. Data read from addressed positions of the lookup table 982 are outputted through an output buffer 983.
When nonlinear conversion is executed on an input value, the nonlinear function value is stored in advance at the memory position whose address is the numeric value corresponding to the input value. The address decoder 981 performs necessary working such as truncation of lower bits, for example, on this input value and generates an address. The lookup table 982 outputs the nonlinear function value corresponding to the input value, for output through the output buffer 983. The feature of this lookup table method resides in that the processing speed can be readily increased as compared with the aforementioned method based on series expansion, since nonlinear conversion processing can basically be carried out in the time required for reading data from the memory.
FIG. 37 illustrates the structure of a conventional neural network employing the aforementioned lookup table method. The structure shown in FIG. 37 is disclosed in "A Self-Learning Digital Neural Network Using Wafer-Scale LSI" by Yasunaga et al., IEEE Journal of Solid-State Circuits, Vol. 28, No. 2, February 1993, pp. 106-114, for example.
The neural network shown in FIG. 37 includes four neuron units 990, for example. Each neuron unit 990 has a synapse circuit 992 including a memory storing synapse load values for multiplying input values by corresponding synapse load values, and a neuron circuit 991 for obtaining the total sum of outputs of the synapse circuit 992 for calculating a film potential. The neuron circuits 991 of the neuron units 990 are connected to an input part of a conversion table 994 in common through a time sharing input data bus 993. An output part of the conversion table 994 is connected to the respective synapse circuits 992 of the neuron units 990 through a time division multiplexing output data bus 996. A control circuit 997 is adapted to generate operation timings in the neuron units 990 and addresses for the memories included in the synapse circuits 992.
The conversion table 994 stores output values with input values of functions employed for threshold conversion processing as addresses.
In operation, outputs of lower layer neuron units are successively transmitted onto the output data bus 996. The neuron units 990 execute sum of product operations on output data supplied onto the output data bus 996 in accordance with address signals and control signals received from the control circuit 997 through a control signal bus 995. When the sum of product operations are completed and the neuron units 990 have calculated film potentials respectively, the outputs of the neuron units 990 are successively transmitted onto the time division multiplexing input data bus 993 under control by the control circuit 997, to be supplied to the conversion table 994. The conversion table 994 transmits values stored in corresponding positions onto the output data bus 996, employing the outputs received on the input data bus 993 as address signals. The conversion table 994 stores conversion function values, as described above. Thus, the outputs of the neuron units 990 are successively outputted onto the time division multiplexing output data bus 996 at a high speed in accordance with the film potential data supplied onto the time division multiplexing input data bus 993. When the neuron units 990 also express those of other layers in the neural network shown in FIG. 37, the data outputted from the conversion table 994 are again transmitted in series onto the time division multiplexing output data bus 996. Thus, operations are executed in a next layer.
As shown in FIG. 37, the conversion table 994 outputs corresponding function values onto the output data bus 996 with the film potential data received on the time division multiplexing input data bus 993 as addresses. According to this structure, therefore, addresses for the conversion table 994 can be extremely readily generated so that it is possible to execute nonlinear conversion only in a time required for reading the data from the conversion table 994.
As described above, it is possible to execute nonlinear conversion processing at a high speed by employing the lookup table method. In this method, however, it is necessary to store all basically required data in the memory, and hence the occupied area of the conversion table 994 is increased to obstruct integration. When the conversion table 994 is arranged in an integrated circuit for structuring a neural network, the storage capacity cannot be sufficiently increased in consideration of the occupied area. Thus, it is difficult to store all conversion data and hence this method is inferior in conversion accuracy to the method employing series expansion.
In the lookup table method utilizing such a conversion table, further, the numeric data as employed are expressed in various forms, and numeric data of floating-point representation are generally employed when high operation accuracy is required. Description is now made on problems of the lookup table method which is applied to such numeric data in floating-point representation. Before the problems are explained specifically, numeric data representation systems, i.e., floating-point representation and fixed-point representation are described.
A numeric value of floating-point representation is expressed under a normalization format of IEEE as follows:
$$(-1)^S \cdot (1.00 + F) \cdot 2^E$$
where S represents a sign flag, E represents an exponent part (a characteristic), and F represents a mantissa. The mantissa F is expressed in the form of $\sum_i a_i \cdot (1/2)^i$, where the coefficient $a_i$ is a numeric value of 0 or 1.
FIG. 38 illustrates the structure of standard 32-bit (single precision) floating point numeric data. As shown in FIG. 38, a sign part 191 is formed by a 1-bit sign flag and an exponent part 192 expresses the exponent E in eight bits, and a mantissa 193 expresses the mantissa F in 23 bits.
FIG. 39 illustrates the structure of standard fixed-point numeric data. As shown in FIG. 39, the fixed-point numeric data is formed by a 1-bit sign flag forming a sign part 201, and a mantissa 202 formed by remaining 15 bits. A decimal point P can be set in an arbitrary position. However, the position of the decimal point P as set is thereafter fixed. A certain numeric value is now specifically expressed in floating-point representation and fixed-point representation.
$$3 = (-1)^0 \cdot (1.00 + 0.50) \cdot 2^1,$$
and hence 32-bit floating-point representation is as follows:
3 = 0 10000000 10000000000000000000000
On the other hand, 16-bit fixed-point representation is as follows, for example:
3 = 000011.0000000000
In the floating-point representation, the minimum value -127 of the exponent part is made to correspond to "00000000", and the maximum value 128 is made to correspond to "11111111". Another form may be employed. The floating-point representation requires three variables of a 1-bit sign part, an 8-bit exponent part and a 23-bit mantissa, in order to express a single numeric value. On the other hand, the fixed-point representation requires two variables of a 1-bit sign part and a 15-bit mantissa.
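The 32-bit pattern given above for the value 3 can be reproduced with Python's standard `struct` module. The helper name `float32_fields` is an assumption introduced for this sketch; the offset of 127 follows from the correspondence of -127 to "00000000" stated above.

```python
import struct

def float32_fields(x):
    """Return the (sign, exponent, mantissa) bit strings of x stored as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = f"{bits:032b}"
    return s[0], s[1:9], s[9:]

sign, exponent, mantissa = float32_fields(3.0)
print(sign, exponent, mantissa)

# Recover E and F of (-1)^S * (1.00 + F) * 2^E from the stored fields.
E = int(exponent, 2) - 127       # stored exponent is offset by 127
F = int(mantissa, 2) / 2 ** 23   # 23-bit fraction
print(E, F)
```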
The floating-point representation can express numeric values in a wide range. In the case of 32 bits, numeric values can be expressed in a range of $2^{-127}$ to $2^{128}$ in absolute values since the exponent part is formed by eight bits. Accuracy of a numeric value expressed in this range varies greatly with the value of the exponent part. Consider 32-bit numeric data having the aforementioned format. When the exponent part has a constant 8-bit value E0, the following numeric values Z0 can be expressed:
$$1.00 \cdot 2^{E0} \le Z0 \le (1.00 + 2^{-1} + 2^{-2} + \cdots + 2^{-23}) \cdot 2^{E0}$$
This value range includes $2^{23}$ numeric values, with an expression accuracy (resolution) of $2^{E0}/2^{23}$. Namely, values of $2^{E0}$ to $2^{E0+1}$ are uniformly divided by the 23 bits of the mantissa, to express numeric values of this value range.
Namely, the resolution of the floating-point representation is varied with the value E0 of the exponent part, to be improved as the value E0 is decreased. When the value E0 is small, therefore, numeric values may be too finely expressed beyond necessity as compared with accuracy required by the system to which this representation is applied. In the floating-point representation, further, numeric operations may be extremely complicated since numeric expression is varied with the value of the exponent part, as described later again in detail.
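The dependence of the resolution on the exponent value E0 can be confirmed by comparing the expression $2^{E0}/2^{23}$ with the gap between adjacent 32-bit floating-point values. The helper names below are assumptions made for this sketch.

```python
import struct

def float32_spacing(e0, mantissa_bits=23):
    """Resolution 2**e0 / 2**23 of the range from 2**e0 up to 2**(e0 + 1)."""
    return 2.0 ** e0 / 2 ** mantissa_bits

def next_float32(x):
    """Next representable 32-bit float above a positive x (bit pattern plus one)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    (y,) = struct.unpack(">f", struct.pack(">I", bits + 1))
    return y

# The spacing shrinks by half each time the exponent decreases by one.
for e0 in (2, 0, -4, -15):
    x = 2.0 ** e0
    print(e0, float32_spacing(e0), next_float32(x) - x)
```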
In the fixed-point representation, on the other hand, the method of expressing numeric values is uniquely decided by decision of the maximum expressible value. Assuming that the maximum and minimum values in 16-bit representation are $\pm M0$, the range between the maximum and minimum values M0 and -M0 is uniformly divided by 16 bits to express numeric values. Thus, the resolution is $2 \cdot M0/2^{16} = M0/2^{15}$. In this case, a range of expressible positive numeric values Z0 is as follows:
$$M0/2^{15} \le Z0 \le M0$$
The fixed-point representation has such a disadvantage that the range of expressible numeric values is extremely limited, while the same has such an advantage that numeric values are regularly arranged on a number line at regular intervals of M0/2.sup.15, to extremely facilitate arithmetic operations (addition, subtraction, multiplication and division).
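A minimal sketch of 16-bit fixed-point encoding follows. It assumes a sign flag followed by a 15-bit magnitude (sign-magnitude form) with 10 fractional bits, which reproduces the example pattern for the value 3; the function names and the choice of 10 fractional bits are assumptions for illustration.

```python
def to_fixed(x, frac_bits=10, total_bits=16):
    """Encode x as a sign bit followed by a (total_bits - 1)-bit magnitude (assumed layout)."""
    magnitude = round(abs(x) * (1 << frac_bits))
    if magnitude >= 1 << (total_bits - 1):
        raise OverflowError("value exceeds the fixed-point range")
    sign = 1 if x < 0 else 0
    return (sign << (total_bits - 1)) | magnitude

def fixed_to_str(word, frac_bits=10, total_bits=16):
    """Render the word as a bit string with the decimal point P inserted."""
    s = f"{word:0{total_bits}b}"
    return s[: total_bits - frac_bits] + "." + s[total_bits - frac_bits:]

print(fixed_to_str(to_fixed(3.0)))

# With 10 fractional bits, M0 = 2**5 and the resolution is M0 / 2**15 = 2**-10.
M0 = 2 ** 5
print(M0 / 2 ** 15)
```

Because every word lies on a uniform grid of this resolution, addition and subtraction reduce to integer operations on the stored words.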
A method of forming a lookup table for sigmoid functions is now described.
Conversion with nonlinear functions such as sigmoid functions can be implemented by a lookup table through the saturation characteristics of the sigmoid functions. Namely, it is possible to implement a lookup table by restricting the input range to that in which the function values change significantly, storing the function values in this range in a memory with the input variables as addresses, and reading necessary values from the memory at any time.
As shown in FIG. 35, for example, function values of the sigmoid function f(x) vary significantly only in a narrow range of about $-8 \le x \le 8$, and can be approximated to zero or 1 in other ranges. Thus, the function values in the range of $-8 \le x \le 8$ may be stored in the memory. In this case, errors of the function values with respect to input values which are out of the range are about $3 \cdot 10^{-4}$ at most, while such errors are about $1 \cdot 10^{-7}$ at most assuming that the range of input variables is $-16 \le x \le 16$. Thus, it is understood that nonlinear conversion processing is not greatly influenced by restriction of the input range. Thus, the feature of saturated nonlinear functions such as sigmoid functions resides in that the input range can be restricted.
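The size of the error caused by restricting the input range can be checked directly. The sketch assumes the logistic sigmoid with T = 1; the largest error from replacing f(x) by 0 or 1 outside the range occurs at the range boundary, and the helper name `clip_error` is an assumption.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def clip_error(limit):
    """Largest error when f(x) is replaced by 0 below -limit or by 1 above +limit."""
    return sigmoid(-limit)   # equals 1 - sigmoid(limit) by symmetry

for limit in (8, 16):
    print(limit, clip_error(limit))
```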
Generation of memory addresses in this lookup table method is now described, first with reference to fixed-point representation.
In the fixed-point representation, numeric values are uniformly arranged in a range between maximum and minimum values at intervals decided by the bit number of the mantissa. Therefore, input variables themselves can be regarded as addresses, such that the film potential data on the input data bus 993 are expressed in the fixed-point representation and supplied to the conversion table 994 shown in FIG. 37.
Consider 16-bit numeric values in fixed-point representation. Assuming that the range of input values of nonlinear functions is $-8 \le x \le 8$ and the width 16 of this range is expressed in 14 bits (14-bit accuracy), the input values vary at intervals of $16/2^{14}$, i.e., about 0.0009765, whereby $2^{14}$ (=16384) addresses can be generated. When function values are expressed in 16 bits, therefore, it is possible to implement a lookup table for nonlinear conversion by employing a memory which can store $2^4 \cdot 2^{10}$ words (16K words), assuming that 1 word is formed by 16 bits. When the bit number of the mantissa is increased, the number of the addresses is also increased, to improve conversion accuracy. The feature of this method employing fixed-point representation resides in that addresses can be extremely readily generated since input values are uniformly digitized to be employed as addresses. However, an output value which corresponds to an input value between two adjacent addresses is rounded to the output value of one of the two addresses on either side of the input value.
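A small software model of such a table is sketched below under stated assumptions: a 14-bit address over $-8 \le x \le 8$, function values quantized to unsigned 16-bit words, truncation toward the lower address, and clamping of out-of-range inputs. None of these implementation details is taken from the cited circuits.

```python
import math

X_MIN, X_MAX, ADDR_BITS = -8.0, 8.0, 14
N = 1 << ADDR_BITS               # 2**14 = 16384 addresses
STEP = (X_MAX - X_MIN) / N       # 16 / 2**14, about 0.0009765

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One 16-bit word per address (unsigned scaling is an assumption of this sketch).
TABLE = [round(sigmoid(X_MIN + a * STEP) * 0xFFFF) for a in range(N)]

def address(x):
    """Uniformly digitize x into a table address, clamping out-of-range inputs."""
    a = int((x - X_MIN) / STEP)
    return min(max(a, 0), N - 1)

def lookup(x):
    return TABLE[address(x)] / 0xFFFF

for x in (-7.9, -1.234, 0.0005, 3.3, 7.999):
    print(x, lookup(x), abs(lookup(x) - sigmoid(x)))
```

Within the table range the error is bounded by roughly the address interval times the maximum slope 1/4 of the sigmoid, plus the 16-bit quantization of the stored word.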
Description is now made on address generation in a case of forming a lookup table through floating-point representation. A method of generating addresses of a lookup table from input values in floating-point representation requires more attention as compared with the aforementioned fixed-point representation.
First, description is made on how input values are expressed. The range $-8 \le x \le 8$ of input values described above with reference to the fixed-point representation is not uniformly divided in floating-point representation, unlike in the fixed-point representation, but is divided as follows:
$$4 \le x \le 8 \qquad (1)$$
As the exponent part E is constant at 2, this range is uniformly divided by 23 bits of the mantissa.
$$2 \le x \le 4 \qquad (2)$$
As the exponent part E is constant at 1, this range is uniformly divided by 23 bits of the mantissa.
$$1 \le x \le 2 \qquad (3)$$
As the exponent part E is constant at 0, this range is uniformly divided by 23 bits of the mantissa.
$$0.5 \le x \le 1 \qquad (4)$$
As the exponent part E is constant at -1, this range is uniformly divided by 23 bits of the mantissa.
$$0.25 \le x \le 0.5 \qquad (5)$$
As the exponent part E is constant at -2, this range is uniformly divided by 23 bits of the mantissa.
$$0.125 \le x \le 0.25 \qquad (6)$$
As the exponent part E is constant at -3, this range is uniformly divided by 23 bits of the mantissa.
$$0.0625 \le x \le 0.125 \qquad (7)$$
As the exponent part E is constant at -4, this range is uniformly divided by 23 bits of the mantissa.
(continues similarly to the above) (8)
As described above, the feature of the floating-point representation resides in that a region where the value of the exponent part E is constant is uniformly divided by 23 bits of the mantissa. This range is $2^{n-1} \le x \le 2^n$ (n: integer). Therefore, the width of this range is increased as absolute values of input values are increased so that the intervals between the numeric values in this range are also increased with roughened accuracy, while the accuracy is refined as the absolute values of the input variables are decreased. Namely, accuracy varies with the input values if the input values are utilized as addresses as such, and hence it is impossible to carry out correct conversion.
Consider that memory addresses are directly generated from 16 bits in total among 1 bit of the sign part, 8 bits of the exponent part and 9 bits of the mantissa in the numeric values of 32-bit floating point representation, for the purpose of comparison with a case of generating memory addresses in accordance with numeric values of 16-bit fixed-point representation. It is assumed that upper 9 bits are selected as the 9 bits of the mantissa, and the range of input variables is $-8 \le x \le 8$. When this range is expressed in 14 bits among the 16 bits of the fixed-point representation, the accuracy is about 0.0009765, as described above.
In the case of the floating-point representation, on the other hand, numeric values in a range of $4 \le x \le 8$ are expressed in 9 bits of the mantissa, whereby the accuracy is $4/2^9 = 1/2^7$, i.e., about 0.0078125. In a range of $0.125 \le x \le 0.25$, the accuracy is $0.125/2^9$, i.e., about 0.0002441.
In the floating-point representation, therefore, the mantissa is inferior in accuracy to that of the fixed-point representation in a range having large numeric values, due to restriction of the bit number. In a range having small absolute values of input variables, on the other hand, the accuracy is improved as compared with that of the fixed-point representation, and this tendency is made remarkable as the absolute values of the input values are decreased. When the bit number of the mantissa is limited due to limitation of memory capacity in direct generation of addresses from numeric data of floating-point representation, the accuracy may be made inferior to that in the fixed-point representation depending on the variable area, and hence the advantage of high accuracy cannot be effectuated in the floating-point representation.
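The comparison of accuracies can be tabulated with a short calculation. The sketch assumes a 9-bit truncated mantissa and a 14-bit fixed-point grid over a width of 16, as in the discussion above; the function names are assumptions.

```python
def fixed_step(span=16.0, bits=14):
    """Uniform interval of the fixed-point grid: 16 / 2**14."""
    return span / 2 ** bits

def float_step(lower_bound, mantissa_bits=9):
    """Interval inside the range starting at lower_bound (constant exponent) with a truncated mantissa."""
    return lower_bound / 2 ** mantissa_bits

for lo in (4.0, 2.0, 1.0, 0.5, 0.25, 0.125):
    step = float_step(lo)
    if step > fixed_step():
        verdict = "coarser than fixed-point"
    elif step < fixed_step():
        verdict = "finer than fixed-point"
    else:
        verdict = "same as fixed-point"
    print(lo, step, verdict)
```

Under these assumptions the two representations have equal intervals in the range starting at 0.5; the truncated floating-point values are coarser above it and finer below it.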
Now, consider that absolute values of input values are extremely small in direct generation of addresses from numeric data in floating-point representation. While it is possible to express small numbers down to an absolute value of $2^{-127}$ in floating-point representation, such small values are generally unnecessary in practice. When the sigmoid function f(x) has the shape shown in FIG. 35, for example, the difference between function values with respect to $x = 2^{-15}$ and $x = 2^{-16}$ is about 0.0000038. Such small values are generally unnecessary as output values of nonlinear conversion required in application to a neural network or the like, in particular. Thus, most of input values ($2^{-15}$ to $2^{-127}$ in this example) expressed in floating-point representation are finer in accuracy than necessary, and employment of these values as input values for a lookup table is extremely inefficient since these values merely increase the number of unnecessary function values.
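The figure of about 0.0000038 can be reproduced as follows, assuming the logistic sigmoid with T = 1. Near zero the function is almost linear with slope 1/4, so the difference is close to $(2^{-15} - 2^{-16})/4 = 2^{-18}$.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Difference between outputs for two adjacent small exponent values.
diff = sigmoid(2.0 ** -15) - sigmoid(2.0 ** -16)
print(diff, 2.0 ** -18)
```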
When input values of nonlinear functions are employed as addresses of a memory provided in a lookup table in an arithmetic system employing floating-point representation, accuracy of the mantissa is limited due to limitation in memory capacity in consideration of integration, as hereinabove described. Thus, the input values become extremely nonuniform in accuracy, and the utilization efficiency of the memory is reduced since unnecessary function values are stored in the data storage of the conversion table. Thus, it is impossible to form an efficient lookup table for nonlinear conversion. Further, sufficient accuracy cannot be guaranteed depending on the range of input variable values, and hence nonlinear conversion cannot be correctly carried out.
In some applications, it may be possible to carry out processing more efficiently or accurately by converting numeric data in floating-point representation to those of fixed-point representation or vice versa. However, no structure for efficiently converting the numeric data form has been disclosed in the prior art. Such conversion between numeric data of different forms is required in a multiprocessor system provided with a plurality of compatible processors handling numeric data of different forms, for example.