Neural networks have been applied to numerous classification problems involving large input data vectors. In many cases, practical (as opposed to research) applications have been limited by the availability of hardware suitable for implementing real-time, on-line classification systems, such as those used for speech recognition.
Most prior attempts at the practical implementation of classification of visual and speech signal data vectors have been based on the use of perceptrons or McCulloch-Pitts (M-P) type artificial neurons organized into a multilayer neural network.
FIG. 1 shows a typical M-P type neuron having N input terminals 11 for accepting an input vector, X=[x.sub.1, x.sub.2 . . . x.sub.N ].sup.T. The N elements of vector X are each applied to a weighting element 12 that forms product terms, {w.sub.i x.sub.i }, each representing the product of an input vector element, x.sub.i, and the corresponding element, w.sub.i, of the weighting vector, W=[w.sub.1, w.sub.2 . . . w.sub.N ].sup.T. These product terms are summed by adder 13 to form the vector dot product ##EQU1##
This vector dot product is proportional to the cosine of the angle between the two vectors, or ##EQU2## where .vertline.X.vertline. and .vertline.W.vertline. are the magnitudes (norms) of vectors X and W, respectively. Thus, for a given data vector, X, and reference weighting vector, W, the dot product X.sup.T.W is maximum when the angle between the two vectors is .THETA.=0, and falls to zero when .THETA.=90.degree.. Equation (3) indicates that the normalized dot product is maximum positive (cos .THETA.=1) when the two vectors are proportional (i.e., identical except for scale) and maximum negative (cos .THETA.=-1) when the two vectors are proportional but "pointing" in opposite directions (.THETA.=180.degree.). Thus, the output of adder 13 is a measure of the angular separation of vectors X and W.
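The relationship between the dot product and the angle between X and W may be illustrated by the following minimal sketch. The function names `dot` and `cos_theta` and the sample vectors are assumptions introduced for illustration only and do not appear in the figures.

```python
import math

def dot(x, w):
    """Sum of the element-wise products, as formed by weighting elements 12 and adder 13."""
    return sum(xi * wi for xi, wi in zip(x, w))

def cos_theta(x, w):
    """Normalized dot product: dot(x, w) divided by the product of the two norms."""
    norm_x = math.sqrt(dot(x, x))
    norm_w = math.sqrt(dot(w, w))
    return dot(x, w) / (norm_x * norm_w)

w = [1.0, 2.0]                      # hypothetical reference weighting vector
print(cos_theta([2.0, 4.0], w))     # proportional to w: close to +1
print(cos_theta([-2.0, 1.0], w))    # orthogonal to w: 0
print(cos_theta([-1.0, -2.0], w))   # opposite to w: close to -1
```

Because only the angle matters after normalization, scaling either vector by a positive constant leaves the result unchanged.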
In addition, an offset signal, x.sub.0, may also be applied to adder 13 through offset terminal 14. This offset may be combined with the input vector by defining X=[x.sub.0, x.sub.1, x.sub.2 . . . x.sub.N ].sup.T and W=[w.sub.0, w.sub.1, w.sub.2 . . . w.sub.N ].sup.T, where w.sub.0 =1, so that the combined output of adder 13 is ##EQU3## which is another vector dot-product. Thus, if normalized input and weighting vectors, X/.vertline.X.vertline. and W/.vertline.W.vertline., were used, the output of adder 13 would be representative of cos .THETA..
The output of adder 13 is applied to an output nonlinearity (or squashing function) 15 having saturating characteristics such as shown in FIG. 2. FIG. 2(a) shows a typical sigmoidal nonlinear transfer characteristic while FIG. 2(b) shows a hard limiting form of the sigmoidal function, i.e., a signum function, the latter producing a binary output in response to the input data vector and threshold signal.
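The signal path of FIG. 1 (weighting, summation with offset, and squashing) may be sketched as follows. This is an illustrative sketch only: the use of the hyperbolic tangent as the sigmoidal characteristic is an assumption, chosen because its hard-limiting form is the signum function, and the names `mp_neuron` and `signum` are hypothetical.

```python
import math

def signum(s):
    """Hard limiter: +1 for positive input, -1 for negative input, 0 at exactly zero."""
    return (s > 0) - (s < 0)

def mp_neuron(x, w, x0=0.0, hard_limit=False):
    """M-P type neuron: offset plus weighted sum, followed by a squashing function.

    tanh is an assumed stand-in for the sigmoidal characteristic of FIG. 2(a).
    """
    s = x0 + sum(wi * xi for wi, xi in zip(w, x))   # output of adder 13
    return signum(s) if hard_limit else math.tanh(s)

print(mp_neuron([3.0, 3.0], [0.5, 0.5], x0=-2.0, hard_limit=True))   # sum is +1
print(mp_neuron([1.0, 1.0], [0.5, 0.5], x0=-2.0, hard_limit=True))   # sum is -1
print(mp_neuron([2.0, 2.0], [0.5, 0.5], x0=-2.0))                    # sum is 0
```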
The M-P type of neuron classifies by using hyperplanes for separating the vector space, so that each neuron accepting an N-dimensional data vector, X, divides the N-dimensional vector space into two parts by means of the hyperplane, which is defined as follows:
x.sub.0 + w.sub.1 x.sub.1 + w.sub.2 x.sub.2 + . . . + w.sub.N x.sub.N = 0 (5)
For purposes of explanation, consider the two-dimensional case (N=2). Equation (5) becomes ##EQU4## FIG. 3 is a plot of equation (6). In this case the hyperplane, x.sub.0 +w.sub.1 x.sub.1 +w.sub.2 x.sub.2 =0, has degenerated into a line that separates the (x.sub.1, x.sub.2) plane into two regions. When w.sub.1 x.sub.1 +w.sub.2 x.sub.2 >-x.sub.0, the output of the M-P neuron is positive (high) and corresponds to region 1 of the plane, and when w.sub.1 x.sub.1 +w.sub.2 x.sub.2 <-x.sub.0, the output is negative (low). Thus, the value of the threshold, x.sub.0, together with the weights, w.sub.1 and w.sub.2, determines the x.sub.1 and x.sub.2 intercepts of the two-dimensional hyperplane.
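The two-dimensional case may be made concrete with the following sketch, which computes the axis intercepts of the line x.sub.0 +w.sub.1 x.sub.1 +w.sub.2 x.sub.2 =0 and reports on which side of it a point falls. The function names and the numerical values of the weights and threshold are assumptions chosen only for illustration.

```python
def intercepts(w1, w2, x0):
    """Axis intercepts of the line x0 + w1*x1 + w2*x2 = 0 (both weights nonzero)."""
    return (-x0 / w1, -x0 / w2)   # (x1-intercept, x2-intercept)

def side(x1, x2, w1, w2, x0):
    """'high' where w1*x1 + w2*x2 > -x0, 'low' where it is < -x0, else 'on line'."""
    s = x0 + w1 * x1 + w2 * x2
    if s > 0:
        return "high"
    if s < 0:
        return "low"
    return "on line"

w1, w2, x0 = 1.0, 2.0, -4.0          # hypothetical weights and threshold
print(intercepts(w1, w2, x0))        # (4.0, 2.0)
print(side(5.0, 1.0, w1, w2, x0))    # high
print(side(0.0, 0.0, w1, w2, x0))    # low
```

Changing x.sub.0 alone shifts the line parallel to itself, while changing the ratio of w.sub.1 to w.sub.2 rotates it.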
In order to define a closed region 3 within the (x.sub.1, x.sub.2) plane, at least three hyperplanes are needed, as shown in FIG. 4. In general, N+1 hyperplanes are required to define a closed region within an N-dimensional vector space.
This implies that the hidden layer of an N-dimensional classifier system requires at least N+1 M-P type neurons, in the most general case, in order to define a single closed region of N-dimensional vector space. The recognition of M classes in the N-dimensional space requires an additional layer of M output neurons. FIG. 5 shows an example of an M-P type neural network classifier for N=4 and M=3. If the three classes correspond to three distinct closed regions, as many as 3(N+1) M-P neurons may be required in the hidden layer.
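The N+1 requirement may be illustrated for N=2 by the following sketch, in which three hard-limiting M-P neurons bound a triangular region and a fourth neuron combines their outputs. The particular triangle, the weights, and the output threshold of -2.5 are assumptions chosen for illustration; they are not taken from FIG. 4.

```python
def signum(s):
    """Hard limiter: +1 for positive input, -1 for negative input, 0 at exactly zero."""
    return (s > 0) - (s < 0)

def mp_hard(x, w, x0):
    """Hard-limiting M-P neuron: signum of offset plus weighted sum."""
    return signum(x0 + sum(wi * xi for wi, xi in zip(w, x)))

# Three hidden neurons (N + 1 = 3), one per bounding line of a hypothetical triangle:
#   x1 > 0,   x2 > 0,   x1 + x2 < 1
HIDDEN = [((1.0, 0.0), 0.0),
          ((0.0, 1.0), 0.0),
          ((-1.0, -1.0), 1.0)]

def in_closed_region(x):
    """Output neuron fires only when all three hidden neurons report +1."""
    hidden = [mp_hard(x, w, x0) for w, x0 in HIDDEN]
    # The sum of three +/-1 values exceeds 2.5 only if all three are +1.
    return mp_hard(hidden, (1.0, 1.0, 1.0), -2.5) == 1

print(in_closed_region((0.25, 0.25)))   # True: inside the triangle
print(in_closed_region((1.0, 1.0)))     # False: beyond the third line
print(in_closed_region((-1.0, 0.5)))    # False: beyond the first line
```

With only two of the three hidden neurons, the region satisfying both conditions would remain unbounded, which is why a third line is needed to close it.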
A distance or difference type neuron (d-neuron) is more effective in defining a closed region of N-space. A typical d-neuron, as shown in FIG. 6, represents the basic type of neuron used in the present invention. Input terminals 21 accept the input data vector, U=[u.sub.1, u.sub.2 . . . u.sub.N ].sup.T, that is to be classified. Input terminals 26 are for accepting a candidate prototype vector, P=[p.sub.1, p.sub.2 . . . p.sub.N ].sup.T, which, together with input data vector U, is applied term by corresponding term to the N distance metric units 22, where a set of distance terms, {d (u.sub.i, p.sub.ji)}, is formed at the output.
Typical distance metrics have the form of the Minkowski norm, D.sub.q (U, P.sub.j), which is given by ##EQU5##
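For reference, the general form of the Minkowski norm that is consistent with the two- and three-dimensional Euclidean cases written out in equations (8) and (9) may be expressed as follows. This is a reconstruction from those special cases and from the standard definition of the norm, and should be read as an assumption rather than as a reproduction of the equation image.

```latex
D_q(U, P_j) = \left[ \sum_{i=1}^{N} \left| u_i - p_{ji} \right|^{q} \right]^{1/q}
\qquad (7)
```

Setting q=2 and N=2 recovers equation (8), and setting q=1 gives the sum of absolute differences described for the "city block" metric.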
FIG. 7(a) illustrates the N=2 first order (q=1) Minkowski distance metric, also known as the "city block" distance metric, where the distance D.sub.1 (U, P.sub.j) is the sum of the absolute differences .vertline.u.sub.2 -p.sub.j2 .vertline. and .vertline.u.sub.1 -p.sub.j1 .vertline.. The perimeter of the diamond-shaped area centered about (p.sub.j1, p.sub.j2) represents the locus of all points that are the same "city block" distance, D.sub.1, from (p.sub.j1, p.sub.j2). FIG. 7(b) illustrates the q=2 or quadratic Minkowski (Euclidean) distance metric D.sub.2 (U, P.sub.j) together with the locus of points that are the same Euclidean distance, D.sub.2, from (p.sub.j1, p.sub.j2). Other positive integer or fractional values of q are also suitable for forming a Minkowski distance metric. The Minkowski metric defines a generalized q.sup.th order hyperspheroidal closed surface of constant radius.
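The first order and second order metrics of FIG. 7 may be compared numerically with the following sketch. The function name `minkowski` and the sample points are assumptions introduced for illustration, and the formula used is the standard Minkowski norm.

```python
def minkowski(u, p, q):
    """q-th order Minkowski distance between data vector u and prototype vector p."""
    if q <= 0:
        raise ValueError("q must be positive")
    return sum(abs(ui - pi) ** q for ui, pi in zip(u, p)) ** (1.0 / q)

p = [0.0, 0.0]                       # hypothetical prototype (p_j1, p_j2)
u = [3.0, 4.0]                       # hypothetical data vector (u_1, u_2)
print(minkowski(u, p, 1))            # city block distance: 3 + 4
print(minkowski(u, p, 2))            # Euclidean distance: sqrt(9 + 16)

# Points on the same city block locus (a diamond) share the same D_1
# but generally differ in D_2.
print(minkowski([7.0, 0.0], p, 1), minkowski([7.0, 0.0], p, 2))
```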
Referring back to FIG. 6, the set of terms, {d.sub.q (u.sub.i, p.sub.ji)}, are summed by adder 23 to form the complete distance norm D.sub.q (U, P.sub.j) at its output. Radial basis function (RBF) generator 25 accepts the distance metric from adder 23 and produces a nonlinearly transformed output.
Typical nonlinear transfer characteristics that may be representative of RBF generator 25 are shown in FIG. 8. Note that because D.sub.q .gtoreq.0, the nonlinearity is only defined for D.gtoreq.0. FIG. 8(a) produces a value, y=1, if the distance metric D<D, otherwise y=0. This indicates that when U is close to the prototype vector P.sub.j, D<D, causing the output to be high. FIG. 8(b) shows an exponentially decaying transfer characteristic y=e.sup.-D/D. FIG. 8(c) is a Gaussian-like half bell of the form, y=e.sup.-1/2(D/D).spsp.2. These transfer characteristics are useful for representing probabilities based on the distance metric, D.
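The three transfer characteristics described for FIG. 8 may be sketched as follows. The parameter `d0` stands for the threshold or scale constant against which the distance D is compared; its symbol is not distinguishable from D in this text, so the name `d0`, like the function names, is an assumption made for illustration.

```python
import math

def rbf_step(d, d0):
    """FIG. 8(a)-style characteristic: 1 when the distance is below d0, else 0."""
    return 1.0 if d < d0 else 0.0

def rbf_exponential(d, d0):
    """FIG. 8(b)-style characteristic: exponential decay with scale d0."""
    return math.exp(-d / d0)

def rbf_gaussian(d, d0):
    """FIG. 8(c)-style characteristic: Gaussian-like half bell with scale d0."""
    return math.exp(-0.5 * (d / d0) ** 2)

d0 = 1.0                                  # hypothetical scale constant
for d in (0.0, 0.5, 1.0, 2.0):            # distances are non-negative by construction
    print(d, rbf_step(d, d0), rbf_exponential(d, d0), rbf_gaussian(d, d0))
```

All three characteristics equal their maximum at D=0 and do not increase with distance, which is what allows the output to be read as a measure of closeness to the prototype.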
One significant property of the d-neuron of FIG. 6 is that the output of a single d-neuron may be equivalent to an N-input, 1-output, two-layer neural network of M-P type neurons because each N-input d-neuron defines a closed region in N-space. For example, consider the Euclidean or second order Minkowski norm in two-dimensional (N=2) space:
D.sub.2 (U, P.sub.j) = [.vertline.u.sub.1 -p.sub.j1 .vertline..sup.2 + .vertline.u.sub.2 -p.sub.j2 .vertline..sup.2 ].sup.1/2 (8)
This corresponds to the circular locus shown in FIG. 7(b), which has radius D.sub.2 (U, P.sub.j) and center (p.sub.j1, p.sub.j2) and which encloses the area bounded by the locus. In a similar fashion, the three-dimensional Euclidean norm becomes
D.sub.2 (U, P.sub.j) = [.vertline.u.sub.1 -p.sub.j1 .vertline..sup.2 + .vertline.u.sub.2 -p.sub.j2 .vertline..sup.2 + .vertline.u.sub.3 -p.sub.j3 .vertline..sup.2 ].sup.1/2 (9)
which defines a sphere with radius D.sub.2 (U, P.sub.j). In general, ##EQU6## defines a hyperspheroid in N-space. Hence, a single d-neuron is capable of defining a closed hyperspheroidal region, while an M-P neuron is only capable of defining a single hyperplane that divides an N-space. This further implies that a neural network of M-P neurons with N+1 neurons in the hidden layer and one neuron in the output layer is required to define an N-space closed region.
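The closed-region property may be illustrated by the following sketch of a single d-neuron, in which a Minkowski distance is followed by a step-type transfer characteristic. The names `d_neuron` and `radius`, and the sample prototype and data points, are assumptions for illustration only; the sketch does not represent the integrated circuit structure of the invention.

```python
def minkowski(u, p, q):
    """q-th order Minkowski distance between data vector u and prototype vector p."""
    return sum(abs(ui - pi) ** q for ui, pi in zip(u, p)) ** (1.0 / q)

def d_neuron(u, p, q, radius):
    """Single d-neuron with a step characteristic: 1 inside the closed region, else 0."""
    return 1 if minkowski(u, p, q) < radius else 0

p = [0.0, 0.0]          # hypothetical prototype vector
radius = 1.5            # hypothetical threshold distance

# The same point can fall inside the q=2 (circular) region
# yet outside the q=1 (diamond-shaped) region of equal radius.
print(d_neuron([1.0, 1.0], p, 2, radius))   # 1: Euclidean distance is about 1.414
print(d_neuron([1.0, 1.0], p, 1, radius))   # 0: city block distance is 2
print(d_neuron([3.0, 0.0], p, 2, radius))   # 0: outside both regions
```

A single unit of this kind bounds the region in every direction, whereas the M-P arrangement needs N+1 hidden neurons and an output neuron to achieve the same enclosure.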
The q.sup.th order Minkowski norm (equation (7)) may be considered as defining a generalized q.sup.th order hyperspheroid with radius D.sub.q (U, P.sub.j). Thus, the diamond-shaped (rotated square) locus in FIG. 7(a) is a first order, two-space hyperspheroid with radius D.sub.1 (U, P.sub.j).
Because of the enhanced discrimination in N-space achieved by the d-neuron, and the possible consequent reduction in neural network complexity achieved through its use, the present invention is directed to the implementation of an efficient integrated circuit structure that leads to a more efficient classification system based on d-neurons.