The history of artificial neural networks begins with the pioneering work of McCulloch and Pitts, published in the Bulletin of Mathematical Biophysics, vol. 5, pp. 115-133, in 1943, in an article entitled "A Logical Calculus of the Ideas Immanent in Nervous Activity," in which it was shown that the neuron, modeled as a simple threshold device, could compute by performing logic functions. In 1962, Rosenblatt wrote a book entitled "Principles of Neurodynamics," published by Spartan Books, New York, in which the "perceptron," an adaptive (learning) neural network with a single neuron layer, was defined. Soon after, it was recognized that these single-layer neural structures were of limited usefulness. The computationally more powerful multilayer adaptive networks were then studied, leading to the publication of the book "Parallel Distributed Processing," MIT Press, Cambridge, Mass., 1986, by D. Rumelhart and J. L. McClelland, Eds., in which multilayer adaptive learning procedures are described, including the widely known backpropagation algorithm.
Neural network prior art has been based mainly on the neuronal model of McCulloch and Pitts (MP). The MP neuron produces an output determined by applying the weighted sum of its inputs to a nonlinear, usually sigmoidal, thresholding function. The neuronal model of the present invention, based on a non-MP neuron referred to as a locally receptive field or difference neuron, is predicated on the use of a difference metric representing the difference between an input vector and a modifiable reference vector. The output of each neuron is controlled by the distance metric between the two vectors, by the offset applied to the distance metric, and by the output nonlinearity of the difference neuron.
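The contrast between the two neuronal models can be sketched as follows. This is a minimal illustrative sketch, not the patented circuit: the sigmoid nonlinearity, the Euclidean metric, the sign convention (output highest when the input lies near the reference vector), and the function names `mp_neuron` and `difference_neuron` are all assumptions chosen for illustration.

```python
import math

def mp_neuron(x, w, theta):
    """MP-style neuron: nonlinearity applied to a weighted sum of
    the inputs minus a threshold theta (sigmoid assumed here)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) - theta
    return 1.0 / (1.0 + math.exp(-s))

def difference_neuron(x, r, offset, p=2):
    """Difference neuron: nonlinearity applied to the Minkowski
    p-distance between input x and reference r, shifted by an offset.
    The output is largest when x is close to the reference vector."""
    d = sum(abs(xi - ri) ** p for xi, ri in zip(x, r)) ** (1.0 / p)
    return 1.0 / (1.0 + math.exp(d - offset))
```

With this sign convention, an input at the reference point yields an output above 0.5 whenever the offset is positive, and the output falls toward zero as the input moves away from the reference.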
Four major fields of application for neural networks include:
(1) classification of noise-corrupted or incomplete patterns;
(2) associative or content-addressable memory for producing a class exemplar from an input pattern;
(3) vector quantization for bandwidth compression of image and speech data; and
(4) multivariate nonlinear function approximation and interpolation.
In each of these applications, the ability to discriminate clusters of data is an important attribute. A single MP neuron is only capable of dividing the decision space by a hyperplane in N-space, or by a straight line in 2-space. Because of this property, the MP neuron is known as a linearly separable discriminator, limited to bisecting the data space into two halves. Three-layer networks of MP neurons are required to isolate a closed region in the signal vector space by means of hyperplanes.
A single difference neuron is capable of isolating a hyperspheroidal region in N-space by locating the center at a point defined by a reference vector, controlling its radius (or selectivity) by means of an offset, and controlling its shape by selecting an appropriate difference metric; each of these capabilities is an object of this invention.
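The region-isolating behavior described above can be sketched with a hard decision rule: a point is "inside" the neuron's region when its Minkowski p-distance from the reference vector is below the offset. This is an illustrative simplification (a thresholded version of the neuron's graded output); the function name `inside_region` is hypothetical.

```python
def inside_region(x, r, offset, p=2):
    """A difference neuron 'fires' when the Minkowski p-distance from
    the reference vector r is below the offset (the region's radius).
    p=2 gives a sphere, p=1 a diamond, and large p approaches a cube."""
    d = sum(abs(xi - ri) ** p for xi, ri in zip(x, r)) ** (1.0 / p)
    return d < offset
```

The choice of p thus controls the shape of the isolated region, while the offset controls its size: the corner point (0.8, 0.8) lies outside a unit-radius region under the Euclidean (p=2) metric but inside it under a high-order (cube-like) metric.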
An article describing prior art in difference neurons, written by T. Poggio and F. Girosi and entitled "Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks," Science, Vol. 247, 23 February 1990, pp. 978-982, describes a multilayer neural network consisting of a single layer of neurons in parallel with a linear network that forms a weighted sum of the input values, their outputs being combined in a single common summing junction. Each neuron forms a weighted radial basis function using a nonlinear operator on the Euclidean norm of the difference between an input vector and a reference vector. In this article the authors teach neither the application of an offset to the difference norm for control of the radius nor the selection of norms to control the shape of the hyperspheres.
Also, A. Hartstein and R. H. Koch, in an article entitled "A Self-Learning Threshold Controlled Neural Network," Proceedings of the IEEE Conference on Neural Networks, San Diego, July 1988, pp. 1-425 through 1-430, describe a Hopfield-type network using a first-order Minkowski distance metric for representing the difference between an input vector and a reference vector. The network is trained with a linear Hebbian learning rule based on the absolute error formed from the difference between the input and output vectors. The article does not teach the usefulness of offset in the control of selectivity, nor the selection of a difference norm for purposes of controlling the hyperspheroidal shape.
The difference neural network of the present invention is supported by learning algorithms that adapt the reference vector weights and the offset value so as to approximate the desired output response for a given input vector. These algorithms are based on the error gradient relative to the reference vector. The reference vector consists of adjustable weights, one for each component of the input vector to each neuron cell. The offset value controls the resolution, or radius, of the hypersphere.
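One gradient-based adaptation step of this kind can be sketched as follows for a single Euclidean difference neuron. This is a minimal sketch under assumed conventions (squared-error loss, sigmoid nonlinearity, output y = sigmoid(offset - ||x - r||)), not the patented algorithm itself; the function name `train_step` is hypothetical.

```python
import math

def train_step(x, target, r, offset, lr=0.1):
    """One gradient-descent step on the squared error E = (y - t)^2 / 2
    of a single Euclidean difference neuron y = sigmoid(offset - ||x - r||).
    Both the reference vector r and the offset are adapted."""
    d = max(math.sqrt(sum((xi - ri) ** 2 for xi, ri in zip(x, r))), 1e-12)
    y = 1.0 / (1.0 + math.exp(d - offset))
    g = (y - target) * y * (1.0 - y)   # common chain-rule factor through the sigmoid
    # dE/dr_i = g * (x_i - r_i) / d ;  dE/doffset = g
    r_new = [ri - lr * g * (xi - ri) / d for ri, xi in zip(r, x)]
    offset_new = offset - lr * g
    return r_new, offset_new
```

Repeated steps with a target of 1 pull the reference vector toward the input and widen the offset, so that the neuron's response to that input grows; a target of 0 has the opposite effect.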