1. Field of the Invention
The present invention relates generally to electronic neural networks and, more particularly, to a neural processing module that preferably resides on a single "chip" and that achieves high computation rates (usually defined as the number of floating point operations per second) while consuming relatively little power and occupying relatively little space, that may be scaled in a planar or massively parallel, stacked arrangement to handle more inputs, achieve greater processing rates, or both, and that achieves its synaptic connections through binary weights maintained "off chip" so that the neural processing module may implement a variety of algorithms in different neural network applications.
2. Description of the Prior Art and Related Information
Interest in neural networks has increased because of their theoretical potential to solve problems that are difficult or even impossible to accomplish with conventional computers. Earlier researchers noted, for example, that "[t]he collective behavior of neural network systems has demonstrated useful computation properties for associative memory functions, fault-tolerant pattern recognition, and combinatorial optimization problem solving." A. P. Thakoor, A. Moopenn, J. Lambe, and S. K. Khanna, "Electronic hardware implementations of neural networks," Applied Optics, Vol. 26, page 5085, Dec. 1, 1987.
Early neural network research relied on software simulations performed with digital computers based on sequential von Neumann architectures: "The study of the dynamics, learning mechanisms, and computational properties of neural networks has been largely based on computer software simulations." Id. It has long been recognized, however, that neural network hardware was needed to "provide the basis for development of application-specific architectures for implementing neural network approaches to real-life problems." Id. The many simple, interconnected processors of a neural network implemented in hardware, or electronic neural network, allow for fast parallel processing, but "designing hardware with a large number of processors and high connectivity can be quite difficult." C. Lindsey and T. Lindblad, "Review of Hardware Neural Networks, A User's Perspective," Physics Dept., Frescati, Royal Institute of Technology, Frescativägen 24, 104 05 Stockholm, Sweden, 1995.
Electronic neural networks have nonetheless already been implemented in digital, analog, and hybrid technologies.
Digital architectures are desirable because "digital technology has the advantage of mature fabrication techniques, weight storage in RAM, and arithmetic operation exact within the number of bits of the operands and accumulators. From the users viewpoint, digital chips are easily embedded into most applications. However, digital operations are usually slower than in analog systems, especially in the weight x input multiplication . . ." C. Lindsey and T. Lindblad, id. Processing speed, power consumption, and size (or density) are often critical concerns. These inventors know of no digital neural network that provides sufficiently low power consumption and density to reasonably accomplish the massively parallel processing needed, for example, to perform real-time pattern recognition or feature matching. A single digital neuron is faster than an analog neuron; however, when many digital neurons are combined, the overall size grows and the propagation time between neurons dominates. Power dissipation is also larger in a digital implementation.
Analog neurons are smaller and use less power than digital approaches, but are slower and subject to certain complications. For example, "[c]reating an analog synapse involves the complications of analog weight storage and the need for a multiplier [that is] linear over a wide range." C. Lindsey and T. Lindblad, id.
"Hybrid" neural networks combine the "best" of the digital and analog architectures: "Typically, the external inputs/outputs are digital to facilitate integration into digital systems, while internally some or all of the processing is analog." C. Lindsey and T. Lindblad, id. One of the hybrid neural networks discussed in the Lindsey/Lindblad article had 70 analog inputs, 6 hidden layers and 1 analog output with 5-bit digital weights, and achieved a "feed-forward processing rate [of] an astounding 20 ns, representing 20 GCPS [Billion Connections Per Second] . . ."
The Thakoor et al. article referenced above discusses another hybrid neural network (hereafter the "JPL network"), which has six neurons and thirty-six synapses and which uses analog inputs and digitally programmable weights. The hybrid architecture of the JPL network allegedly offers a number of advantages by using "high-density random access digital memory to store a large quantity of information associated with the synaptic weights while retaining high-speed analog neurons for the signal processing." Id. at 5089. The authors further note that by using "programmable" synapses, "[t]he hardware requirements and complexity are greatly reduced since the full interconnections of the neurons are no longer required." Id.
The JPL authors recognized that "a hybrid neurocomputer can be easily expanded in size to several hundred neurons." Id. They did not, however, propose any realistic way of implementing a network with thousands of inputs, or of implementing a network of any size that makes maximum use of its neurons.
There remains a need, therefore, for a low power, high density neural processing module that achieves high computation rates, that may be scaled to achieve greater processing rates and to handle more inputs, and that may be used in an electronic neural network that simplifies the implementation of a particular function by maintaining the weights or synaptic connections "off chip," for example in a chip-in-a-loop arrangement controlled by a conventional computer.
The present invention resides in a neural processing module which combines a weighted synapse array that performs "primitive arithmetic" (products and sums) with an innovative weight change architecture and an innovative data input architecture that collectively maximize the use of the weighted synapse array. In an image recognition context, the neural processing module dynamically reconfigures incoming image signals against preexisting weights and performs a corresponding succession of convolutions (products and sums) during each image frame.
In more detail, the neural processing module of the present invention achieves extremely high computation rates with lower power and lower area consumption than previously possible by providing a high speed, low power, small geometry array of analog multipliers, and by using that array as continuously as possible. The preferred neural processing module uses its synapse array almost continuously by uniquely combining:
(1) a synapse array of analog synapse cells (e.g. multipliers) and programmable synapses that receives analog data and digital weights and multiplies the analog data by the analog equivalent of the digital weights at a "calculation rate" (e.g. 4 MHz);
(2) a means for rapidly loading the programmable synapses with the digital weights (determined externally, for example, by a microprocessor) at the beginning of each frame and in advance of using the synapse array; and
(3) a switching means for receiving frames of periodic input signals at an "arrival rate" that is slower than the calculation rate (e.g. 1000 Hz), for rapidly creating a plurality of input signal permutations from the periodic input signals at a "permutation rate" that is greater than the arrival rate and preferably at or greater than the calculation rate (e.g. 4 MHz), and for feeding each successive input signal permutation to the synapse array at or near the calculation rate.
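As a rough numerical sketch of the timing budget implied by the example rates above (the 1000 Hz arrival rate and 4 MHz calculation rate come from the text; the 64 x 64 array size is a purely hypothetical figure for illustration), the switching means can feed on the order of thousands of permutations to the array during each frame:

```python
# Timing-budget sketch using the example rates given in the text.
ARRIVAL_RATE_HZ = 1_000      # frames of periodic input signals arrive at ~1 kHz
CALC_RATE_HZ = 4_000_000     # synapse array performs multiplications at ~4 MHz

# Number of input-signal permutations the array can evaluate per frame
# when permutations are fed at the calculation rate.
perms_per_frame = CALC_RATE_HZ // ARRIVAL_RATE_HZ
print(perms_per_frame)  # 4000

# With a hypothetical N x N array of analog multipliers, every permutation
# yields N*N simultaneous products, so the effective connection rate is:
N = 64
connections_per_second = CALC_RATE_HZ * N * N
print(connections_per_second)  # 16_384_000_000, i.e. ~16 GCPS
```

The point of the sketch is that the array sits idle almost never: several thousand full multiply-accumulate passes fit between successive input frames.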
The invention can be regarded as an electronic neural processing module for convolving a first group of signals with a second group of signals, comprising: means for receiving a first group of signals; switching means for receiving a second group of signals and for creating successive groups of permutated signals from the second group of signals before a next group of second signals arrives; analog multiplying means for simultaneously multiplying each signal in the first group of signals with each signal in each successive group of permutated signals to form a plurality of products; and means for accumulating the plurality of products to produce a convolution output.
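The arrangement just recited can be modeled in a few lines of software. In this sketch, the "switching means" is assumed to generate cyclic shifts of the second signal group (the actual permutation scheme is not specified above, so the cyclic shift and the small signal vectors are illustrative assumptions only); the multiply-and-accumulate step models the analog multiplying and accumulating means:

```python
def convolve_groups(weights, signals):
    """Software model of the module: for each cyclic permutation of
    `signals`, multiply element-wise by `weights` and sum the products,
    yielding one convolution output per permutation group."""
    n = len(signals)
    outputs = []
    for shift in range(n):
        # switching means: one permutation group of the second signals
        permuted = signals[shift:] + signals[:shift]
        # analog multiplying means (simultaneous products), then accumulation
        outputs.append(sum(w * s for w, s in zip(weights, permuted)))
    return outputs

# The permutation that best aligns the two groups produces the peak output,
# i.e. the strongest correlation between them.
weights = [1.0, 2.0, 3.0, 4.0]
signals = [3.0, 4.0, 1.0, 2.0]
outs = convolve_groups(weights, signals)
print(outs)  # [22.0, 24.0, 30.0, 24.0] -- peak at shift 2
```

In the hardware described, all the products for one permutation group are formed simultaneously by the analog array rather than in a software loop; the loop here only serializes what the array does in parallel.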
The invention can also be regarded as an electronic neural network image recognition system comprising: means for receiving a plurality of weights; means for receiving successive groups of image signals (the image template) at a predetermined frame rate; switching means for creating successive groups of image permutation signals from each group of image signals (the image template) before receiving a subsequent group of image signals; a weighted synapse array of analog synapse cells that simultaneously perform a plurality of calculations at a calculation rate, wherein the calculation rate is greater than the frame rate, the plurality of calculations comprising the multiplying of each weight with each signal in each group of image permutation signals to form a plurality of products; and means for summing the plurality of products to produce a convolution output with a value that represents a correlation quality between the weights and each successive group of image permutation signals.