This application relates to neuromorphic networks, and in particular, to tunable optical neuromorphic networks.
Neuromorphic networks are widely used in pattern recognition and classification, with many potential applications ranging from fingerprint, iris, and face recognition to target acquisition. The parameters (e.g., ‘synaptic weights’) of the neuromorphic networks are adaptively trained on a set of patterns during a learning process, following which the neuromorphic network is able to recognize or classify patterns of the same kind.
A key component of a neuromorphic network is the ‘synapse,’ at which weight information is stored, typically as a continuous-valued variable. For applications that would benefit from compact, high-performance, low-power, portable neuromorphic network computation, it is desirable to be able to construct high-density hardware neuromorphic networks having a large number of synapses (10⁹ to 10¹⁰ or more). Currently, a neuromorphic network is typically realized as a software algorithm implemented on a general-purpose computer, although hardware for neuromorphic networks exists.
Neuromorphic networks may be used for three broad types of learning. In “supervised learning” a set of (input, desired output) pairs is provided to the neuromorphic network, one at a time, and a learning algorithm finds values of the “weights” (the adjustable parameters of the neuromorphic network) that minimize a measure of the difference between the actual and the desired outputs over the training set. If the neuromorphic network has been well trained, it will then process a novel (previously unseen) input to yield an output that is similar to the desired output for that novel input. That is, the neuromorphic network will have learned certain patterns that relate input to desired output, and generalized this learning to novel inputs.
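As an illustration only, the supervised-learning loop described above can be sketched with a single linear neuron trained by gradient descent on squared error. The training pairs, the learning rate, and the names `train_supervised` and `predict` are hypothetical choices made for this sketch; they are not taken from any particular neuromorphic implementation.

```python
import random


def predict(w, x):
    """Output of a single linear neuron: the weighted sum of its inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_supervised(pairs, n_inputs, lr=0.1, epochs=200, seed=0):
    """Adjust the weights to reduce the squared output error over the training set."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    for _ in range(epochs):
        for x, desired in pairs:  # pairs are presented one at a time
            err = predict(w, x) - desired
            # Gradient of 0.5 * err**2 with respect to w[i] is err * x[i].
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


# Hypothetical training set whose desired output follows 2*x0 - x1.
training_pairs = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([0.5, 0.5], 0.5),
]
weights = train_supervised(training_pairs, n_inputs=2)

# A novel input that was not in the training set; the underlying rule gives 3.0.
novel_output = predict(weights, [2.0, 1.0])
```

Because the hypothetical training pairs are consistent with one linear rule, the learned weights generalize to the novel input; a real network would typically use nonlinear neurons and many more weights.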
In “unsupervised learning,” a set of inputs (without “desired outputs”) is provided to the neuromorphic network, along with a criterion that the neuromorphic network is to optimize. An example of such a criterion is that the neuromorphic network is able to compress an input into a smaller amount of information (a “code”) in such a way that the code can be used to reconstruct the input with minimum average error. The resulting “auto-encoder” network consists of, in sequence, an input layer having a number of neurons or nodes, one or more “hidden” layers, a “code” layer (having relatively few neurons or nodes), one or more hidden layers, and an output layer having the same number of neurons or nodes as the input layer. The entire network is trained as if this were a supervised-learning problem, where the “desired output” is defined to be identical to the input itself.
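The layer sequence of such an auto-encoder, and the recasting of its training as a supervised problem, can be sketched as follows. This is a structural sketch only: the helper names are hypothetical, and the assumption that the hidden layers after the code layer mirror those before it is a common convention rather than something stated above.

```python
def autoencoder_layer_sizes(n_input, encoder_hidden, n_code):
    """Neuron counts per layer: input, hidden layers, code, hidden layers, output."""
    if not 0 < n_code < n_input:
        raise ValueError("the code layer should have fewer neurons than the input layer")
    # Assumption: the hidden layers after the code layer mirror those before it.
    return [n_input, *encoder_hidden, n_code, *reversed(encoder_hidden), n_input]


def as_supervised_pairs(inputs):
    """Recast the inputs as (input, desired output) pairs with desired output == input."""
    return [(x, x) for x in inputs]


def mean_squared_reconstruction_error(inputs, reconstructions):
    """Average squared difference between each input and its reconstruction."""
    total = 0.0
    count = 0
    for x, r in zip(inputs, reconstructions):
        for xi, ri in zip(x, r):
            total += (xi - ri) ** 2
            count += 1
    return total / count


# Hypothetical sizes: 64 input neurons compressed to a 4-neuron code.
layers = autoencoder_layer_sizes(n_input=64, encoder_hidden=[32, 16], n_code=4)
pairs = as_supervised_pairs([[0.0, 1.0], [1.0, 0.0]])
```

The reconstruction-error function is the kind of criterion the network would be trained to minimize; any supervised training procedure could then be applied to `pairs`.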
In a third type of learning, “reinforcement learning,” a “reward/penalty” value is provided (by an external “teacher”). The “reward/penalty” value depends upon the input and the network's output. This value is used to adjust the weights (and therefore the network's outputs) so as to increase the average “reward.”
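One very simple way to use such a scalar value is random weight perturbation: try a small random change to the weights and keep it only if the average reward increases. The sketch below assumes this scheme purely for illustration; it is one of many reinforcement-learning methods, and the teacher function, inputs, and parameter values are hypothetical.

```python
import random


def network_output(w, x):
    """Output of a single linear neuron."""
    return sum(wi * xi for wi, xi in zip(w, x))


def teacher_reward(x, output):
    """Hypothetical external teacher: returns only a scalar reward (0 is best)."""
    hidden_target = x[0] - x[1]  # known to the teacher, never shown to the network
    return -(output - hidden_target) ** 2


def average_reward(w, inputs):
    return sum(teacher_reward(x, network_output(w, x)) for x in inputs) / len(inputs)


def train_by_reward(inputs, n_weights, steps=500, sigma=0.1, seed=0):
    """Keep a random weight perturbation only when it raises the average reward."""
    rng = random.Random(seed)
    w = [0.0] * n_weights
    best = average_reward(w, inputs)
    for _ in range(steps):
        candidate = [wi + rng.gauss(0.0, sigma) for wi in w]
        reward = average_reward(candidate, inputs)
        if reward > best:
            w, best = candidate, reward
    return w, best


inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
initial_reward = average_reward([0.0, 0.0], inputs)
weights, final_reward = train_by_reward(inputs, n_weights=2)
```

Unlike the supervised case, the network never sees a desired output; it sees only how well it scored, so the average reward can only stay the same or increase from one accepted step to the next.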
For learning, one solution involves multilevel programming of each synaptic resistance unit, using the functional capability of the controllers to program the synaptic levels while maintaining very compact synapse structures (e.g., a PCM element plus one to three transistors, depending upon a desired configuration). For example, using 30 nm technology, a synaptic density of 3.6×10⁹ cm⁻² may be achieved, with 6×10⁴ controllers attached to each x-line and each y-line. The controllers may consist of 10⁴ or more transistors. The energy required per synapse per step (i.e., per weight change) is several picojoules (pJ). For each presentation of an input to the neuromorphic network during learning, the desired weight updates at all the synapses may be performed in a time on the order of 0.02 seconds. During the recognition stage (i.e., following synapse training), the energy consumption and recognition time per image may be reduced.
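The figures in this example can be cross-checked with a short calculation. The sketch below assumes a square crossbar occupying 1 cm² with 6×10⁴ x-lines and 6×10⁴ y-lines and a synapse at every crossing; the per-synapse energy passed to the functions is a hypothetical stand-in, since the text says only "several" picojoules.

```python
LINES_PER_AXIS = 60_000      # 6×10⁴ x-lines and 6×10⁴ y-lines (assumed square crossbar)
ARRAY_SIDE_NM = 1e7          # assumed array side of 1 cm, expressed in nanometers
FEATURE_SIZE_NM = 30.0       # 30 nm technology
UPDATE_TIME_S = 0.02         # time to update all synapses for one input presentation

# One synapse at every x-line/y-line crossing.
synapses = LINES_PER_AXIS ** 2

# Center-to-center line spacing implied by the assumed 1 cm array side.
pitch_nm = ARRAY_SIDE_NM / LINES_PER_AXIS
pitch_in_features = pitch_nm / FEATURE_SIZE_NM


def update_energy_joules(energy_per_synapse_pj):
    """Energy to change every synapse once, for an assumed per-synapse energy in pJ."""
    return synapses * energy_per_synapse_pj * 1e-12


def average_update_power_watts(energy_per_synapse_pj):
    """Average power if all synapses are updated once per UPDATE_TIME_S."""
    return update_energy_joules(energy_per_synapse_pj) / UPDATE_TIME_S
```

Under these assumptions the crossing count reproduces the stated density of 3.6×10⁹ synapses per cm², and the implied line pitch is roughly 167 nm, or about 5.6 feature sizes. With a hypothetical 3 pJ per weight change, updating every synapse once would take about 0.0108 J, or roughly 0.54 W averaged over 0.02 seconds.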
Neuromorphic network applications may include pattern recognition, classification, and identification of fingerprints, faces, voiceprints, similar portions of text, similar strings of genetic code, etc.; data compression; prediction of the behavior of a system; feedback control; estimation of missing data; “cleaning” of noisy data; and function approximation or “curve fitting” in high-dimensional spaces.