1. Field of the Invention
The present invention relates to apparatus implementing a neural network, including a dynamic sleep refresh capability for reinforcing memory traces which might otherwise wash out over time.
2. Description of Related Art
A neural network is a highly parallel form of information processing apparatus, having as its philosophical basis the attempted modeling of the mammalian brain. A typical neural network comprises a large number of nodes, also called neurons or neuronal circuits, each intended to operate in some way analogous to a neuron in a mammalian brain. The neurons are connected to each other by synaptic connective circuits, also referred to as synapses or simply as connections, each intended to operate in some way analogous to a synapse in a mammalian brain.
Like the mammalian brain, the organization of a neural network is such that each neuron is cross-coupled via synapses to a large number of other neurons in the network. The coupling in many cases is bi-directional and in some cases is uni-directional. Because of the philosophical basis for much of the work in the neural network area, it often happens that advances in the field help to improve the scientific understanding of the mammalian brain, and advances in brain research help to advance the state of the art of neural networks.
When a neural network operates, environmental input signals are provided to inputs of a subset of the neurons in the network. Following some algorithm, the signals propagate through the network and finally activate one or more output neurons representing a processed version of the input signals. For example, where the inputs are connected to the various pixels in a CCD imaging array, the outputs may be such as to specify which of the 26 letters of the alphabet is imaged on the CCD array.
Typically, the transfer functions by which information propagates through the network are modifiable according to a learning algorithm of some kind. In this way the network can learn to recognize different input patterns, depending on the environment to which it is exposed. Usually, the learning algorithm is either of a supervised form or of an unsupervised form. In a network which learns by supervised learning, the output response is measured against some predetermined correct output response either by a human or by an automatic supervisor. The network is then caused to modify its own transfer functions, often in dependence upon the difference between the actual and the predetermined output response, iteratively until the predetermined output response is achieved. The object of such a system is to have it find its own way to a predetermined relation between input signals and output responses.
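The iterative, difference-driven modification described above can be sketched as follows. This is only a minimal illustration using a single-unit perceptron-style rule; it is not the back-propagation algorithm, and the function names, the learning rate and the threshold unit are all assumptions introduced for the example.

```python
def predict(weights, bias, inputs):
    """Binary output response of one threshold unit (hypothetical helper)."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_supervised(samples, n_inputs, rate=0.1, max_epochs=100):
    """Adjust weights until every actual response matches its predetermined one.

    samples is a list of (inputs, target) pairs supplied by the supervisor.
    The rate and max_epochs values are illustrative assumptions.
    """
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            # Difference between the predetermined and the actual response.
            diff = target - predict(weights, bias, inputs)
            if diff != 0:
                mistakes += 1
                # Modify the transfer function in dependence upon that difference.
                weights = [w + rate * diff * x for w, x in zip(weights, inputs)]
                bias += rate * diff
        if mistakes == 0:
            break  # the predetermined output response has been achieved
    return weights, bias


# Usage: learn the logical AND of two binary inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_supervised(AND_SAMPLES, n_inputs=2)
```

The loop stops as soon as a full pass over the samples produces no mismatch, which mirrors the "iteratively until the predetermined output response is achieved" behavior described in the text.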
In unsupervised learning networks, the network constructs its own distinctive output response for any given input signal. Such a network should itself be able to determine, for example, that different varieties of the letter "B" belong in a different category from the various forms of the letter "A". Such a network would modify its own transfer functions in order to provide a different output response for each category of input signals.
An example of a supervised learning algorithm is the so-called back-propagation algorithm, illustrated in U.S. Pat. No. 3,950,733 to Cooper. An example of an unsupervised learning algorithm is that discussed in Fukushima, "Neocognitron: A Hierarchical Neural Network Capable of Visual Pattern Recognition," Neural Networks, Vol. 1, pp. 119-130 (1988).
Another unsupervised learning algorithm is used in the ART-type networks proposed by Carpenter and Grossberg. See, for example, Carpenter and Grossberg, "A Massively Parallel Architecture For a Self-Organizing Neural Pattern Recognition Machine," Computer Vision, Graphics, and Image Processing, Vol. 37, pp. 54-115 (1987) ("Grossberg I") and Carpenter and Grossberg, "The ART of Adaptive Pattern Recognition by a Self-Organizing Neural Network," IEEE Computer (March 1988) at 77-88 ("Grossberg II"). In its simplest form, the ART network is composed of two layers F1 and F2 of neurons. The F1 layer is composed of a large number of clusters, each of which, in turn, is composed of a number of nodes with mutually inhibitory interconnections. Each of the input neurons or nodes takes in synaptic signals from three types of sources: one from environmental input stimuli, another from nodes in F2 (top-down connections), and a third from a global control mechanism. The F2 layer is also composed of clusters of nodes. Each F2 cluster has mutually inhibitory nodes, only one of which is active at any one time. F2 activity is controlled by input from two main types of sources: one from F1 nodes (bottom-up connections) and the other from a resetting mechanism. Both top-down and bottom-up connections are Long-Term Memory (LTM) connections, meaning that the time constants or decay constants of their synaptic strengths are much longer than those of Short-Term Memory (STM) activity in F1 and F2. It is noteworthy that the output signals generated by the neurons are binary, indicating either active or inactive. Analog activity levels generated internally to the neuron form only one of several parameters to the formula by which the binary outputs are generated.
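The statement that only one node of an F2 cluster is active at any one time, with binary outputs, can be illustrated by a simple winner-take-all selection. The sketch below is a simplified stand-in for the mutual inhibition dynamics, not the ART equations themselves; the function name, the weight layout and the tie-breaking choice are assumptions made for the example, and the resetting mechanism is omitted.

```python
def f2_cluster_outputs(f1_outputs, bottom_up_weights):
    """Return binary outputs for one F2 cluster under winner-take-all.

    f1_outputs is a list of binary F1 outputs. bottom_up_weights[j][i] is the
    hypothetical bottom-up weight from F1 node i to F2 node j.
    """
    # Total bottom-up input reaching each F2 node in the cluster.
    totals = [
        sum(f * z for f, z in zip(f1_outputs, row)) for row in bottom_up_weights
    ]
    # The node with the largest input suppresses the others (first one wins ties).
    winner = max(range(len(totals)), key=totals.__getitem__)
    return [1 if j == winner else 0 for j in range(len(totals))]


# Usage: three F1 nodes feeding a cluster of three F2 nodes.
outputs = f2_cluster_outputs(
    [1, 0, 1],
    [[0.2, 0.9, 0.1], [0.6, 0.1, 0.5], [0.3, 0.3, 0.3]],
)
```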
The transfer functions according to which information propagates through the network usually have a predefined form with coefficients or other constants which are modifiable in accordance with the learning algorithm. For example, in the ART model, each neuron $v_k$ has a respective "STM activity" $x_k$ which obeys a membrane equation of the form:

$$\epsilon \frac{dx_k}{dt} = -x_k + (1 - A x_k) J_k^{+} - (B + C x_k) J_k^{-},$$

where $\epsilon$, A, B and C are constants, where $J_k^{+}$ is the total excitatory input to $v_k$, and where $J_k^{-}$ is the total inhibitory input to $v_k$. Each of the values $J_k$ derives from an equation of the form

$$J_k = \sum_{l} f(x_l)\, z_{lk},$$

where $v_l$ is a neuron the output of which influences neuron $v_k$, $f(x_l)$ is a fixed form binary output function of the internal activity level x in neuron $v_l$, and $z_{lk}$ is a coefficient which varies in accordance with the learning algorithm.
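As a rough numerical illustration of the membrane equation, the sketch below integrates it with a forward-Euler step and compares the result with the steady state obtained by setting the time derivative to zero. It assumes the Carpenter and Grossberg form in which the passive decay term -x is followed by the excitatory term (1 - A x) J+ and the inhibitory term (B + C x) J-, and a weighted sum of binary outputs for each J. All constant values, the step size and the function names are assumptions chosen for the example.

```python
def total_input(outputs, weights):
    """J_k as the sum over l of f(x_l) * z_lk, with binary outputs f(x_l)."""
    return sum(f * z for f, z in zip(outputs, weights))


def stm_step(x, j_plus, j_minus, eps=0.1, a=1.0, b=0.0, c=1.0, dt=0.001):
    """Advance the STM activity x by one forward-Euler step of size dt."""
    dxdt = (-x + (1.0 - a * x) * j_plus - (b + c * x) * j_minus) / eps
    return x + dt * dxdt


def settle(j_plus, j_minus, steps=5000, **constants):
    """Integrate from x = 0 with the inputs held fixed."""
    x = 0.0
    for _ in range(steps):
        x = stm_step(x, j_plus, j_minus, **constants)
    return x


def steady_state(j_plus, j_minus, a=1.0, b=0.0, c=1.0):
    """Closed-form equilibrium: x* = (J+ - B J-) / (1 + A J+ + C J-)."""
    return (j_plus - b * j_minus) / (1.0 + a * j_plus + c * j_minus)


# Usage: excitatory input of 2.0 and inhibitory input of 0.5.
x_numeric = settle(2.0, 0.5)
x_exact = steady_state(2.0, 0.5)
```

With the default constants the equilibrium stays below 1/A however large the excitatory input becomes, which is the bounding behavior the (1 - A x) factor provides.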
Coefficients such as z, which may be thought of as connectivity weights, can be stored as charge levels on a capacitor in each of many circuits implementing respective synaptic connections. Such implementations suffer from the problem that capacitors by their nature are imperfect, and lose charge over time. Current implementations therefore often use expensive CCDs or floating gate MOSFETs to retain memory as long as possible. In at least one instance, workers have even gone to the extent of cooling the capacitors to -100° C. in order to stem leakage and prevent memory loss. Mackie, et al., "Implementations of Neural Network Models in Silicon," NATO ASI Series, Vol. F41, "Neural Computers," pp. 467-476, at 472 (1987). In other instances, memory is retained digitally. These technologies are expensive and do not lend themselves to optimal integration densities. No attempt has been made to stem charge loss through the use of circuitry.
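The effect of charge leakage on a stored weight can be sketched with a simple exponential model. The text above does not specify a leakage model; the single time constant tau, the function names and the numerical values below are assumptions used only to show how a stored coefficient would drift if left unrefreshed.

```python
import math


def leaked_weight(w0, t, tau):
    """Weight remaining after time t, assuming exponential leakage with time constant tau."""
    return w0 * math.exp(-t / tau)


def time_to_drift(fraction, tau):
    """Time for a stored weight to lose the given fraction of its value."""
    return -tau * math.log(1.0 - fraction)


# Usage: with a hypothetical time constant of 50 time units, how long until a
# stored weight has lost 10 percent of its value?
t_10_percent = time_to_drift(0.10, tau=50.0)
remaining = leaked_weight(1.0, t_10_percent, tau=50.0)
```

Under this assumed model, lengthening tau (for example by cooling) only postpones the drift; it does not remove it.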
In addition to gradual loss of memory due to charge leakage on the storage capacitors, such memory loss is sometimes intentionally designed into a system. For example, Carpenter and Grossberg recognized that even long-term mammalian synapses have a characteristic decay constant related to how fast the chemical transmitters/receptors are lost in the absence of pre- and post-synaptic activity. Their ART model therefore includes a so-called associative decay rule, which implies that some connectivity weights decay towards zero during learning. See Grossberg I at 77.
While the associative decay rule models one aspect of the functioning of the mammalian brain, it is not perfect. As an illustration, suppose an ART system has learned six categories representing the letters A-F. Suppose further that for a long time thereafter, the system is presented with environmental inputs representing only the categories A-D. In the ART model, the memory traces by which the system recognizes the categories E and F will eventually wash out. However, we know that human beings can remember images and events that took place many years earlier, even without repetition of the stimuli.
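The A-F illustration above can be mimicked with a toy simulation. The update used here is a deliberately simplified stand-in and not the ART associative decay rule itself: the presented category is reinforced toward full strength while every other trace decays by a fixed fraction per presentation. The rates, the recognition threshold and the function names are all assumptions made for the example.

```python
def present(traces, category, learn=0.5, decay=0.01):
    """Reinforce the presented category and let every other trace decay."""
    updated = {}
    for name, strength in traces.items():
        if name == category:
            updated[name] = strength + learn * (1.0 - strength)
        else:
            updated[name] = strength * (1.0 - decay)
    return updated


def recognized(traces, threshold=0.2):
    """Categories whose trace is still strong enough to be recognized."""
    return sorted(name for name, s in traces.items() if s >= threshold)


def simulate(steps):
    """Start with six learned categories, then present only A-D in rotation."""
    traces = {name: 1.0 for name in "ABCDEF"}
    for step in range(steps):
        traces = present(traces, "ABCD"[step % 4])
    return traces


# Usage: after 400 presentations restricted to A-D, check what survives.
surviving = recognized(simulate(400))
```

Under these assumed rates the traces for E and F fall below the threshold while A-D remain strong, which is the wash-out behavior described in the text.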
It is therefore an object of the present invention to provide neural network apparatus in which memory traces are dynamically refreshed in order to counteract the loss of memory due to any cause.