Adaptive information processing systems have been extensively explored during the past several years. Some of the most notable systems include the Adaline and Madaline systems at Stanford Electronic Laboratory, the Perceptron at Cornell Aeronautical Laboratories, and the Minos I and II at Stanford Research Institute. Some of the U.S. patents which relate to adaptive information processing systems are U.S. Pat. No. 3,287,649 to Rosenblatt; U.S. Pat. No. 3,408,627 to Kettler et al.; U.S. Pat. No. 3,435,422 to Gerhardt et al.; U.S. Pat. No. 3,533,072 to Clapper; and U.S. Pat. No. 3,601,811 to Yoshino. This list of references is merely exemplary and constitutes only a small part of the large body of prior art in existence to date.
Such prior art adaptive information processing systems operate, in general, to produce an output response for a given input signal, which response is measured against some predetermined (correct) output response. These prior art systems are caused to modify themselves, or "learn", often in dependence upon the difference between the actual and the predetermined output response until the predetermined output response is achieved. The object of such a system is to have the system find its own way (by some algorithm) to a predetermined relation: input signal → output response.
It should be noted here that whenever the term "input signal" is used in this discussion it is intended to include the possibility of a set of separate input signals which are applied, substantially simultaneously, to a corresponding set of input terminals of an information processing system. Similarly, the term "output response" is intended to define the entire system response to a given input signal, although this response may comprise a plurality of individual output responses appearing substantially simultaneously at a set of system output terminals.
A typical prior art adaptive system is illustrated in FIG. 1. This system comprises, as its essential elements, a network of inputs 1, 2, 3, . . . , N, which are respectively connected to a plurality of variable weighting elements G_1, G_2, G_3, . . . , G_N having variable weights which, for example, may be variable gains in the case of weighting amplifiers or variable resistances in the case of variable resistors. The outputs of the weighting elements G are applied to a summer S which produces a single network output in proportion to the sum of the weighting element outputs. The weighting value of each weighting element G_1, G_2, G_3, . . . , G_N is individually controlled by means of a so-called "training algorithm" T that conditions the network to respond to a particular input signal with a desired output response.
In operation of the network, a particular signal is repetitively applied at the network inputs 1, 2, 3, . . . , N. After each application of the specific input signal, the network output response is compared to a predetermined desired output response, for example by means of a subtractor D, and the difference, or error, is utilized in the training algorithm to modify the weights of the individual weighting elements G_1, G_2, G_3, . . . , G_N.
Each application of the specific input signal, and the subsequent modification of the weighting elements G, is called a "training cycle". As successive training cycles occur, the network output response approaches more closely the desired output response until the network is conditioned to respond uniquely to the particular input signal which is to provide the desired output response.
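As a rough illustration of the training cycle just described, the following Python sketch models the weighting elements G, the summer S, and the subtractor D. The update rule shown (a correction proportional to the error times each input, commonly called the LMS or Widrow-Hoff rule) is an assumption chosen for illustration, since the text does not name a particular training algorithm T. The function names and the learning rate `rate` are likewise hypothetical.

```python
def network_output(weights, inputs):
    """Summer S: the sum of the weighting-element outputs G_i * input_i."""
    return sum(g * x for g, x in zip(weights, inputs))


def training_cycle(weights, inputs, desired, rate=0.1):
    """One training cycle: apply the input, form the error, adjust the weights."""
    # Subtractor D: difference between desired and actual output response.
    error = desired - network_output(weights, inputs)
    # Assumed training algorithm T (LMS-style); not specified by the text.
    new_weights = [g + rate * error * x for g, x in zip(weights, inputs)]
    return new_weights, error


def train(inputs, desired, cycles=50, rate=0.1):
    """Repetitively apply one input signal and record the error magnitudes."""
    weights = [0.0] * len(inputs)
    errors = []
    for _ in range(cycles):
        weights, error = training_cycle(weights, inputs, desired, rate)
        errors.append(abs(error))
    return weights, errors


# Hypothetical three-input signal and desired output response.
inputs = [1.0, 0.5, -1.0]
weights, errors = train(inputs, desired=2.0)
```

Under these assumptions the recorded error shrinks on every cycle, so the network output approaches the desired response as successive training cycles occur. Note that the update cannot be computed without `desired`, which is the dependence on a predetermined output that characterizes these prior art systems.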
In the adaptive information processing systems of the prior art, emphasis has been given to finding a suitable training algorithm which permits a system to "learn" or adapt to the applied input signals at a rapid rate. Needless to say, numerous ingenious algorithms have been devised; however, in all cases the training algorithm has been made dependent in some way upon the predetermined desired output which is to be generated in response to a given input.
It is an object of the present invention to provide an adaptive information processing system which has the ability to construct its own distinctive output response for any given input signal. In particular, it is an object of the present invention to provide a system with the striking characteristic that it can modify itself to construct an internal mapping (input signal → output response) that functions as a memory or a program without any outside intervention or choice as to what output response is desired or what input pattern is presented. This type of training procedure or self-modification of the adaptive information processing system will hereinafter be called "passive learning" or "passive modification".
The importance of this ability of a system to passively modify itself will be appreciated by considering a simple example. Because it is not necessary with such a system to know, beforehand, a predetermined, desired output response for a given input signal, it is possible to apply input signals with unknown content to the system and, after a period of training, determine the informational content of the input signals by considering the output responses. For instance, if the unknown input signals happen to be informational signals (having some unknown structure) that are buried in noise, since the structure of the output responses is isomorphic to that of the buried informational signals, the unknown structure will be mapped into and be represented by the output responses. In this way the unknown informational content of any input signals may be deciphered by the information processing system.
It is also an object of the present invention to provide an adaptive information processing system which, like the systems of the prior art, can produce a predetermined, desired output response to any given input signal. This procedure, which will hereinafter be called "active learning" or "active modification", requires knowledge on the part of the human operator of the desired output response to be associated with each individual input signal.
It is a further object of the present invention to provide an adaptive information processing system in which the learning growth rate -- that is, the rate at which the system trains itself to produce a particular output response in terms of the number of presentations of an input signal -- is very rapid. In particular, it is an object of the present invention to provide an information processing system having an exponential, rather than linear or other slower, learning growth rate.
It is a further object of the present invention to provide an adaptive information processing system that is capable of functioning as a memory which is distributed and largely invulnerable to the malfunction of individual components. Such a memory will be an adaptive and self-organizing memory that has the ability to acquire information solely as a result of experience. In addition, this distributed memory in general will have the capacity, reliability and accuracy of a conventional digital computer memory (such as a ferrite core) of the type that stores information at a local site.
It is a further object of the present invention to provide an adaptive information processing system which is capable of great density of storage. For example, it is noted that the information processing system is capable of realization by integrated circuitry and does not require discrete elements such as ferrite cores.
It is a further object of the present invention to provide an adaptive information processing system which is capable of great rapidity of operation; more particularly, a system in which on the order of 2^n bits of information, or more, can be recalled and/or processed in a single electronic operation (where n is the number of system output terminals).
Finally, and perhaps most importantly, it is an object of the present invention to provide an adaptive information processing system which is capable of exhibiting each of the following properties:
1. Recognition: The ability to produce a strong output response to an event or input signal that the system has seen before. Obviously, the information processing system will initially respond diffusely to a particular input signal. However, after successive presentations of that input signal the system will learn to "recognize" the input signal by producing a characteristic output response.
2. Recollection: The ability to produce a unique output response for each of a number of particular input signals. This characteristic provides the function of memory since the system is thereby able to produce a unique output response on its n output terminals (containing on the order of 2^n bits of information, or more) upon receipt of a particular input signal on its set of input terminals.
3. Generalization: The ability to extract a common element from a number of different events or input signals. In particular, if a number of differing input signals are successively applied to the information processing system, the system will learn to recognize a feature that is common to these input signals. For example, if a particular informational signal that is buried in noise is repeatedly applied to the system input terminals, the system will extract, retain, and subsequently recognize the informational signal.
4. Association: The ability to recall a first input signal upon receipt of a second after the two input signals have been applied to the information processing system more or less concurrently. That is, when two input signals are simultaneously applied, the system will not only learn these input signals, but will "associate" one with the other. Thus, at a future time, the system will be able to recall either one or both of the input signals if only one of the input signals is applied. This characteristic of association can be effective, for example, in the analysis of unknown signals. If two unknown input signals are applied to the system, the system will be able to determine whether one is related to the other in any way.
5. Retrieval From Partial (Fragmentary) Inputs: The ability to retrieve an entire input signal from a portion of that input signal. This characteristic may be viewed as a "self-association"; that is, "association" between parts of the same signal. If a particular input signal is applied to the system until it is "learned", the system will "associate" any portion of this signal with the entire signal so that, at a later time, the application of a portion of the input signal will result in the production by the system of the entire signal (usually with a reduced signal-to-noise ratio).
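The recognition and fragmentary-retrieval properties listed above can be illustrated with a small Python sketch of a distributed memory. It assumes a simple correlation (outer-product) modification rule, in which each stored value grows in proportion to the product of two components of the input signal. That rule, the names `learn`, `recall` and `sign`, and the example signal are illustrative assumptions and are not taken from this section. No desired output is supplied at any point, which is the sense in which the modification is "passive".

```python
def learn(memory, signal, rate=1.0):
    """Passive modification (assumed rule): memory[i][j] += rate * s_i * s_j."""
    n = len(signal)
    for i in range(n):
        for j in range(n):
            memory[i][j] += rate * signal[i] * signal[j]


def recall(memory, probe):
    """Output response: each output terminal sums the weighted probe components."""
    return [sum(m * p for m, p in zip(row, probe)) for row in memory]


def sign(values):
    """Reduce an output response to its +1 / -1 pattern."""
    return [1 if v >= 0 else -1 for v in values]


# Hypothetical 8-component input signal, presented three times.
signal = [1, -1, 1, 1, -1, -1, 1, -1]
memory = [[0.0] * len(signal) for _ in signal]
for _ in range(3):
    learn(memory, signal)

# A fragment: only the first half of the signal is applied; the rest is absent.
fragment = signal[:4] + [0] * 4

full_response = recall(memory, signal)
partial_response = recall(memory, fragment)
```

In this sketch `sign(partial_response)` reproduces the entire stored signal even though only half of it was applied, while each component of the partial response has half the magnitude of the full response, loosely mirroring the reduced signal strength noted in item 5. Because every stored value contributes to every output, no single element holds the signal by itself, which is the sense in which such a memory is distributed.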