1. Field of the Invention
The present invention relates to systems that include mechanisms operable to receive information and to analyze that information on the basis of a learning mode of operation.
2. Description of the Related Art
The present invention adds improvements to the prior inventions of the present inventor referenced above. While these prior inventions provide adequate self-organizing circuit features, improved performance and reduction in costs can be achieved by the additions disclosed herein.
The improvements are of two basic types: those that apply to improved circuit design and those that apply to improved "teaching" of the circuit. Improved circuit design includes, first, a method that better allows the circuit elements of a self-organizing circuit to learn new patterns quickly; second, a mechanism by which serial or sequential information can be learned; and third, mechanisms by which the circuits can be simplified by reducing the number of interconnections within the circuit. Improved teaching of the circuit includes ways by which the self-organizing circuit can be quickly taught new patterns: first, by making each input to a subcircuit compete against the many other inputs to that subcircuit; second, by weighting each input according to simple branch functions; and lastly, by incorporating a method by which information can be added to the circuit after the circuit has already learned some information. The circuit makes better distinctions between patterns by incorporating modified subcircuits which are change-sensitive and by making the subcircuit competition sensitive to change. Finally, a method of stabilizing and destabilizing subcircuits, using signals which are sent to all nodes, lets the subcircuits organize themselves into persistent patterns.
Pattern recognition includes the ability of a circuit to detect a pattern among variables despite the fact that the pattern is not precisely the same pattern as was previously learned. The variables can be any variable or set of variables from which a signal can be formed that is in some way functionally related to them. The types of variables fall into two broad categories: static variables and time-varying variables. For example, when a color-blind person tries to distinguish between letters or numerals of pastel dots, he is given static variables or static information. Time-varying variables for which patterns might be recognized include audio signals, as when a person tries to distinguish between the dash and dot patterns he hears in a Morse code signal.
Clearly, living organisms can accomplish this task of pattern recognition. People can recognize static information such as printed material (as the reader of these very words is now doing) and time-varying information such as how to swing a tennis racket so as to make proper contact with a tennis ball. Lower life forms also have this ability: certain ant species can recognize the foliage cover near their nests to orient themselves; certain moths can recognize the high-pitched sounds of a bat to avoid being captured; and even clams can learn primitive patterns of tactile responses which distinguish food from danger. Living organisms use electrochemical signals in the neurons of their brains or ganglia to perform this pattern recognition function.
While very complicated computers have been built which can do enormous numbers of calculations at speeds far exceeding the simple calculations done by house flies and clams, such computers have not yet achieved pattern recognition at the level of these primitive organisms. A major difference is that people tell the computers what to do whereas flies and clams tell themselves what to do. The former are essentially preprogrammed to do certain sequences in attempts to recognize patterns in space or in time, while the latter self-organize to "learn" to recognize patterns which are important to them. In each case, a certain amount of information is already known: in the computer it is a programming language (software) plus the myriad of interconnections in its circuitry; in the living organism it is its instincts or programmed patterns plus the myriad of interconnections in its neural circuitry.
It will be noted that in the last few years considerable research has been devoted to neural networks based on an approach by John Hopfield (see, for example, Proc. Natl. Acad. of Sci., Vol. 81, pp. 3088-3092, May 1984). When "taught" patterns, these neural networks have some of the same properties as the prior patents (U.S. Pat. Nos. 4,479,241, 4,774,677, 4,989,256 and 5,161,203 by the present inventor) and the present invention. For example, both methods can take arbitrary input patterns of binary information and detect when one of several learned patterns is present. Both methods use a multiplicity of "voter" subcircuits having simple binary outputs determined by combining neighboring outputs, weighting them either positively or negatively. Both methods are insensitive to noise: the input patterns during learning or recognition tasks may be only approximate copies of the exact input patterns, and the correct pattern is still detected. In a variation of the Hopfield algorithm, Geoff Hinton and Terry Sejnowski use random outcomes of the subcircuits to better allow their networks to stabilize on a particular pattern (Cognitive Science, Vol. 9, 1985), much as the present invention uses random outcomes in eliminating the need for training of intermediate subcircuits.
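The "voter" behavior described above can be illustrated with a minimal sketch. It is not the circuit of the present invention or of the cited networks; the function name `voter_output`, the zero threshold, and the particular weights are assumptions chosen only to show a binary output formed from positively and negatively weighted neighboring outputs, and to show that an approximate copy of a pattern can still produce the same output.

```python
def voter_output(neighbor_outputs, weights, threshold=0):
    """Hypothetical binary voter: weighted sum of neighboring binary
    outputs, compared against a threshold (assumed to be zero here)."""
    total = sum(w * x for w, x in zip(weights, neighbor_outputs))
    return 1 if total > threshold else 0

# Illustrative weights: three inputs vote "for", two vote "against".
weights = [2, 2, 2, -1, -1]

exact = [1, 1, 1, 0, 0]   # the pattern the weights favor
noisy = [1, 1, 0, 0, 1]   # approximate copy with two inputs flipped
other = [0, 0, 0, 1, 1]   # a different pattern

print(voter_output(exact, weights))  # prints 1 (sum is 6)
print(voter_output(noisy, weights))  # prints 1 (sum is 3)
print(voter_output(other, weights))  # prints 0 (sum is -2)
```

In this sketch the noisy copy still clears the threshold because the remaining positive votes outweigh the single negative one, which is the sense in which such voters tolerate approximate inputs.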
But here the similarity ends. Hopfield, Hinton, Sejnowski and their colleagues all use "network optimization" methods for training their networks. Rather than using local outcomes of nearby nodes to adjust the interactions between subcircuits as does the present invention, neural networks optimize the network in total. Errors are detected at the input and output subcircuits and interactions between subcircuits are adjusted based on network-wide optimizations rather than on local competition between the subcircuits. In addition, present neural networks deal with time-varying patterns of inputs by transforming them into combinational patterns for which network optimization methods are well suited. The present invention can accept either combinational or sequential patterns as inputs and can output either combinational or sequential patterns as outputs.
Since neural networks rely on an optimization method, every node in the network must be adjusted as part of the learning process. As the number of nodes becomes large, the time required to learn input patterns becomes very large; some estimates show that learning time is proportional to the cube of the number of nodes in the network. In addition, all input patterns to neural networks must be learned at the same time: during learning, all the input patterns are cycled through over and over again as the network adjusts to all the possible input sets. To learn a single new input pattern, the entire original input pattern set plus the new pattern must be learned again in its entirety.
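To give a sense of scale for the cubic estimate mentioned above, the following sketch computes learning cost relative to a baseline network. The baseline of 1,000 nodes and the function name `relative_learning_cost` are assumptions for illustration; the exponent of 3 is taken from the estimate cited in the text, not from any measurement.

```python
def relative_learning_cost(n_nodes, baseline_nodes=1000, exponent=3):
    """Learning cost relative to a baseline network, assuming cost
    grows as (number of nodes) ** exponent."""
    return (n_nodes / baseline_nodes) ** exponent

print(relative_learning_cost(2_000))   # prints 8.0
print(relative_learning_cost(10_000))  # prints 1000.0
```

Under this assumption, doubling the network multiplies learning time by eight, and a tenfold increase multiplies it by a thousand.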
Consequently, most neural network solutions have been limited to relatively small networks, typically fewer than a thousand nodes, in order to keep learning time reasonable. Where larger networks are required, as in translating kanji characters, smaller networks are combined to give the proper output. Several small networks each work on a subset of the identification problem, and other neural networks then combine the intermediate results.
By contrast, the self-organizing circuits and algorithms described use a direct approach rather than an optimization approach. Computations are not performed on all nodes, but rather only on nodes which meet the time filtering criteria. This time filtering identifies just those few nodes which require modification; learning a new input pattern takes approximately the same computation time regardless of the size of the network. Networks are not limited to a small number of nodes.
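The idea that only a few filtered nodes are modified can be sketched as follows. The time filtering criteria are not defined in this section, so the sketch substitutes a purely hypothetical criterion (a node qualifies if its output changed within a recent window), and the `Network` class, its methods, and the weight increment are all assumptions for illustration rather than the disclosed method.

```python
class Network:
    """Toy network in which learning touches only recently changed nodes."""

    def __init__(self, n_nodes):
        self.weights = [0] * n_nodes
        # Hypothetical bookkeeping: node index -> time of last output change.
        self.recent = {}

    def record_change(self, node, t):
        self.recent[node] = t

    def learn(self, now, window, delta=1):
        """Adjust only nodes that pass the assumed time filter.
        Returns the number of nodes modified."""
        selected = [n for n, t in self.recent.items() if now - t <= window]
        for n in selected:
            self.weights[n] += delta
        return len(selected)

for size in (1_000, 100_000):
    net = Network(size)
    net.record_change(5, t=9)
    net.record_change(42, t=10)
    net.record_change(7, t=2)    # changed too long ago to qualify
    print(size, net.learn(now=10, window=3))  # 2 nodes modified in both cases
```

In this sketch the learning step visits only the recorded nodes, so the work done is the same for both network sizes, which mirrors the claim that learning time need not grow with the size of the network.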
Moreover, new patterns can be learned by the present invention after the network has already learned other patterns. The network of nodes need not learn all the input patterns at once but can add information over time. Humans seem to use a similar technique: we don't have to relearn everything we know just to learn, say, a new phone number.