This invention relates to hierarchical stacked neural networks that learn through processing information and constructing new actions in a manner that simulates cognitive development in the human brain. Such networks are used to build systems that learn and make complex decisions in the same manner as the human brain. The present invention models the ordered stages that the brain moves through during development that allow it to perform increasingly complex actions at higher stages of development. In this developmental process, actions performed at a particular stage of development are created by ordering, combining, and transforming the actions performed in the immediately preceding stage. As a result of this process, at each stage of development more complex actions can be performed than those performed at the immediately preceding stage.
Actions include all operations performed by a neural network that result in a change of state of the system. Actions are combined to perform tasks. More complex actions permit the performance of more complex tasks.
Prior-art neural networks, in contrast to the present invention, are not modeled on the cognitive development of the human brain. They employ simple models of both biological systems and the physiological structure of the brain to process information and perform tasks. When prior-art, architecturally distinct neural networks are linked together to form hierarchies, the complexity of the actions performed in consecutive neural networks does not increase at higher levels in a hierarchy. Actions performed in lower-level networks in the hierarchy are not systematically ordered, combined, and transformed to create higher-stage actions in higher-level networks in the hierarchy in the manner that the human brain uses during learning and development. As a result, prior-art neural networks, whether or not hierarchical, cannot perform many of the complex tasks that humans perform easily.
Neural networks were developed initially to overcome the limitations of expert systems that solve problems and make decisions based on predetermined decision sets and responses. These expert systems have no intelligence, that is, they lack the ability to learn, because they are able to solve only those problems that their creators have already solved. Neural networks were created to overcome these limitations with models of neural systems that simulate the brain's capacity to learn novel representations and to tolerate ambiguity, both symbolic and informational.
Neural networks are based on simple models of how neural systems function in the brain. The primary components of neural systems in the brain are neurons. In the brain, each neuron is typically connected electrochemically to thousands of other neurons. A neuron is activated when the electrochemical stimulation that it receives from surrounding neurons reaches a threshold value. The neuron then “fires,” sending electrochemical signals that either activate or inhibit surrounding neurons, which may in turn become activated and “fire.”
The simplest prior-art neural networks comprise a series of artificial neurons. Unidirectional signals pass between artificial neurons over predetermined connections. Each neuron typically receives signals from a number of other neurons. Each connection between one neuron and another has a weight associated with it that represents the strength of the sending neuron's signal. The receiving neuron multiplies each incoming signal by the weight of its connection, sums the weighted signals, and applies an activation function to the sum to compute whether the neuron will fire. When the neuron fires, it sends signals that either activate or inhibit other internal neurons or cause the network to output an external response. Connection weights between neurons are adjusted by training algorithms based on the neural network's production of successful outputs. These connection weights constitute the neural network's knowledge or learning.
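For illustration only, the artificial-neuron mechanics described above can be sketched as follows. The function name, weights, and threshold value are hypothetical and not part of the specification; a simple step activation is assumed.

```python
# Illustrative sketch of a single artificial neuron: each incoming
# signal is multiplied by its connection weight, the weighted signals
# are summed, and a step activation decides whether the neuron "fires".
# Positive weights model activating connections; negative weights model
# inhibiting connections.

def fire(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0

# Two activating connections and one inhibiting connection:
print(fire([1, 1, 1], [0.7, 0.6, -0.2], 1.0))  # fires: prints 1
print(fire([1, 0, 1], [0.7, 0.6, -0.2], 1.0))  # does not fire: prints 0
```

Training algorithms of the kind mentioned above would adjust the entries of `weights` based on whether the network's outputs are successful; the adjusted weights are what the network has "learned."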
Prior-art neural networks improve task performance, information processing, and decision making by expanding the core biological model of neural function and brain structure. Approaches include, but are not limited to, adding layers of neurons, increasing the interconnections between neurons, restructuring the relationships between groups of neurons to more closely parallel brain structure, and developing more efficient training and processing algorithms through more sophisticated mathematical modeling. Because prior-art neural networks model only neural function and the physiological structure of the brain, they are limited in their capacity to perform tasks, solve problems, and manipulate complex information in the same manner as the human brain, which performs complex tasks by hierarchically combining and ordering lower-stage actions.
To increase the capacity of prior-art neural networks to solve problems accurately and to expand their capacity for abstraction, some prior-art systems comprise more than one neural network. Architecturally distinct neural networks are linked to other networks hierarchically, in parallel, in tree structures, or in other configurations. Such linked neural networks allow greater levels of abstraction and multiple views of problems. In prior-art neural networks that are linked hierarchically, information moves up through the system of neural networks, with output from each lower-level neural network cascading up to the level above it. The lower levels identify patterns based on the input stimuli. These patterns are then fed to the higher levels, with input noise reduced and increasingly narrow representations identified as output moves from one neural network to the next. In this movement through the series of networks, a winnowing process takes place, with information reduced as decisions are made concerning the identity of the object or concept represented by a pattern. Thus, in the process of eliminating the noise in the input stimuli, the complexity, subtlety, and meaning of the information are lost. Neural networks at higher levels operate on less information than neural networks at lower levels, and their tasks become simpler rather than more complex. The result is that complexity and context, which are critical for meaning, are lost.
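The winnowing process described above can be sketched, for illustration only, as a stack of levels in which each level passes upward only a narrower subset of what it receives. The function name, the feature values, and the use of simple truncation as a stand-in for each network's pattern-reduction step are all hypothetical.

```python
# Illustrative sketch of the prior-art hierarchical cascade: each level
# consumes the output of the level below and emits a narrower
# representation, so higher levels operate on progressively less
# information rather than performing more complex tasks.

def level(features, n_keep):
    """Keep only the n_keep strongest features -- a crude stand-in for
    the winnowing each network performs before passing output upward."""
    return sorted(features, reverse=True)[:n_keep]

signal = [0.9, 0.1, 0.7, 0.3, 0.8, 0.2]   # raw input pattern
for n_keep in (4, 2, 1):                   # three stacked levels
    signal = level(signal, n_keep)
print(signal)  # the top level sees a single value: [0.9]
```

The sketch makes the section's criticism concrete: by the time information reaches the top of such a hierarchy, the context that carried its meaning has been discarded.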
This simplification is contrary to how the human brain operates at higher stages of development. At such higher stages, the brain's capacity to process complex information and distinguish between finer shades of meaning increases rather than decreases. The brain increases its capacity during development by ordering, combining, and transforming lower-level actions to construct new actions that respond to a richer array of stimuli and greater levels of meaning, with the result that the brain can perform more complex tasks.
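By contrast, the developmental principle described above can be sketched, again for illustration only, as the construction of a higher-stage action by ordering and combining lower-stage actions. The function names and the choice of simple arithmetic functions as stand-ins for lower-stage actions are hypothetical.

```python
# Illustrative sketch of the developmental principle: a higher-stage
# action is constructed by ordering and combining lower-stage actions
# (here, simple functions), yielding behavior more complex than any
# single constituent action.

def double(x):
    return 2 * x

def increment(x):
    return x + 1

def compose(*actions):
    """Combine lower-stage actions, applied left to right, into a
    single higher-stage action."""
    def combined(x):
        for act in actions:
            x = act(x)
        return x
    return combined

higher_stage = compose(increment, double)  # order matters: (x + 1) * 2
print(higher_stage(3))  # prints 8
```

Note that the ordering itself carries information: `compose(double, increment)` yields a different action, which is the sense in which ordering, combining, and transforming lower-stage actions produces genuinely new higher-stage actions.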
Prior-art neural networks modeled on core neural processes and brain structure have made advances in artificial intelligence, such as recognizing written letters and numbers, reading English text aloud, playing rule-based games such as chess, and determining whether sonar echoes have been reflected from undersea rocks or from mines. They have not been able to simulate the human brain's capacity to assign higher levels of meaning to speech and deduce complex interrelationships between objects, time sequences, and conceptual categories. In other words, prior-art neural networks cannot perform the higher-level cognitive operations performed by the human brain. While neural systems and brain structure provide a basis for neural networks to perform low-level cognitive functions, such a reductionist model limits the capacity of prior-art neural networks to learn how to perform higher-level actions and thereby to make decisions based on meaning and the nuances among input stimuli. Therefore, prior-art neural networks simulate the functioning of the human brain on only a simple neuronal level that lacks the human brain's capacity to make the higher-level distinctions and perform the higher-level tasks that humans perform with ease. Thus there is a need for a system of hierarchical stacked neural networks that can simulate the human brain more closely and for methods to employ such a system to learn to perform tasks on its own.