1. Technical Field
The present invention relates to information processing systems, and in particular to a system which rapidly learns to replicate observed responses to information.
2. Discussion
One goal of information processing systems research is to minimize the need for manual steps in the information processing task. While in the current state of the art many aspects of information processing are performed automatically, those stages which still rely on manually performed operations are generally seen as the weak link in the performance of the overall system.
To be specific, in information processing systems which perform such functions as signal processing, image processing, pattern recognition, process control, and resource allocation, manually performed operations result in a number of disadvantages. These include: 1) inconsistent performance, which can result from human factors such as fatigue, forgetting, etc.; 2) poor response time, which is inherent in the speed of human responses versus those of electronic systems; 3) too few "experts", which is common when the manual operation requires a person of a high level of training and skill; 4) non-robust/non-adaptive response, which typically arises from limitations in the ability of the system (or human-interface) to adapt to novel situations, changing conditions, etc.; and 5) unfriendly/interfering interaction, which results from awkward user interfaces and burdensome training procedures.
A number of approaches have been used to improve the automation of the above-discussed kinds of information processing tasks in order to avoid the many problems associated with manual steps. However, these approaches are generally unsatisfactory for various reasons. For example, direct automation is often too complex and costly to be practical. Artificial intelligence and expert system approaches are difficult to configure and result in a non-general system useful for only a narrow range of applications.
Explicit algorithms are another commonly used technical approach. These include, for example, the Simplex and Greedy algorithms, as well as fixed and heuristic algorithms. While these algorithms can operate relatively fast, they are generally computationally expensive and require considerable effort to explicitly set up the problem. Also, they are not trainable and do not adapt well to variations in the data or problem structure. Conventional neural network and fuzzy logic systems are often not dependable or robust enough for many applications. Genetic algorithms are often impractical because they are usually slow and ungainly. Conventional adaptive control systems are generally non-evolutionary in that they can only adapt within a very narrow range and cannot operate when the input/output parameters are significantly altered. Consequently, in the many settings wherein the above approaches have been employed, the systems have usually not progressed beyond the "toy" phase, and users often revert to former manual techniques.
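To illustrate the explicit problem setup such algorithms demand, the following sketch applies a greedy selection rule to a simple change-making task. The denominations and objective are hypothetical illustrations chosen for clarity, not part of the present disclosure:

```python
def greedy_change(amount, denominations):
    """Greedy selection: repeatedly take the largest denomination that fits.

    Note that the problem structure (the denominations, their ordering,
    and the objective) must be encoded explicitly up front; the routine
    is not trainable and does not adapt if the problem changes in character.
    """
    coins = []
    for d in sorted(denominations, reverse=True):
        while amount >= d:
            amount -= d
            coins.append(d)
    if amount != 0:
        raise ValueError("no exact change with these denominations")
    return coins

print(greedy_change(63, [25, 10, 5, 1]))  # [25, 25, 10, 1, 1, 1]
```

The greedy rule happens to be optimal for these denominations, but a different denomination set would require re-examining the setup by hand, which is the kind of explicit configuration effort noted above.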
This reaction is not surprising since humans can generally detect subtle patterns and perform data analysis, synthesis, and fusion much better than many of the currently available automated techniques. Nevertheless, automated assistance would be highly desirable to improve performance in four main areas: 1) speed, 2) repeatability, 3) dependability, and 4) distribution of expertise.
Thus it would be desirable to provide a system which is a self-organizing adaptive replicate of human (expert) behavior. It would also be desirable to have such a system which can learn from the behavior of humans (or other systems) without requiring explicit rules and instructions, and without interfering with the behavior it is learning from. It would further be desirable to provide such a system which can be used to either assist the human in his performance of the task or, once trained, to take over the task entirely.
One approach toward a system with the above-described desired features is a self-organizing neural network architecture known as the Adaptive Resonance Theory (ART). This approach is attractive in part due to its ability to self-organize by adding processing nodes as required. However, the ART Network is generally too complicated and computationally-intensive for many kinds of implementations. For further information regarding the ART Network see S. Grossberg, "Competitive Learning: From Interactive Activation To Adaptive Resonance", Cognitive Science 11:23-63 (1987). Another related neural network approach is known as the Boltzmann Machine. However, the Boltzmann Machine is not robust enough to achieve the desired goals. For further information on Boltzmann Machines see G. E. Hinton and T. J. Sejnowski, "Learning and Re-Learning in Boltzmann Machines", in Parallel Distributed Processing, Volume 1, pp. 282-317, Cambridge, Mass.: MIT Press (1986).
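The self-organizing, node-adding behavior attributed to ART above can be suggested by the following simplified sketch. The similarity measure, vigilance parameter, and fast-learning update shown here are illustrative simplifications and do not reproduce the full ART dynamics described by Grossberg:

```python
import numpy as np

def art_like_cluster(patterns, vigilance=0.7):
    """Crude sketch of ART-style self-organization.

    Each input either "resonates" with an existing prototype
    (similarity >= vigilance) or causes a new category node to be
    added, so the network grows as required by the data.
    """
    prototypes = []
    labels = []
    for p in patterns:
        p = np.asarray(p, dtype=float)
        best, best_sim = None, -1.0
        # Search for the best-matching existing category node.
        for i, proto in enumerate(prototypes):
            sim = np.minimum(p, proto).sum() / max(p.sum(), 1e-9)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= vigilance:
            # Resonance: refine the matched prototype (fast learning).
            prototypes[best] = np.minimum(prototypes[best], p)
            labels.append(best)
        else:
            # Mismatch: self-organize by adding a new node.
            prototypes.append(p.copy())
            labels.append(len(prototypes) - 1)
    return labels, prototypes

labels, protos = art_like_cluster([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]])
print(labels)  # [0, 0, 1]
```

Even in this stripped-down form, the bookkeeping of prototype search, vigilance testing, and node creation hints at why full ART implementations become complicated and computationally intensive.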
Another important neural network architecture is the three layer perceptron. With a non-linear hidden layer, a three layer perceptron can in principle realize an arbitrary continuous mapping between its input and output spaces. Also, the distributed architecture of the multi-layer perceptron allows it to handle noisy or corrupted inputs and network conditions. However, the commonly used training paradigm for the three layer perceptron, known as backpropagation, suffers from a number of disadvantages. Backpropagation learning in the perceptron is generally slow, and it involves relatively complex calculations. Moreover, this approach does not work well for training on real-world, real-time inputs, since the training set must be specially ordered to prevent early training on one type of example from being "forgotten" by the network after subsequent training on another example. Further, the supervised training employed with backpropagation complicates the training process unduly, requiring complete retraining even when only new data becomes available.
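The backpropagation procedure discussed above may be sketched as follows for a minimal three layer perceptron trained on the XOR mapping. The architecture, learning rate, and iteration count are illustrative assumptions; the thousands of gradient iterations over a carefully prepared training set reflect the slowness and ordering sensitivity noted above:

```python
import numpy as np

# Minimal three layer perceptron trained by backpropagation on XOR.
# Hidden-layer size, learning rate, and epoch count are illustrative.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    h = sigmoid(X @ W1 + b1)                 # non-linear hidden layer
    out = sigmoid(h @ W2 + b2)               # forward pass
    d_out = (out - y) * out * (1 - out)      # output-layer error signal
    d_h = (d_out @ W2.T) * h * (1 - h)       # error backpropagated to hidden layer
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(0)

print(np.round(out.ravel(), 2))
```

With successful training, the printed outputs approach the XOR targets [0, 1, 1, 0]; the need for repeated presentation of the entire supervised training set, and the full retraining required if the data changes, are the disadvantages identified above.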
Hence, in order to achieve a system with the above-described features it would be desirable to provide a neural network type architecture which is able to add or subtract nodes as required while at the same time employing a relatively simple, easy to implement architecture that avoids complex calculations. Further, it would be desirable to provide a neural network architecture which learns (or re-learns) rapidly from a training data set which it receives from the real world in real-time, without requiring any reordering of the training data, or complete re-training.