1. Field of the Invention
The present invention relates to learning apparatuses, learning methods, and programs. More specifically, the present invention relates to a learning apparatus, a learning method, and a program with which dynamics can be learned efficiently.
2. Description of the Related Art
It is known that actions (movements) of robots can be described as dynamical systems defined by time-evolution rules, and that dynamical systems of various actions can be implemented by specific attractor dynamics.
For example, walking movements of a bipedal robot, such as a humanoid robot, can be described as limit cycle dynamics, which are characterized in that the states of movement of a system converge to a specific periodic orbit from various initial states. This is described, for example, in G. Taga, 1998, “Self-organized control of bipedal locomotion by neural oscillators in unpredictable environment”, Biological Cybernetics, 65, 147-159, and Gentaro Taga, “Nou to shintai no douteki dezain—Undou chikaku no hisenkei rikigakukei to hattatsu” (Dynamical design of the brain and the body—Non-linear dynamical system and development of movement and perception), Kaneko Shobo. Furthermore, a reaching operation in which an arm robot extends its arms toward a certain object can be described as fixed-point dynamics, which are characterized in that various initial states converge to a specific fixed point. Furthermore, it is also said that any movement can be implemented by a combination of discrete movements that can be implemented by fixed-point dynamics and cyclic movements that can be implemented by limit cycle dynamics.
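The two kinds of attractor dynamics mentioned above can be illustrated with a minimal numerical sketch. The time-evolution rules used here (a linear pull toward a target and a Hopf-type oscillator), the forward Euler integration, and all names and parameter values are assumptions chosen for illustration; they are not taken from the cited references.

```python
import math

def simulate(rule, state, n_steps=2000, dt=0.01):
    """Integrate a time-evolution rule with the forward Euler method."""
    for _ in range(n_steps):
        derivative = rule(state)
        state = tuple(s + dt * d for s, d in zip(state, derivative))
    return state

def fixed_point_rule(state, target=(1.0, 0.5)):
    # Fixed-point dynamics: every state is pulled toward `target`
    # (for example, the goal position of a reaching movement).
    return tuple(t - s for s, t in zip(state, target))

def limit_cycle_rule(state, omega=1.0):
    # Limit cycle dynamics (Hopf-type oscillator): the radius is attracted
    # to 1 while the phase keeps rotating at angular velocity `omega`.
    x, y = state
    r2 = x * x + y * y
    return (x * (1.0 - r2) - omega * y, y * (1.0 - r2) + omega * x)

initial_states = [(0.1, 0.0), (2.0, 0.5), (-1.5, -1.5)]

# All initial states end up at (approximately) the same fixed point.
fixed_ends = [simulate(fixed_point_rule, s) for s in initial_states]

# All initial states end up on (approximately) the same periodic orbit of radius 1.
cycle_radii = [math.hypot(*simulate(limit_cycle_rule, s)) for s in initial_states]
```

In this sketch, different initial states converge either to the same point or to the same periodic orbit, which is the property that distinguishes fixed-point dynamics from limit cycle dynamics.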
Issues that are to be addressed in order to control actions (movements) of a robot according to attractor dynamics include designing attractor dynamics in accordance with tasks, and generating appropriate motor outputs according to the attractor dynamics on the basis of information obtained from sensor inputs. For this purpose, outputs for actions of the robot should be generated in such a manner that the attractor dynamics continuously interact with the environment.
Methods for learning attractor dynamics, instead of designing them manually, have been proposed. One of these methods uses a recurrent neural network (hereinafter referred to as an RNN). The RNN includes context units that are connected to the network via a feedback loop. It is known that, theoretically, arbitrary dynamical systems can be approximated by holding internal states in the context units.
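The role of the context units can be sketched as follows. This is a simplified illustration under assumed conditions: the class name `TinyRNN`, the single-layer structure, the tanh activation, and the random untrained weights are all hypothetical and do not represent the RNN of any particular reference.

```python
import math
import random

class TinyRNN:
    """Minimal RNN whose context units are fed back as part of the next input."""

    def __init__(self, n_in, n_context, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        # Both the output and the next context depend on (input, current context).
        self.w_out = matrix(n_out, n_in + n_context)
        self.w_context = matrix(n_context, n_in + n_context)
        self.context = [0.0] * n_context  # internal state held by the context units

    def step(self, x):
        joint = list(x) + self.context
        output = [math.tanh(sum(w * v for w, v in zip(row, joint)))
                  for row in self.w_out]
        # Feedback loop: the new context becomes part of the next step's input.
        self.context = [math.tanh(sum(w * v for w, v in zip(row, joint)))
                        for row in self.w_context]
        return output

rnn = TinyRNN(n_in=2, n_context=3, n_out=1)
first = rnn.step([1.0, 0.0])
second = rnn.step([1.0, 0.0])  # same input, but the internal state has changed
```

Because the context units carry an internal state from one step to the next, the same external input can yield different outputs at different times, which is what allows such a network to represent a dynamical system rather than a static input-output mapping.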
However, in a learning model composed of one tightly connected network module, when a large number of dynamics are learned in order to learn actions on a large scale, considerable interference occurs among the dynamics that are to be stored, so that learning becomes difficult.
In view of this problem, several learning models employing modular architectures have been proposed. In a modular architecture, a plurality of network modules are combined to form a single learning model. In the modular architecture, in principle, the number of dynamics that can be stored can readily be increased by increasing the number of modules. However, an issue arises as to the selection of the module that is to be used for learning of a given learning sample.
Depending on the method of module selection, learning methods can be classified into supervised learning and unsupervised learning. In supervised learning, assignment of learning samples to modules is determined manually. On the other hand, in unsupervised learning, assignment of learning samples to modules is determined autonomously by the learning model. In order for a robot or a system to perform learning autonomously, unsupervised learning is to be employed for learning of modules.
As a method for learning of modules by unsupervised learning, a learning model called the mixture of RNN experts has been proposed. The mixture of RNN experts is described, for example, in Japanese Unexamined Patent Application Publication No. 11-126198. According to this learning model, outputs of a plurality of RNN modules are integrated by gate mechanisms to determine a final output, and learning of the individual RNN modules proceeds by adjusting the gates according to maximum likelihood estimation so as to maximize the performance of the final output.
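The integration of module outputs by gates can be sketched as follows. This is only an illustration of the general idea: the softmax form of the gate, the function names, and the numerical values are assumptions, and the maximum likelihood learning of the gates is omitted.

```python
import math

def softmax(scores):
    """Turn arbitrary gate scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_output(expert_outputs, gate_scores):
    """Final output = gate-weighted sum of the outputs of all modules."""
    gates = softmax(gate_scores)
    dim = len(expert_outputs[0])
    final = [sum(g * out[i] for g, out in zip(gates, expert_outputs))
             for i in range(dim)]
    return final, gates

# Hypothetical outputs of two modules for the same time step.
expert_outputs = [[1.0, 0.0], [0.0, 1.0]]

even_output, even_gates = mixture_output(expert_outputs, [0.0, 0.0])
peaked_output, peaked_gates = mixture_output(expert_outputs, [10.0, 0.0])
```

Since the final output depends on every gate and every module at once, adjusting the gates is a global optimization over the whole model.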
However, because this method is based on global optimization, learning becomes difficult when the number of modules becomes very large.
On the other hand, in methods such as the self-organization map (hereinafter referred to as SOM) or neural gas, which are used for learning categories of vector patterns, learning rules based on global optimization are not used, so that optimality is not ensured. However, it is known that these methods allow an appropriate category structure to be learned in a self-organizing manner by unsupervised learning. With these methods, learning is practically possible even when the number of modules is very large. The SOM is described, for example, in T. Kohonen, “Jiko soshikika mappu” (Self-organization map), Springer-Verlag Tokyo. The neural gas is described, for example, in T. M. Martinetz, S. G. Berkovich, K. J. Schulten, “‘Neural-Gas’ Network for Vector Quantization and its Application to Time-Series Prediction”, IEEE Trans. Neural Networks, VOL. 4, NO. 4, pp. 558-569, 1993.
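A learning rule of this local, competitive kind can be sketched as follows, using a rank-based update in the style of neural gas. The function name, the learning rate `eps`, the decay constant `lam`, and the sample values are assumptions made for illustration; the schedule by which these parameters are usually decreased during learning is omitted.

```python
import math

def neural_gas_update(units, x, eps=0.5, lam=1.0):
    """Move every unit toward input x, with a step that decays with distance rank."""

    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

    # Rank 0 is the unit closest to the input (the winner).
    order = sorted(range(len(units)), key=lambda i: sq_dist(units[i]))
    updated = [None] * len(units)
    for rank, i in enumerate(order):
        rate = eps * math.exp(-rank / lam)  # smaller step for lower-ranked units
        updated[i] = [wi + rate * (xi - wi) for wi, xi in zip(units[i], x)]
    return updated

# Three hypothetical weight vectors and one input pattern.
units = [[0.0, 0.0], [4.0, 0.0], [10.0, 0.0]]
x = [1.0, 0.0]
updated = neural_gas_update(units, x)
```

Each update depends only on the ranking of the units for the current input, not on an objective defined over the whole model, which is why this kind of rule remains practical for a large number of units even though optimality is not ensured.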