1. Field of the Invention
The invention relates to a learning device, a learning method and a program and, more particularly, to a learning device, a learning method and a program that are able to obtain a pattern learning model having scalability and generalization capability.
2. Description of the Related Art
A pattern learning model that learns a pattern may be, for example, an RNN (Recurrent Neural Network), an RNNPB (Recurrent Neural Net with Parametric Bias), or the like. The learning schemes of these pattern learning models are classified into a “local representation” scheme and a “distributed representation” scheme.
In the “local representation” scheme, a plurality of patterns are learned by a plurality of learning modules, one pattern per learning module, and each learning module learns a pattern learning model (that is, updates the model parameters of a pattern learning model). Thus, one learning module stores one pattern.
In contrast, in the “distributed representation” scheme, a plurality of patterns are learned in one learning module. Thus, one learning module stores a plurality of patterns at the same time.
In the “local representation” scheme, one learning module stores one pattern, that is, one pattern learning model learns one pattern. Thus, there is little interference between the memory of a pattern in one learning module and the memory of a pattern in another learning module, and the memory of a pattern is highly stable. The “local representation” scheme is therefore excellent in scalability, in that a new pattern can easily be learned by adding a learning module.
However, in the “local representation” scheme, one pattern learning model learns one pattern, that is, each of the plurality of learning modules memorizes its pattern independently. Therefore, it is difficult to obtain generalization capability by structuring (commonizing) the relationship between the memories of patterns held in the plurality of learning modules. For example, it is difficult to generate what may be called an intermediate pattern, which differs from the pattern stored in one learning module and also differs from the pattern stored in another learning module.
On the other hand, in the “distributed representation” scheme, one learning module stores a plurality of patterns, that is, one pattern learning model learns a plurality of patterns. Thus, it is possible to obtain generalization capability by commonizing memories of a plurality of patterns owing to interference between the memories of the plurality of patterns in one learning module.
However, in the “distributed representation” scheme, the same interference makes the stability of the memories of patterns low, so the scheme lacks scalability.
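The trade-off between the two schemes can be illustrated with a minimal Python sketch. It is not taken from any of the cited works: the one-parameter linear predictor standing in for a pattern learning model, the least-squares fit standing in for learning, and all names (`fit_ratio`, `geometric`, `local_modules`, `shared_before`, `shared_after`) are assumptions chosen for brevity.

```python
# Toy illustration only: a one-parameter predictor x[t+1] ~ w * x[t]
# stands in for a pattern learning model such as an RNN (assumption).

def fit_ratio(sequences):
    """Least-squares fit of w over all consecutive pairs in the sequences."""
    num = sum(x * y for seq in sequences for x, y in zip(seq, seq[1:]))
    den = sum(x * x for seq in sequences for x in seq[:-1])
    return num / den

def geometric(ratio, length=4):
    """A pattern whose next value is always ratio times the current one."""
    return [ratio ** t for t in range(length)]

pattern_a, pattern_b, pattern_c = geometric(0.5), geometric(0.9), geometric(1.2)

# "Local representation": one learning module (one parameter) per pattern.
local_modules = {"a": fit_ratio([pattern_a]), "b": fit_ratio([pattern_b])}
before = dict(local_modules)
local_modules["c"] = fit_ratio([pattern_c])  # add a module for the new pattern
local_unchanged = all(local_modules[k] == before[k] for k in before)

# "Distributed representation": one learning module shared by all patterns.
shared_before = fit_ratio([pattern_a, pattern_b])
shared_after = fit_ratio([pattern_a, pattern_b, pattern_c])
shared_shift = abs(shared_after - shared_before)

print(local_modules, local_unchanged)
print(shared_before, shared_after, shared_shift)
```

In this toy setting, the modules for patterns a and b are untouched when a module for pattern c is added, whereas the single shared parameter moves when the new pattern is included. Before pattern c is added, the shared parameter lies between the ratios of patterns a and b, which loosely corresponds to the intermediate pattern discussed above.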
Here, Japanese Unexamined Patent Application Publication No. 2002-024795 describes two RNNs, one of which learns a pattern and the other of which learns another pattern that correlates with that pattern. In learning, the contexts of the two RNNs are changed on the basis of an error between the contexts of the two RNNs. After learning, the context of one of the two RNNs is used as the context of the other RNN, that is, the context of one of the RNNs is caused to influence the context of the other RNN, to generate output data (input data are input to an input layer of an RNN, and output data corresponding to the input data are output from an output layer of the RNN).
In addition, Yuuya Sugita and Jun Tani, “Learning Semantic Combinatoriality from the Interaction between Linguistic and Behavioral Processes”, Adaptive Behavior, Vol. 13, No. 1, 33-52 (2005), describes two RNNPBs, one of which learns a pattern of language and the other of which learns a pattern of action. The RNNPBs learn by changing the PBs of the two RNNPBs on the basis of a difference between those PBs, and, after learning, one of the PBs is caused to influence the other PB to generate output data.
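The idea of changing two sets of internal values (contexts or PBs) on the basis of their difference can be sketched as follows. This is a generic illustration under stated assumptions, not the update rule of either cited work: each PB vector is pulled toward a hypothetical target by a quadratic task loss, and a hypothetical `coupling` coefficient adds a term proportional to the difference between the two PB vectors. The names `train_pbs` and `gap` are also hypothetical.

```python
# Generic sketch (assumption): two parameter vectors, each with its own
# quadratic task loss, coupled by a term proportional to their difference.

def train_pbs(target_a, target_b, coupling, steps=500, lr=0.1):
    """Gradient descent on both PB vectors with a difference-based coupling term."""
    pb_a = [0.0] * len(target_a)
    pb_b = [0.0] * len(target_b)
    for _ in range(steps):
        # Task gradient (p - target) plus coupling * (difference to the other PB).
        new_a = [a - lr * ((a - ta) + coupling * (a - b))
                 for a, b, ta in zip(pb_a, pb_b, target_a)]
        new_b = [b - lr * ((b - tb) + coupling * (b - a))
                 for a, b, tb in zip(pb_a, pb_b, target_b)]
        pb_a, pb_b = new_a, new_b
    return pb_a, pb_b

def gap(u, v):
    """Euclidean distance between two PB vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

# Hypothetical targets standing in for a language pattern and an action pattern.
lang_target, action_target = [1.0, 0.0], [0.0, 1.0]

gap_free = gap(*train_pbs(lang_target, action_target, coupling=0.0))
gap_coupled = gap(*train_pbs(lang_target, action_target, coupling=1.0))
print(gap_free, gap_coupled)
```

With the coupling term enabled, the two PB vectors settle closer together than when each is trained alone, which is the sense in which one set of values influences the other in this sketch.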