1. Field of the Invention
The present invention relates to a learning control apparatus, a learning control method, and a computer program. More particularly, the present invention relates to a learning control apparatus, a learning control method, and a computer program for an autonomous learning agent having multiple dimensional variables (sensor input, internal state, and motor output), the autonomous learning agent estimating a causal relationship between variables based on learning of a predictor, and automatically determining the number of controllers, functions, and input and output variables based on the estimated causal relationship in order to automate modularization and layering of the controllers.
2. Description of the Related Art
The structure of an autonomous agent using reinforcement learning is disclosed by Richard S. Sutton and Andrew G. Barto in the book entitled “Reinforcement Learning: An Introduction”, MIT Press, 1998. An experience-reinforced autonomous agent is disclosed by Jun Tani in the paper entitled “Learning to generate articulated behavior through the bottom-up and the top-down interaction processes”, Neural Networks, Vol. 16, No. 1, pp. 11-23, 2003. In these disclosed techniques, the input and output variables of a learner are manually selected by humans, who take into consideration the task to be solved and the expected behavior.
As for a multi-degree-of-freedom agent, if the task and the input and output variables are fixed during the design phase, the learning capability of the agent is limited from the design phase onward. The known techniques are thus subject to a serious problem in the construction of an open-ended autonomous agent.
If all conceivable sensor and motor variables are used as inputs and outputs to avoid the manual selection problem, performance on each individual task and expected behavior is degraded. This is well known as the curse of dimensionality in the field of machine learning (as disclosed by R. E. Bellman in the book entitled “Dynamic Programming”, Princeton University Press, Princeton. 6: 679-684).
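The growth behind the curse of dimensionality can be illustrated with a minimal sketch. It assumes, purely for illustration, that every variable is discretized into the same fixed number of bins and that the learner must cover every joint combination. The function name `joint_state_count` and the bin count of 10 are hypothetical and are not taken from the cited techniques.

```python
def joint_state_count(num_variables: int, bins_per_variable: int) -> int:
    """Return the number of joint states when each of `num_variables`
    variables is discretized into `bins_per_variable` bins.

    Illustrative assumption: a learner that uses all variables at once
    must cover every combination, so the count grows exponentially
    with the number of variables.
    """
    if num_variables < 0 or bins_per_variable < 1:
        raise ValueError("expected num_variables >= 0 and bins_per_variable >= 1")
    return bins_per_variable ** num_variables


if __name__ == "__main__":
    # Hypothetical bin count; each added variable multiplies the total by 10.
    for n in (2, 4, 8, 16):
        print(f"{n} variables -> {joint_state_count(n, 10)} joint states")
```

Under these assumptions, a learner restricted to a few relevant variables faces a far smaller space than one given every sensor and motor variable at once.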
To overcome this problem, an autonomous agent is segmented into a plurality of functional modules for learning. However, this process leads to two new problems that do not arise when learning is performed with a single functional module.
A first problem is how to determine the number of functional modules and the degree of freedom of each module (a quantity that determines how complex a structure one module is allowed to have). A second problem is how to determine the links between modules.
MOSAIC, disclosed in Japanese Unexamined Patent Application Publication No. 2000-35804, is of interest in terms of function modularization. However, each module needs to handle all variables as inputs and outputs, and MOSAIC thus fails to overcome the two new problems. To overcome them, humans need to design beforehand the functions of the modules and the links between them. In this respect, MOSAIC remains unchanged from the design approach used in the known autonomous agents.