In recent years, many entertainment robots have been marketed for family use. Assume that such an entertainment robot lives with a user in a household. When the user gives the robot a task, for example the command “Kick the ball”, the robot is required not only to perform the task of kicking the ball but also to spontaneously take varied actions, so that communication does not become one-way and everyday interactions with the robot do not bore the user.
Known voice interaction systems are mostly intended to perform a task such as presetting a recording operation of a video recorder or giving guidance for a telephone number (see, for example, Non-Patent Document 1). Other algorithms for generating a response in a dialog include simple response sentence generating systems, of which Eliza is a representative example (see, for example, Non-Patent Document 2).
For an entertainment robot to spontaneously take varied actions, a large body of knowledge and a wide variety of action patterns have to be given to the robot in advance to cover various interactions. In reality, however, the volume of data that can be given to an entertainment robot is limited.
Additionally, preparing data that matches the personality of each robot's user when the robot is built would require tremendous effort. Therefore, many robots are actually given the same knowledge in advance. As a result, the user can hardly feel affinity for the robot or feel that “the robot is unique to him or her”. Nevertheless, each robot that lives in a family is required to take personalized and attractive actions.
Suppose it were possible to build a robot that actively or passively acquires various pieces of information, such as the name, the birthday, and the sex of the user and what the user likes and dislikes, through interactions with the user, and that communicates with the user using the acquired information. Such a robot would be free from the above-identified problems and would satisfy the user.
Additionally, if such a robot can show the user its learning process through conversations with the user, the user can share the learning experience. Furthermore, if the robot speaks of what it has been taught, the user will feel a sense of intimacy with the robot.
Non-Patent Document 1: Information Processing Society, Research Meeting Report, Voice Language Information Processing 22-8 (1998 Jul. 24), pp. 41-42
Non-Patent Document 2: Physicality and Computer, Kyoritsu Publishing, pp. 258-268
However, a number of problems arise when a robot actively or passively acquires various pieces of information that are attributes of a person or a thing (object), such as the name, the birthday, and the sex of the user and what the user likes and dislikes, stores them as memory, and then holds dialogs that utilize that memory, for example a conversation that uses information about the user.
Firstly, there is the problem of how the robot acquires memory that is customized for the user. The memory capacity of a robot is limited and the framework for storing information is defined in advance, so the robot cannot store everything that appears in conversations. Additionally, at the current level of technology it is difficult to process speech that the user makes at unpredictable timings and store it in a memory device.
However, this problem can be solved by adopting a technique in which the robot makes an utterance for acquiring memory, such as “Tell me the name of the friend of ◯◯!”, thereby disclosing what it can memorize and prompting the user to tell it the value. By acting on the user in this way whenever it wants to acquire memory, the robot can collect information with ease.
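The acquisition technique described above can be illustrated by a minimal sketch, assuming a fixed frame of attributes that the robot is able to store. The attribute names, the function names, and the question template are hypothetical and serve only as an illustration. The robot asks about the first unfilled slot, so its question itself discloses what it can memorize.

```python
# Hypothetical fixed frame: the only attributes the robot is able to store.
ATTRIBUTES = ("name", "birthday", "friend")


def make_memory(objects):
    """Create an empty slot (None) for every attribute of every known object."""
    return {obj: {attr: None for attr in ATTRIBUTES} for obj in objects}


def next_question(memory):
    """Return (object, attribute, utterance) for the first unfilled slot.

    Returns None when every slot already holds a value.
    """
    for obj, slots in memory.items():
        for attr, value in slots.items():
            if value is None:
                return obj, attr, f"Tell me the {attr} of {obj}!"
    return None


def store_answer(memory, obj, attr, value):
    """Store the user's answer; anything outside the frame is rejected."""
    if obj not in memory or attr not in memory[obj]:
        raise KeyError(f"cannot memorize {attr!r} of {obj!r}")
    memory[obj][attr] = value
```

Because the robot only asks about slots that exist in its frame, the user's reply arrives at a predictable timing and always fits the predefined storage framework.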
Secondly, there is the problem of how the robot utilizes the acquired memory to make a speech. If the robot speaks by utilizing the acquired memory at random, topics will shift out of context and perplex the user. Therefore, a scheme (association) is needed for utilizing correlated items in the acquired memory for the next speech. Additionally, if the acquired memory is output as it is, each speech can only correspond to one item of the memory on a one-to-one basis, so that interactions will be limited in variety.
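One possible form of such an association scheme is sketched below, assuming that the memory is held as (object, attribute, value) triples. The triple layout, the sentence templates, and the function names are illustrative assumptions, not a prescribed method. Only items that share the current topic are candidates for the next speech, and a template is chosen at random so that one memory item can yield more than one utterance.

```python
import random

# Hypothetical sentence templates; several per item avoid one-to-one output.
TEMPLATES = (
    "The {attr} of {obj} is {value}, right?",
    "I remember that {value} is the {attr} of {obj}.",
)


def associated_items(memory, topic):
    """Return the triples whose object or value matches the current topic."""
    return [item for item in memory if topic in (item[0], item[2])]


def next_utterance(memory, topic, rng):
    """Pick an associated item and phrase it; return (text, new_topic).

    Returns None when nothing in memory is correlated with the topic.
    """
    candidates = associated_items(memory, topic)
    if not candidates:
        return None
    obj, attr, value = rng.choice(candidates)
    text = rng.choice(TEMPLATES).format(obj=obj, attr=attr, value=value)
    # Move the topic to the other end of the triple so the dialog can proceed.
    new_topic = value if obj == topic else obj
    return text, new_topic
```

Passing the random number generator as an argument (for example `random.Random()`) keeps the sketch testable; a real robot could weight the candidates instead of choosing uniformly.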
Thirdly, there is the problem of when the robot should acquire memory and when it should utilize the memory for a conversation. In other words, two states need to be avoided: one in which the robot tries to utilize the memory when no information attached to an object is available, and one in which the robot tries to acquire memory when all the information has already been acquired. A scheme needs to be established to avoid such states. Additionally, if the robot memorizes a thing and immediately afterwards takes an action that utilizes it, the robot may not appear to have a memory device for memorizing things, and its action may not appear intelligent. The problem then arises that the action of the robot may not be entertaining.
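The timing constraints described above can be expressed as a simple eligibility check, sketched here under the assumption that each stored value carries the time at which it was learned. The slot layout, the `min_delay` parameter, and the action labels are hypothetical. An acquiring action is offered only while an unknown slot remains, and a utilizing action only for values learned at least `min_delay` time units ago.

```python
def eligible_actions(slots, now, min_delay):
    """List the actions that make sense for one object at time `now`.

    slots maps an attribute to None (unknown) or to (value, learned_at).
    """
    unknown = [attr for attr, slot in slots.items() if slot is None]
    usable = [
        attr
        for attr, slot in slots.items()
        if slot is not None and now - slot[1] >= min_delay
    ]
    actions = []
    if unknown:  # never ask when everything has already been acquired
        actions.append(("acquire", unknown[0]))
    if usable:  # never talk about a value that is missing or only just learned
        actions.append(("utilize", usable[0]))
    return actions
```

An empty result means that the robot should neither ask nor recall for this object at the moment, which covers the case where every slot is filled but all values were learned too recently.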