Humanoid robots, for example, robots having human characteristics, represent a major step in applying autonomous machine technology toward assisting persons in the home or office. Potential applications encompass a myriad of daily activities, such as attending to infants and responding to queries and calls for assistance. Indoor humanoid robots may be expected to perform common household chores, such as making coffee, washing clothes, and cleaning a spill. Additional applications may include assisting elderly and handicapped persons. Humanoid robots will be expected to perform such tasks so as to satisfy the perceived desires and requests of their users. Through visual and voice recognition techniques, robots may be able to recognize and greet their users by name. In addition, robots should be able to learn through human interaction and other methods. In order to meet these needs, such robots must possess the requisite knowledge.
In particular, to accomplish an indoor task, an autonomous system such as a humanoid robot needs a plan with steps. It is desirable that the robot be able to derive such a plan dynamically, that is, taking into consideration aspects of the actual indoor environment when the task is to be performed. It is also desirable to derive plans based on “common sense” knowledge. For example, steps for executing common household or office tasks could be collected from non-expert human volunteers using distributed capture techniques.
Human actions have been analyzed on a variety of levels. At the most basic level, execution of an action can be represented by a motor response schema of sensory-motor mapping. A description of this can be found in R. A. Schmidt, A Schema Theory of Discrete Motor Skill Learning, Psychological Review, 82(4):225-260, 1975, which is incorporated herein by reference in its entirety. At the most abstract level, concepts such as scripts and Memory Organization Packets (MOPs) have been proposed to represent the organization of well-learned activities such as going to a restaurant or visiting a doctor for surgery. A description of this can be found in R. C. Schank and R. Abelson, Scripts, Plans, Goals and Understanding, Lawrence Erlbaum Associates Ltd., Hove, UK, 1977, and in R. C. Schank, Dynamic Memory: A Theory of Reminding and Learning in Computers and People, Cambridge University Press, Cambridge, 1982, both of which are incorporated by reference herein in their entirety. When a MOP is activated, only one step is generally carried out at a time, but steps can sometimes be combined with other activities. For example, one can read while waiting at a doctor's office.
Between these extremes lies a range of practical, well-learned activities such as making breakfast, cleaning one's teeth, and dressing. At this mid-level, Cooper and Shallice presented a computational model for selection of steps for routine tasks based on competitive activation within a hierarchically organized network of action schemas. Their activation model for sequential step selection was based on the Contention Scheduling theory of Norman and Shallice. A description of this can be found in Richard Cooper and Tim Shallice, Contention Scheduling and the Control of Routine Activities, Cognitive Neuropsychology, 17(4):297-338, 2000, and in D. Norman and T. Shallice, Attention to Action: Willed and Automatic Control of Behavior, pages 1-18, Plenum Press, New York, 1980, both of which are incorporated by reference herein in their entirety. The Cooper-Shallice model was demonstrated for the routine task of preparing coffee. Under normal functioning, the model was able to generate a sequence of simple actions (pick up spoon, dip spoon in sugar bowl, etc.) culminating in a drinkable cup of coffee.
In contrast, work in artificial intelligence (AI) planning falls under the category of goal-controlled exploratory behavior. Attempts are made to reach the goal using knowledge of different plans, and a successful sequence is selected. Such planning is important during execution of tasks. A description of this can be found in Daniel S. Weld, Recent Advances in AI Planning, AI Magazine, 20(2):93-123, Summer 1999, which is incorporated by reference herein in its entirety.
One conventional approach has utilized expert systems to encode the steps for accomplishing a task algorithmically. A key component was the capture of human expert knowledge using a laborious manual process. A description of this can be found in D. A. Waterman, A Guide to Expert Systems, Addison-Wesley, 1986, which is hereby incorporated by reference in its entirety. A disadvantage of this approach is that not everything that humans learn is taught by experts. Most day-to-day activities, e.g., tying shoelaces, are learned by observing and interacting with non-experts.
According to Rasmussen et al., human activity in such routine tasks is goal-oriented and controlled by a set of proven rules. The sequence of task steps is typically derived empirically, communicated from another person's know-how, or taken from a “cookbook” sequence. A description of this can be found in Jens Rasmussen, Skills, Rules and Knowledge: Signals, Signs, and Symbols, and Other Distinctions in Human Performance Models, IEEE Transactions on Systems, Man and Cybernetics, SMC-13(3):257-266, May/June 1983.
One source of common sense knowledge is the World Wide Web (“web”). For instance, websites such as eHow.com list the steps to perform activities. Intel Corporation developed a system called Probabilistic Activity Toolkit (PROACT) to build activity models. The system automatically identified activities by observing the objects involved in each activity. It also derived the relevance of various terms to a given activity from the web. For instance, the word “cup” is highly related to the activity of making tea because “cup” occurs frequently on web pages about making tea. A description of this can be found in Matthai Philipose, Kenneth P. Fishkin, Mike Perkowitz, Donald Patterson, and Dirk Haehnel, The Probabilistic Activity Toolkit: Towards Enabling Activity-Aware Computer Interfaces, Technical Report IRS-TR-03-013, Intel Research Laboratories, November 2003, and in Mike Perkowitz, Matthai Philipose, Kenneth Fishkin, and Donald J. Patterson, Mining Models of Human Activities from the Web, Proceedings of the 13th Conference on World Wide Web, pages 573-582, ACM Press, 2004, both of which are incorporated by reference herein in their entirety.
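The idea that a term is related to an activity because it occurs frequently on pages about that activity can be illustrated with a minimal sketch. This is not the PROACT method; it simply assumes relevance is measured as the fraction of pages containing the term, and the function name `term_relevance` and the sample page texts are hypothetical.

```python
def term_relevance(term, pages):
    """Return the fraction of pages that contain `term` as a whole word.

    Assumption: relevance is approximated by simple document frequency.
    The cited system may compute relevance differently.
    """
    if not pages:
        return 0.0
    term = term.lower()
    # Count pages in which the term appears as a whitespace-delimited token.
    hits = sum(1 for page in pages if term in page.lower().split())
    return hits / len(pages)


# Hypothetical stand-ins for the text of web pages about making tea.
tea_pages = [
    "boil water and pour it into a cup over the tea bag",
    "warm the cup then steep the tea for three minutes",
    "add milk and sugar to taste",
]

cup_score = term_relevance("cup", tea_pages)      # two of three pages mention "cup"
spoon_score = term_relevance("spoon", tea_pages)  # no page mentions "spoon"
```

Under this assumption, a term such as “cup” scores higher for making tea than a term that never appears on the sampled pages.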
Using the web as an open information source for building plans for tasks is very attractive. However, the extracted knowledge exhibits high variance and “noise”, e.g., extraneous or erroneous information, and documents may be prohibitively large. An alternative is a distributed information source such as the Open Mind Indoor Common Sense (OMICS) database. In compiling this database, volunteers are prompted with household tasks and asked to provide steps to accomplish them. A description of this can be found in R. Gupta and M. Kochenderfer, Common Sense Data Acquisition for Indoor Mobile Robots, Nineteenth National Conference on Artificial Intelligence (AAAI-04), Jul. 25-29, 2004. However, even with this approach, semantic information must be extracted from the steps provided.
From the above, there is a need for a practical system and method for building plans from distributed knowledge for enabling autonomous machines such as humanoid robots to perform tasks in constrained environments such as indoor environments.