Humans may engage in human-to-computer dialogs with interactive software applications referred to herein as “automated assistants” (also referred to as “chatbots,” “interactive personal assistants,” “intelligent personal assistants,” “personal voice assistants,” “conversational agents,” etc.). For example, humans (who, when they interact with automated assistants, may be referred to as “users”) may provide commands, queries, and/or requests (collectively referred to herein as “queries”) using free-form natural language input, which may include vocal utterances that are converted into text and then processed, and/or typed free-form natural language input.
Typically, automated assistants are configured to perform a variety of tasks, e.g., in response to a variety of predetermined canonical commands to which the tasks are mapped. These tasks can include ordering items (e.g., food, products, services, etc.), playing media (e.g., music, videos), modifying a shopping list, performing home control (e.g., controlling a thermostat, controlling one or more lights, etc.), answering questions, booking tickets, and so forth. While natural language analysis and semantic processing enable users to issue slight variations of the canonical commands, these variations may only stray so far before natural language analysis and semantic processing are unable to determine which task to perform. Put simply, task-oriented dialog management, in spite of many advances in natural language and semantic analysis, remains relatively rigid. Additionally, users are often unaware of, or forget, canonical commands, and hence may be unable to invoke automated assistants to perform many of the tasks of which they are capable. Moreover, adding new tasks requires third-party developers to add new canonical commands, and it typically takes time and resources for automated assistants to learn acceptable variations of those canonical commands.
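The rigidity described above can be illustrated with a minimal sketch. It is not drawn from any actual automated assistant: the command table, the task names, the `resolve_task` helper, and the 0.8 threshold are all hypothetical, and simple string similarity from Python's standard `difflib` module stands in for natural language analysis and semantic processing.

```python
from difflib import SequenceMatcher

# Hypothetical canonical commands mapped to hypothetical task names.
CANONICAL_COMMANDS = {
    "turn on the lights": "home_control.lights_on",
    "add milk to my shopping list": "shopping_list.add",
    "play some music": "media.play_music",
}


def resolve_task(query, threshold=0.8):
    """Return the task for the closest canonical command, or None.

    String similarity is an assumed stand-in for semantic processing;
    the threshold value is arbitrary and chosen only for illustration.
    """
    # Normalize case and whitespace before comparing.
    normalized = " ".join(query.lower().split())
    best_task, best_score = None, 0.0
    for command, task in CANONICAL_COMMANDS.items():
        score = SequenceMatcher(None, normalized, command).ratio()
        if score > best_score:
            best_task, best_score = task, score
    # A query that strays too far from every canonical command is unresolved.
    return best_task if best_score >= threshold else None


if __name__ == "__main__":
    for text in ("Turn on the lights", "turn on the light", "it is too dark in here"):
        print(repr(text), "->", resolve_task(text))
```

In this sketch, a slight variation such as "turn on the light" still resolves to the lighting task, whereas a paraphrase with the same intent, such as "it is too dark in here," matches no canonical command closely enough and is left unresolved. Adding a new task would likewise require adding a new entry to the table before any query could invoke it.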