Humans may engage in human-to-computer dialogs with interactive software applications referred to herein as “automated assistants” (also referred to as “chat bots,” “interactive personal assistants,” “intelligent personal assistants,” “personal voice assistants,” “conversational agents,” etc.). For example, humans (which when they interact with automated assistants may be referred to as “users”) may provide commands, queries, and/or requests using spoken natural language input (i.e., utterances), which may in some cases be converted into text and then processed, and/or using textual (e.g., typed) natural language input.
In some cases, automated assistants may include automated assistant “clients” that are installed locally on client devices and that are engaged directly by users, as well as cloud-based counterpart(s) that leverage the virtually limitless resources of the cloud to help automated assistant clients respond to users' queries. For example, the automated assistant client may provide, to the cloud-based counterpart(s), an audio recording of the user's query (or a text conversion thereof) and data indicative of the user's identity (e.g., credentials). The cloud-based counterpart may perform various processing on the query to return various results to the automated assistant client, which may then provide corresponding output to the user. For the sake of brevity and simplicity, the term “automated assistant,” when described herein as “serving” a particular user, may refer to the automated assistant client installed on the particular user's client device and any cloud-based counterpart that interacts with the automated assistant client to respond to the user's queries.
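By way of non-limiting illustration, the client/cloud exchange described above might be sketched as follows. This is a minimal sketch only: the class names (AssistantRequest, AssistantResponse, CloudCounterpart, AssistantClient), their fields, and the placeholder processing are hypothetical and are not drawn from any particular implementation.

```python
from dataclasses import dataclass


@dataclass
class AssistantRequest:
    """Hypothetical payload sent from the client to the cloud counterpart."""
    query_text: str   # text conversion of the user's spoken or typed query
    credentials: str  # data indicative of the user's identity


@dataclass
class AssistantResponse:
    """Hypothetical result returned to the client."""
    output_text: str


class CloudCounterpart:
    """Stand-in for the cloud-based counterpart (assumption: in-process call)."""

    def process(self, request: AssistantRequest) -> AssistantResponse:
        # A real service might perform speech recognition, intent matching,
        # and fulfillment here; this placeholder merely echoes the query.
        return AssistantResponse(output_text=f"Result for '{request.query_text}'")


class AssistantClient:
    """Stand-in for the automated assistant client on a client device."""

    def __init__(self, credentials: str, cloud: CloudCounterpart) -> None:
        self.credentials = credentials
        self.cloud = cloud

    def handle_query(self, query_text: str) -> str:
        # Bundle the query with identity data and forward it to the cloud.
        request = AssistantRequest(query_text=query_text, credentials=self.credentials)
        response = self.cloud.process(request)
        # Provide output corresponding to the cloud's result to the user.
        return response.output_text
```

In this sketch, the client and its cloud counterpart together play the role of the “automated assistant” that serves a particular user.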
Many users may engage automated assistants using multiple devices. For example, some users may possess a coordinated “ecosystem” of computing devices that includes one or more smart phones, one or more tablet computers, one or more vehicle computing systems, one or more wearable computing devices, one or more smart televisions, and/or one or more standalone interactive speakers, among other more traditional computing devices. A user may engage in human-to-computer dialog with an automated assistant using any of these devices (assuming an automated assistant client is installed). In some cases these devices may be scattered around the user's home or workplace. For example, mobile computing devices such as smart phones, tablets, smart watches, etc., may be on the user's person and/or wherever the user last placed them (e.g., at a charging station). Other computing devices, such as traditional desktop computers, smart televisions, and standalone interactive speakers may be more stationary but nonetheless may be located at various places (e.g., rooms) within the user's home or workplace.
Techniques exist to enable multiple users (e.g., a family, co-workers, co-inhabitants, etc.) to leverage the distributed nature of a plurality of computing devices to facilitate intercom-style spoken communication between the multiple users. However, these techniques are limited to users issuing explicit commands to convey messages to explicitly defined computing devices. For example, a first user who wishes to convey a message to a second user at another location out of earshot (e.g., in another room) must first determine where the second user is located. Only then can the first user explicitly invoke an intercom communication channel to a computing device at or near the second user's location, so that the first user can convey a message to the second user at the second user's location. If the first user does not know the second user's location, the first user may be forced to simply cause the message to be broadcast at all computing devices that are available for intercom-style communication. Moreover, if the first user is unaware that the second user is not within earshot (e.g., the first user is cooking and did not notice the second user leaving the kitchen), the first user may not realize that intercom-style communication is necessary, and may speak the message to an empty room.
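The limitation described above may be illustrated with a minimal sketch of explicit intercom routing. The function name route_intercom_message, the room-to-device mapping, and the device identifiers are hypothetical and serve only to show that the speaker must either name a target location or fall back to broadcasting.

```python
from typing import Dict, List, Optional, Tuple


def route_intercom_message(
    message: str,
    devices_by_room: Dict[str, str],
    target_room: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return (device_id, message) pairs for the devices that would play the message.

    Assumption: each room maps to exactly one intercom-capable device.
    """
    if target_room is not None:
        # The first user already knows where the second user is and names
        # that location explicitly.
        if target_room not in devices_by_room:
            raise KeyError(f"No intercom-capable device in room: {target_room}")
        return [(devices_by_room[target_room], message)]
    # The second user's location is unknown, so the only option is to
    # broadcast the message at every available device.
    return [(device_id, message) for device_id in sorted(devices_by_room.values())]
```

For instance, with devices in a kitchen and a den, naming the den yields a single target device, whereas omitting the room causes the message to be output at both devices. Neither path helps a speaker who does not realize that the intended listener has left the room.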