Homes, offices, and public spaces are becoming more wired and connected with the proliferation of computing devices such as notebook computers, tablets, entertainment systems, and portable communication devices. As computing devices evolve, the ways in which users interact with these devices continue to evolve. For example, people can interact with computing devices through mechanical devices (e.g., keyboards, mice, etc.), electrical devices (e.g., touch screens, touch pads, etc.), and optical devices (e.g., motion detectors, cameras, etc.). Another way to interact with computing devices is through audio devices that capture human speech and other sounds using microphones. When a user interacts with a computing device using speech, the computing device may perform automatic speech recognition (ASR) on audio signals generated from sound captured within an environment in order to identify voice commands within the signals.
Traditional text messaging systems remain popular despite their many limitations. For example, text messages take time to compose using a physical or virtual keyboard and require multiple steps to send to a recipient. To transmit a message, a user typically has to open an application, select a recipient, draft a message for the recipient, and then select a send command to cause transmission of the message. These steps may be difficult to perform when a user is moving or performing other tasks at the same time.
Voice communications are often limited to synchronous communications, which occur in real time, and to voicemail communications. When using conventional voicemail communications, users often spend an unnecessarily long time retrieving a stored voicemail message because a user typically has to call a voicemail service and then listen to instructions, commands, or other announcements prior to listening to the voicemail. Thus, exchanging messages using voicemail is often impractical and disfavored by many users due to the amount of time required to retrieve the voicemail. In addition, the voicemail typically does not include any additional information that may help a user understand a context of the message.