Intelligent automated assistants (or virtual assistants) provide an intuitive interface between users and electronic devices. These assistants can allow users to interact with devices or systems using natural language in spoken and/or text forms. For example, a user can access the services of an electronic device by providing a spoken user input in natural language form to a virtual assistant associated with the electronic device. The virtual assistant can perform natural language processing on the spoken user input to infer the user's intent and operationalize the user's intent into tasks. The tasks can then be performed by executing one or more functions of the electronic device, and a relevant output can be returned to the user in natural language form.
The acoustic environment in which the virtual assistant operates can affect the virtual assistant's ability to interpret a user's spoken input. For example, background noise, such as music, conversations of other individuals, traffic noise, or the like, can obscure the user's spoken input contained in the audio received by the virtual assistant. As a result, words in the spoken input can be interpreted incorrectly or not at all. Thus, it can be desirable to operate a virtual assistant in an acoustic environment that is conducive to performing speech recognition.