This specification relates to speech recognition.
Speech recognition may be used for voice search queries. In general, a search query includes one or more query terms that a user submits to a search engine when the user requests that the search engine execute a search. Among other approaches, a user may enter query terms of a search query by typing on a keyboard or, in the context of a voice query, by speaking the query terms into a microphone of, for example, a mobile device.
When a user submits a voice query through, for example, a mobile device, the microphone of the mobile device may record ambient noises or sounds, otherwise referred to as “environmental audio” or “background audio,” in addition to spoken utterances of the user. For example, environmental audio may include background chatter or babble of other people situated around the user, or noises generated by nature (e.g., dogs barking) or man-made objects (e.g., office, airport, or road noise, or construction activity). The environmental audio may partially obscure the voice of the user, making it difficult for an automated speech recognition (“ASR”) engine to accurately recognize spoken utterances.
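One common way to quantify how strongly environmental audio obscures a spoken utterance is the signal-to-noise ratio (SNR). The following is a minimal illustrative sketch, not part of the specification: the function names `rms`, `snr_db`, and `mix`, the 16 kHz sample rate, and the synthetic sine-wave stand-ins for speech and background audio are all assumptions chosen for illustration.

```python
import math

RATE = 16000  # assumed sample rate in Hz (illustrative only)


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels, from the RMS amplitudes."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


def mix(speech, noise):
    """What a single microphone records: the sample-wise sum of both sources."""
    return [s + n for s, n in zip(speech, noise)]


def tone(amplitude, freq_hz, seconds=1.0):
    """Synthetic sine tone used here as a stand-in for real audio."""
    n = int(RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / RATE) for t in range(n)]


# Hypothetical stand-ins: a "voice" tone and a quieter "background" tone.
speech = tone(0.5, 220)
quiet_noise = tone(0.05, 1000)
loud_noise = tone(0.5, 1000)

recorded = mix(speech, loud_noise)  # the signal an ASR engine would receive

print(f"quiet background: {snr_db(speech, quiet_noise):.1f} dB")
print(f"loud background:  {snr_db(speech, loud_noise):.1f} dB")
```

In this sketch, a lower SNR corresponds to background audio that is closer in level to the utterance, which is the condition under which recognition typically becomes harder. Real environmental audio is not a pure tone, so the numbers here only illustrate the measure itself.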