Automatic speech recognition (ASR) technologies enable microphone-equipped computing devices to interpret speech and thereby provide an alternative to conventional human-to-computer input devices such as keyboards or keypads. A typical ASR system includes several basic elements. A microphone and an acoustic interface receive an utterance of a word from a user and digitize the utterance into acoustic data. An acoustic pre-processor parses the acoustic data into information-bearing acoustic features. A decoder uses acoustic models to decode the acoustic features into utterance hypotheses. The decoder generates a confidence value for each hypothesis to reflect the degree to which that hypothesis phonetically matches a subword of the utterance, and selects a best hypothesis for each subword. Using language models, the decoder concatenates the subwords into an output word corresponding to the user-uttered word. In vehicles, users utter requests to an ASR system to control different vehicle devices, or different functions of one of the vehicle devices.
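By way of illustration only, the decoder's selection step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular ASR system: the names `Hypothesis`, `select_best`, and `decode_word` are hypothetical, the confidence values are made-up numbers assumed to lie between 0 and 1, and plain string concatenation stands in for the language-model stage.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical container for one decoder hypothesis."""
    subword: str       # candidate subword text
    confidence: float  # assumed score in [0, 1]; higher means a closer phonetic match


def select_best(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Return the hypothesis with the highest confidence value."""
    return max(hypotheses, key=lambda h: h.confidence)


def decode_word(per_subword: List[List[Hypothesis]]) -> str:
    """Pick the best hypothesis for each subword, then join the subwords.

    Assumption: simple concatenation replaces the language-model step.
    """
    return "".join(select_best(h).subword for h in per_subword)


# Illustrative (invented) hypotheses for a two-subword utterance.
candidates = [
    [Hypothesis("traf", 0.91), Hypothesis("trap", 0.42)],
    [Hypothesis("fic", 0.88), Hypothesis("fig", 0.37)],
]
word = decode_word(candidates)
```

In this sketch, the highest-scoring hypothesis is chosen independently for each subword. A practical decoder would typically weigh the candidates jointly using its language models rather than selecting each subword in isolation.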
One problem encountered with ASR-enabled vehicle function control is that although such a system may correctly decode a user's input speech, it may apply the recognized speech to an unintended vehicle function. In other words, current ASR-enabled vehicle function controls have significant difficulty disambiguating between speech intended for one vehicle function and speech intended for another. For example, a user may say “let me hear some traffic” to have a vehicle radio play music from the 1960s rock band Traffic, but the ASR-enabled vehicle controller may misinterpret the request and have another vehicle device play a roadway traffic report instead. Users of ASR-enabled vehicles can become frustrated when this occurs.