Spoken language understanding systems typically include one or more models that they use to process input. For example, automatic speech recognition systems typically include an acoustic model and a language model. The acoustic model is used to generate hypotheses regarding which words or subword units (e.g., phonemes) correspond to an utterance based on the acoustic features of the utterance. The language model is used to determine which of the hypotheses generated using the acoustic model is the most likely transcription of the utterance based on lexical features of the language in which the utterance is spoken.
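The interplay between the two models described above can be illustrated with a minimal sketch. It assumes each acoustic hypothesis arrives with a log score and that the language model exposes a log-probability function; the names `pick_transcription`, `toy_lm_log_prob`, and `lm_weight`, as well as all scores, are hypothetical and chosen only for illustration, not taken from any particular system.

```python
import math

def pick_transcription(acoustic_hypotheses, lm_log_prob, lm_weight=1.0):
    """Return the hypothesis text with the highest combined log score.

    acoustic_hypotheses: list of (text, acoustic_log_score) pairs.
    lm_log_prob: function mapping text to a language-model log probability.
    lm_weight: hypothetical scaling factor applied to the language-model score.
    """
    best_text, _ = max(
        acoustic_hypotheses,
        key=lambda h: h[1] + lm_weight * lm_log_prob(h[0]),
    )
    return best_text

# Toy language model: made-up probabilities for two candidate transcriptions.
TOY_LM = {
    "recognize speech": math.log(0.9),
    "wreck a nice beach": math.log(0.1),
}

def toy_lm_log_prob(text):
    # Unseen text gets a small floor probability (an assumption of this sketch).
    return TOY_LM.get(text, math.log(1e-6))

# Made-up acoustic scores: the acoustic model alone slightly prefers the wrong text.
hypotheses = [
    ("wreck a nice beach", math.log(0.55)),
    ("recognize speech", math.log(0.45)),
]

print(pick_transcription(hypotheses, toy_lm_log_prob))
```

In this toy setup the acoustic scores alone favor the first hypothesis, while adding the language-model score shifts the choice to the second. Setting `lm_weight=0.0` reproduces the acoustic-only ranking.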
Acoustic models, language models, natural language understanding models, and other models used in spoken language understanding (together referred to as spoken language understanding models) may be specialized or customized to varying degrees. For example, an automatic speech recognition system may have a general or base model that is not customized in any particular manner, and any number of additional models for particular genders, age ranges, regional accents, or any combination thereof. Some systems may have models for specific subject matter (e.g., medical terminology) or even specific users.
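One way to picture a base model alongside specialized models is a lookup that prefers the most specific model matching what is known about the speaker and otherwise falls back to the base model. The sketch below is only an illustration under that assumption; the function `select_model`, the trait names, and the model labels are all hypothetical.

```python
def select_model(models, speaker_traits):
    """Pick the most specific model whose required traits all match the speaker.

    models: dict mapping a frozenset of (trait, value) pairs to a model label.
            The empty frozenset is the base (uncustomized) model.
    speaker_traits: dict of known traits, e.g. {"gender": "female"}.
    """
    known = set(speaker_traits.items())
    best_key = frozenset()  # start from the base model
    for key in models:
        # A model qualifies if every trait it requires is known for the speaker;
        # among qualifying models, prefer the one requiring the most traits.
        # Ties keep the earlier entry in the dict.
        if key <= known and len(key) > len(best_key):
            best_key = key
    return models[best_key]

# Hypothetical model inventory: a base model plus two specialized models.
MODELS = {
    frozenset(): "base",
    frozenset({("gender", "female")}): "female",
    frozenset({("gender", "female"), ("accent", "southern_us")}): "female_southern_us",
}

print(select_model(MODELS, {"gender": "female", "accent": "southern_us"}))
print(select_model(MODELS, {"gender": "male"}))
```

A speaker matching both traits gets the combined model, a speaker matching only one gets the single-trait model, and a speaker matching none gets the base model.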