Spoken language understanding (SLU) may be used in a variety of systems that attempt to understand users' queries and other user input. For instance, SLU may be an important module of a dialog system that attempts to understand users' utterances. Given an utterance, SLU may be used to extract a semantic frame that represents the utterance's intent and semantic slots.
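As an illustration only, a semantic frame of the kind described above might be represented as follows. This is a minimal sketch: the class name `SemanticFrame`, the intent label `play_music`, and the slot names `track` and `artist` are hypothetical and are not taken from any particular SLU system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SemanticFrame:
    """Hypothetical container for one utterance's intent and semantic slots."""

    intent: str
    # Maps a slot name to the text span of the utterance that fills it.
    slots: Dict[str, str] = field(default_factory=dict)


# Illustrative utterance (an assumption): "play thriller by michael jackson"
frame = SemanticFrame(
    intent="play_music",
    slots={"track": "thriller", "artist": "michael jackson"},
)
```

In this sketch the intent is a single label for the whole utterance, while each slot ties a semantic role to a span of the user's words.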
SLU models are often trained on domain-specific queries with semantic annotations. Various features, including n-grams, rules, dictionaries, etc., may be used to train SLU models. The same set of features may also be extracted at run time for semantic decoding.
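The point that the same features are extracted for training and for run-time decoding can be sketched with a single shared extractor. The function name `ngram_features`, the feature string format, and the example query are assumptions made for illustration; only n-gram features are shown, and the model itself is omitted.

```python
from typing import List


def ngram_features(tokens: List[str], index: int, max_n: int = 2) -> List[str]:
    """Return the n-grams (n = 1..max_n) that end at tokens[index]."""
    features = []
    for n in range(1, max_n + 1):
        start = index - n + 1
        if start < 0:
            # Not enough left context for an n-gram of this length.
            break
        features.append(f"{n}gram=" + " ".join(tokens[start:index + 1]))
    return features


# Hypothetical query; the same function serves both phases.
tokens = "show me action movies".split()
train_features = [ngram_features(tokens, i) for i in range(len(tokens))]
decode_features = [ngram_features(tokens, i) for i in range(len(tokens))]
```

Calling one function from both code paths is one simple way to keep the training-time and run-time feature sets from drifting apart.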
A dictionary used by an SLU model includes entities that belong to the same entity class (e.g., movie names, music tracks, etc.). Because it is difficult to obtain enough training data to cover all semantic slots in a domain, such as hundreds of thousands of movie names and music tracks, dictionaries may be used to increase model coverage and improve the model's performance. Experiments show that large and clean dictionaries are effective at improving a model's accuracy. The impact is more dramatic when the test data differ substantially from the training data, in which case contextual features such as n-grams are not sufficient.
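One common way to turn such dictionaries into model features is to mark each token that falls inside a dictionary entry. The sketch below is an assumption, not a method stated in this description: it uses greedy longest-match lookup and B-/I- prefixes to mark the beginning and inside of a match, and the function name `dictionary_features`, the entity class names, and the dictionary contents are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def dictionary_features(
    tokens: List[str],
    dictionaries: Dict[str, Set[Tuple[str, ...]]],
    max_len: int = 5,
) -> List[List[str]]:
    """Tag each token with the entity classes whose dictionary entries cover it."""
    features: List[List[str]] = [[] for _ in tokens]
    for entity_class, entries in dictionaries.items():
        i = 0
        while i < len(tokens):
            matched = 0
            # Try the longest candidate span first (greedy longest match).
            for length in range(min(max_len, len(tokens) - i), 0, -1):
                if tuple(tokens[i:i + length]) in entries:
                    matched = length
                    break
            if matched:
                features[i].append(f"B-{entity_class}")
                for j in range(i + 1, i + matched):
                    features[j].append(f"I-{entity_class}")
                i += matched
            else:
                i += 1
    return features


# Tiny illustrative dictionaries; real ones may hold hundreds of thousands of entries.
dictionaries = {
    "movie_name": {("the", "matrix"), ("inception",)},
    "music_track": {("thriller",)},
}
tokens = "play the matrix tonight".split()
feats = dictionary_features(tokens, dictionaries)
```

Because the lookup depends only on the dictionary contents, a movie name never seen in the training queries can still fire the `movie_name` feature, which is how dictionaries can extend coverage beyond what n-gram context provides.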