The present invention relates to text and speech classification. More specifically, the present invention relates to an enhancement of language models to improve classification accuracy.
Natural language understanding involves using a computer to determine the meaning of text or speech generated by a user. One step in determining the meaning of natural language input is to classify the input into one of a set of predetermined classes. For example, a specific input such as “I want to book a flight to Rome” could be classified into a Travel Arrangements class. An application dedicated to this class could then be invoked to decipher further information from the input and execute the user goal represented by the input.
Such classification is a well-defined problem in natural language processing. Specific examples of practical applications include call routing for automated call centers and natural language based help systems.
Classifiers are used to perform this classification. Common examples include statistical classifiers such as n-gram, Naive Bayes and Maximum Entropy classifiers. In n-gram classifiers, statistical language models are used to assign natural language word strings (i.e., sentences) to a class. Specifically, a separate n-gram language model is constructed for each class. At run-time, the language models are applied in parallel, each assigning a probability to a given test word string or speech utterance. The class whose language model assigns the highest probability to the test word string or utterance is designated as the class to which the string or utterance belongs. The class assignment need not be one-to-one. The test sentence or utterance can instead be assigned to a set of N-best class candidates, ranked by the probability that each class receives given the test string or utterance. For speech classification, n-gram classifiers have the advantage that they can be used in a one-pass scenario in which speech utterance recognition and classification are integrated.
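The decision rule described above can be written compactly. The following is an illustrative formalization only: the symbols W (the test word string w_1 ... w_m), C (a class), n (the n-gram order) and the class prior P(C) are notation assumed here, not taken from the source. If the prior is taken to be uniform, the rule reduces to choosing the class whose language model gives the string the highest probability.

```latex
\hat{C} \;=\; \arg\max_{C} P(C \mid W)
        \;=\; \arg\max_{C} P(C)\, P(W \mid C),
\qquad
P(W \mid C) \;=\; \prod_{i=1}^{m} P\bigl(w_i \mid w_{i-n+1}, \ldots, w_{i-1},\, C\bigr)
```

An N-best assignment keeps the N classes with the largest values of P(C) P(W | C) rather than only the maximizer.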
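A straightforward way of training an n-gram classifier is to train the language model for each class separately using maximum likelihood (ML) estimation. Such training schemes are easy to implement, but each model is fit only to the data of its own class, so the training does not directly optimize how well the models discriminate between classes. As a result, they produce a classifier with limited classification accuracy.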
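The separate-model training scheme and the highest-probability class assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: it assumes bigram models, whitespace tokenization, a uniform class prior, and add-one smoothing on top of the relative-frequency (ML) counts so that unseen word pairs do not receive zero probability. All names (`train`, `log_prob`, `classify`, `TRAIN`) and the toy sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # sentence boundary markers (assumed convention)

def bigrams(sentence):
    """Lowercase, split on whitespace, and return adjacent word pairs."""
    tokens = [BOS] + sentence.lower().split() + [EOS]
    return list(zip(tokens, tokens[1:]))

def train(labeled_sentences):
    """Collect bigram counts separately for each class.

    Each class's counts depend only on that class's own sentences,
    mirroring the per-class training described in the text.
    """
    pair_counts = defaultdict(Counter)     # class -> counts of (prev, word)
    history_counts = defaultdict(Counter)  # class -> counts of prev
    vocab = set()
    for sentence, label in labeled_sentences:
        for prev, word in bigrams(sentence):
            pair_counts[label][(prev, word)] += 1
            history_counts[label][prev] += 1
            vocab.add(word)
    return pair_counts, history_counts, vocab

def log_prob(sentence, label, model):
    """Log-probability of the sentence under one class's bigram model."""
    pair_counts, history_counts, vocab = model
    total = 0.0
    for prev, word in bigrams(sentence):
        # Relative frequency with add-one smoothing (an assumption here).
        numerator = pair_counts[label][(prev, word)] + 1
        denominator = history_counts[label][prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total

def classify(sentence, model, n_best=1):
    """Score the sentence with every class model; return the top n_best classes."""
    labels = list(model[0].keys())
    scored = sorted(((log_prob(sentence, c, model), c) for c in labels),
                    reverse=True)
    return [c for _, c in scored[:n_best]]

# Toy training data, invented purely for illustration.
TRAIN = [
    ("I want to book a flight to Rome", "Travel"),
    ("Book a hotel in Paris", "Travel"),
    ("What is my account balance", "Banking"),
    ("Transfer money to my savings account", "Banking"),
]
model = train(TRAIN)
print(classify("I want to book a flight to Paris", model))
print(classify("book a flight", model, n_best=2))
```

Because each class's counts are accumulated without reference to the other classes, nothing in this training step rewards a model for scoring its own class's sentences higher than competing models do, which is the limitation noted above.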