The present invention relates to training classifiers. In particular, the present invention relates to training classifiers to classify natural language text.
Speech utterance classification and text classification are well-defined problems that have arisen with the growing trend of providing natural language user interfaces to automated systems. Applications include, among many others, call routing for automated call centers, natural language based help systems, and application interfaces.
For example, suppose a prospective traveler wishes to access a travel information system. Such a traveler provides a spoken or written query such as, “Show me all the flights from Seattle to DC this month.” The travel information system then attempts to classify the query into one of a finite number of possible classes such as flight information, ground transportation information, hotel information, special meal requests, etc., for routing to an appropriate customer service representative.
In another situation, a computerized help desk receives emails and/or telephone calls from employees such as “How do I enable voting buttons on my email?” or “My hard drive died.” Such a computerized help desk classifies the incoming emails and/or telephone calls into a number of possible classes such as email, operating systems, printer connectivity, web browsing, and remote access problems in order to route the communications to an appropriate technician.
A statistical classification approach (e.g. n-gram, Naïve Bayes, or maximum entropy) to the problem is the most common and most successful approach used so far. It deals gracefully with the irregularities of natural language, and its parameters can usually be estimated in a data-driven fashion, without expensive human authoring.
One such state-of-the-art statistical approach is the Naïve Bayes classifier, whereby a sentence or sequence of words is classified as belonging to one of a number of classes. The model relies on parameters whose values need to be estimated from annotated training data in order to assign a probability to a given sequence of words or sentence. Standard methods of estimating Naïve Bayes classifiers include using maximum likelihood techniques, which estimate the model parameters for each class independently.
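The per-class maximum likelihood estimation described above can be sketched as follows. This is a minimal illustration, not the classifier of the invention; the training queries and class labels are hypothetical, and add-one smoothing is included at classification time (a common supplement to pure maximum likelihood) so that unseen words do not zero out a class.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(examples):
    """Maximum likelihood estimates, computed independently per class,
    from (word_list, class_label) training pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in examples:
        class_counts[label] += 1
        for w in words:
            word_counts[label][w] += 1
            vocab.add(w)
    total = sum(class_counts.values())
    priors = {c: n / total for c, n in class_counts.items()}
    return priors, word_counts, vocab

def classify(words, priors, word_counts, vocab):
    """Return the class maximizing log P(c) + sum_w log P(w | c).
    Add-one smoothing keeps unseen words from producing log(0)."""
    best_class, best_score = None, float("-inf")
    v = len(vocab)
    for c, prior in priors.items():
        n_c = sum(word_counts[c].values())
        score = math.log(prior)
        for w in words:
            score += math.log((word_counts[c][w] + 1) / (n_c + v))
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical annotated training queries with routing classes.
training = [
    (["show", "flights", "from", "seattle"], "flight_information"),
    (["flights", "to", "dc", "this", "month"], "flight_information"),
    (["book", "a", "hotel", "room"], "hotel_information"),
    (["hotel", "near", "seattle", "airport"], "hotel_information"),
]
priors, word_counts, vocab = train_naive_bayes(training)
```

A new query such as ["flights", "seattle"] is then scored against each class and routed to the highest-scoring one.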
In the speech utterance classification process, it is possible to use a two-pass system where, in a first pass, speech input is converted to text, such as with a conventional speech recognizer. Then in a second pass, the text is classified into a class or category. However, it is also possible to use a one-pass classification system where input speech is directly classified.
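The two-pass arrangement above can be sketched as a simple pipeline. Both functions here are hypothetical stand-ins: `recognize` takes the place of a conventional speech recognizer, and the toy keyword classifier takes the place of any second-pass text classifier.

```python
def recognize(audio):
    """Hypothetical stand-in for a conventional speech recognizer;
    a real first pass would decode the audio signal into text."""
    return audio["transcript"]

def two_pass_classify(audio, classify_text):
    text = recognize(audio)       # first pass: speech -> text
    return classify_text(text)    # second pass: text -> class

# Toy second-pass classifier (hypothetical), keyed on a single word.
label = two_pass_classify(
    {"transcript": "show me all the flights from seattle to dc"},
    lambda text: "flight_information" if "flights" in text else "other",
)
```

A one-pass system would instead map the speech input to a class directly, without an intermediate text transcript.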
Generally, different approaches (e.g. n-gram, Naïve Bayes, and maximum entropy) are used for one-pass and two-pass classification systems. Approaches used in one-pass systems can generally also be used in two-pass systems; however, the converse is not true: approaches used in a two-pass system do not necessarily work in a one-pass system. The Naïve Bayes approach is generally used for two-pass systems.
An improved method of classifying speech utterances and text in automated systems having natural language user interfaces would have significant utility.