Spoken language understanding systems have been deployed in numerous applications which require some sort of interaction between humans and machines. Most of the time, the interaction is controlled by the machine, which asks questions of the users, attempts to identify the intended meaning from their answers (expressed in natural language), and then takes action in response to the extracted meaning.
One important class of applications employs Natural Language Understanding (NLU) technology for a type of semantic classification known as “call routing,” whose goal is to semantically classify a telephone query from a customer to route it to an appropriate set of service agents based on a brief spoken description of the customer's reason for the call. Call routing systems reduce queue time and call duration, thereby saving money and improving customer satisfaction by promptly connecting the customer to the right service representative in large call centers.
Determining a semantic classification for a human utterance in a call routing system is typically a five-step process as illustrated by FIG. 1. Input speech from the caller is translated into a text string by an Automated Speech Recognition (ASR) Module 101. The ASR text is passed to an NLU semantic classification component known as a Statistical Router 102. The Statistical Router 102 models the NLU task as a statistical classification problem in which the ASR text corresponding to an utterance is assigned to one or more of a set of predefined user intents, referred to as “call routes.” Various specific classifiers, including, for example, Boosting, Maximum Entropy (ME), and Support Vector Machines (SVM), have been compared in the literature and show similar performance (1-2% differences in classification accuracy). As one example, Statistical Router 102 may use binary unigram features and a standard back-propagation neural network as a classifier.
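By way of illustration only, the binary unigram features mentioned above can be sketched as follows. The function names, whitespace tokenization, and lowercasing are assumptions made for this sketch and are not taken from any particular router implementation; the classifier itself (for example, a back-propagation neural network) is omitted.

```python
from typing import Dict, List


def build_vocabulary(training_texts: List[str]) -> Dict[str, int]:
    """Map each distinct lowercase word to a feature index (hypothetical helper)."""
    vocab: Dict[str, int] = {}
    for text in training_texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def binary_unigram_features(asr_text: str, vocab: Dict[str, int]) -> List[int]:
    """Return a 0/1 vector: 1 if the vocabulary word occurs at least once.

    Repeated words still yield 1 (binary, not counts), and words outside
    the vocabulary are ignored.
    """
    features = [0] * len(vocab)
    for word in asr_text.lower().split():
        index = vocab.get(word)
        if index is not None:
            features[index] = 1
    return features


# Toy usage with made-up utterances.
vocab = build_vocabulary(["i want to pay my bill", "my internet is down"])
vector = binary_unigram_features("pay pay my bill please", vocab)
```

Such a vector would then be supplied to whichever statistical classifier is chosen to score the predefined call routes.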
The Statistical Router 102 typically has an unacceptably high error rate (10-30% classification error rates are commonly reported in deployed applications), and thus a rejection mechanism is implemented to retain only those route hypotheses which are most likely to be correct. The rejection decision should not be based only on the confidence in the classification of the Statistical Router 102 because the ASR Module 101 can also make recognition errors which should be taken into account. Therefore, another separate classifier, Confidence Engine (CE) 103, is used to produce confidence scores, based on both acoustic and NLU features, for the N highest-ranked hypotheses (typically 3-5) output from the Statistical Router 102. A Route Reordering Component 104 then reorders the route hypotheses according to their overall confidence as determined by the CE 103. The best scoring route hypothesis is sent to Threshold Decision Module 105, which accepts the hypothesis if its confidence score is above an accept threshold. The value of the accept threshold is chosen so that the system satisfies one or more operating constraints such as an upper bound on the False Accept Rate (FAR) (typically 1-5%).
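The reordering and threshold steps described above can be sketched as follows, purely as an illustration. The `RouteHypothesis` type, the function names, and the example scores are hypothetical; in particular, the confidence values are assumed to have already been produced by a confidence engine, which is not modeled here.

```python
from typing import List, NamedTuple, Optional


class RouteHypothesis(NamedTuple):
    route: str           # candidate call route
    router_score: float  # score from the statistical router
    confidence: float    # overall confidence assumed to come from the CE


def reorder_routes(hypotheses: List[RouteHypothesis]) -> List[RouteHypothesis]:
    """Sort hypotheses by overall confidence, highest first."""
    return sorted(hypotheses, key=lambda h: h.confidence, reverse=True)


def threshold_decision(
    hypotheses: List[RouteHypothesis], accept_threshold: float
) -> Optional[str]:
    """Return the best route if its confidence is above the threshold, else None."""
    if not hypotheses:
        return None
    best = reorder_routes(hypotheses)[0]
    if best.confidence > accept_threshold:
        return best.route
    return None  # rejected: no route is accepted automatically


# Toy N-best list: the router prefers "billing", the confidence scores do not.
n_best = [
    RouteHypothesis("billing", 0.60, 0.40),
    RouteHypothesis("tech_support", 0.30, 0.85),
    RouteHypothesis("sales", 0.10, 0.10),
]
accepted = threshold_decision(n_best, accept_threshold=0.7)
rejected = threshold_decision(n_best, accept_threshold=0.9)
```

In this toy example the hypothesis ranked second by the router becomes the top route after reordering, which shows why the rejection decision is not based on the router score alone.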
The performance of a semantic classification system such as a call router is usually derived from its Receiver Operating Characteristic (ROC) curve. The ROC plots the False Accept Rate (FAR), the percentage of incorrectly routed calls whose confidence scores exceed the accept threshold, against the Correct Accept Rate (CAR), the percentage of correctly routed calls whose confidence scores exceed the threshold, at various thresholds. An Automation Rate (AR) is computed as the percentage of calls which are automatically routed by the system (FAR+CAR) at a given operating point (confidence threshold) and is one of the main system parameters considered when deploying a call routing system. The rejection component has rarely been mentioned in recent call routing literature, in which most studies focus on methods to improve the accuracy of Statistical Router 102 and simplify its training. As a consequence, there is no existing discussion on the actual effectiveness of a call routing system as measured by its Automation Rate.
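As a minimal sketch of these quantities, the following computes FAR, CAR, and AR at a given accept threshold, and selects an accept threshold subject to an upper bound on FAR. It assumes that both rates are expressed as fractions of all calls (so that FAR+CAR is the fraction of calls routed automatically), that each call is represented by a hypothetical (confidence, is_correct) pair for its top route, and that candidate thresholds are limited to the observed confidence values.

```python
from typing import List, Tuple

Result = Tuple[float, bool]  # (confidence of top route, whether it was correct)


def operating_point(results: List[Result], threshold: float) -> Tuple[float, float, float]:
    """Return (FAR, CAR, AR) as fractions of all calls at the given threshold."""
    total = len(results)
    if total == 0:
        raise ValueError("no calls to evaluate")
    accepted = [correct for confidence, correct in results if confidence > threshold]
    correct_accepts = sum(accepted)
    false_accepts = len(accepted) - correct_accepts
    far = false_accepts / total
    car = correct_accepts / total
    return far, car, far + car


def pick_threshold(results: List[Result], max_far: float) -> float:
    """Return the lowest observed confidence value whose FAR is within max_far.

    FAR cannot increase as the threshold rises, so the lowest qualifying
    threshold gives the highest automation rate among the candidates.
    """
    if not results:
        raise ValueError("no calls to evaluate")
    candidates = sorted({confidence for confidence, _ in results})
    for threshold in candidates:
        far, _, _ = operating_point(results, threshold)
        if far <= max_far:
            return threshold
    return candidates[-1]  # at the highest confidence nothing is accepted


# Made-up evaluation data for ten calls.
calls = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False),
    (0.40, True), (0.30, False), (0.20, False), (0.10, True), (0.05, False),
]
far, car, ar = operating_point(calls, threshold=0.5)
chosen = pick_threshold(calls, max_far=0.1)
```

Sweeping the threshold over all candidate values and recording each (FAR, CAR) pair would trace out the ROC curve for this toy data.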