Almost all organizations of significant size receive numerous telephone calls which must be appropriately handled based on the desires of the caller. This handling or routing is generally performed by human or automated call routing systems. Information is received from the caller, and the call is directed based on the information received. Human operators typically perform this function accurately and efficiently, but at a relatively high cost. Automated systems of the prior art often employ hierarchical menus in which a caller is confronted with a list of choices from which a selection is to be made. The caller typically selects menu options by making entries from a telephone keypad. Often, making a choice opens up a menu of further choices. In complex organizations, the menu hierarchy can be quite elaborate, requiring several choices by a caller, and requiring a caller to listen to an elaborate menu in order to understand the available choices. Such menus are a widespread cause of caller dissatisfaction with many of the presently used automated routing systems.
In many prior art call routing systems, voice recognition may be used as a substitute for keypad entries. That is, the caller is allowed to voice a number as an alternative to making a keypad entry. As presently used in call routing systems, therefore, automated voice recognition does little to simplify the process. What would be more desirable to most users is a system in which the caller is able to describe the desired function and have an automated system direct the call according to the description.
Such a direct, natural language call routing system, in which a caller simply asks for the desired destination or describes the function to be performed, would greatly simplify the call routing process. However, significant obstacles to such a system exist. For example, it has been found that given such a system, callers will typically phrase their requests not by giving a destination name, but by describing the activity they would like to perform. In many cases callers have difficulty in formulating their requests, and instead provide a roundabout description of what they would like to do. Sometimes a destination name given is ambiguous, in that the precise name given does not exist, but the organization has several departments falling under similar headings.
In such natural language call routing systems, callers may be routed to desired departments based on natural spoken responses to an open-ended prompt such as, for example, “How may I direct your call?” Note that in designing a voice response system to adequately handle these calls, it is not sufficient to include just the names of the departments in the vocabulary, because what the callers may say cannot be fully anticipated. Rather, requests from real callers should be collected for “training” the system, that is, for determining the vocabulary keywords and how calls will be routed based on the presence of such keywords in the caller's request. Data-driven techniques are essential in the design of such systems.
For example, in co-pending U.S. patent application Ser. No. 09/124,301, “Methods and Apparatus for Automatic Call Routing Including Disambiguating Routing Decisions,” filed on Jul. 29, 1998 by R. Carpenter and J. Chu-Carroll (hereinafter, Carpenter et al.), a vector-based information retrieval technique for performing call routing is described. U.S. patent application Ser. No. 09/124,301, which is commonly assigned to the assignee of the present invention, is hereby incorporated by reference as if fully set forth herein. Specifically, in the system described in Carpenter et al., a routing matrix is trained based on statistics regarding the occurrence of words and word sequences in a training corpus after morphological and stop-word filtering are performed. New user requests are then represented as feature vectors and are routed based on a cosine similarity score with the model destination vectors as encoded in the routing matrix. Although the system described in Carpenter et al. is capable of routing many user requests appropriately, there are still many situations in which disambiguation (e.g., posing a disambiguating query back to the user) must be performed to properly route the call.
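By way of illustration only, the general idea of vector-based routing with a cosine similarity score may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Carpenter et al.: the stop-word list, the toy training corpus, the destination names, the crude suffix stripping used in place of morphological filtering, and the `margin` threshold used to flag a request as needing disambiguation are all hypothetical. For brevity the sketch uses raw single-word counts only, whereas the system described above also uses word sequences.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list and toy corpus (illustrative assumptions only).
STOP_WORDS = {"i", "a", "an", "the", "to", "my", "on", "for", "of", "about",
              "want", "need", "would", "like", "please"}

TRAINING = [
    ("i want to check the balance on my savings account", "deposits"),
    ("what is my checking account balance", "deposits"),
    ("i need a loan for a new car", "auto_loans"),
    ("car loan payment please", "auto_loans"),
    ("i would like to apply for a mortgage", "home_loans"),
    ("questions about my home loan payment", "home_loans"),
]

def terms(text):
    """Lowercase, drop stop words, and strip a trailing 's'.

    The suffix stripping is a crude stand-in for morphological filtering.
    """
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    return [w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words]

def train(corpus):
    """Build one term-count vector per destination (a simple routing matrix)."""
    matrix = defaultdict(Counter)
    for request, destination in corpus:
        matrix[destination].update(terms(request))
    return dict(matrix)

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def route(request, matrix, margin=0.2):
    """Return (best destination, needs_disambiguation, ranked scores).

    A request is flagged as ambiguous when nothing matches or when the top
    two scores are within `margin` of each other (a hypothetical rule).
    """
    query = Counter(terms(request))
    ranked = sorted(((cosine(query, vec), dest) for dest, vec in matrix.items()),
                    reverse=True)
    best_score, best = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    ambiguous = best_score == 0.0 or (best_score - runner_up) < margin
    return best, ambiguous, ranked

matrix = train(TRAINING)
```

With this toy corpus, a request such as "savings account balance" overlaps only the hypothetical "deposits" vector and is routed without further questions, whereas "loan payment" scores similarly against both loan destinations and is flagged, which corresponds to the situation in which a disambiguating query would have to be posed back to the user.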
In a different but somewhat related application to natural language based call routing, users of an on-line document storage and retrieval system such as, for example, the Internet, often use natural language (i.e., text) to describe which document or documents they would like to retrieve. Similar problems to those described in the context of an automated call routing system exist, and as such, similar automated classification systems are required. Specifically, in both the call routing application and the document retrieval application, natural language text is used to classify the user's request into one of a fixed number of possible “destinations”—either a department or similar organizational unit in the former case, or a specific document (or a set of documents) in the latter case. (Note, of course, that in the call routing application, the natural language text is typically obtained by recognizing and converting the user's speech, whereas in the document retrieval application, the user typically types in the text directly.) Moreover, similar limitations to the performance of such automated classification systems exist when used in these document retrieval applications. Numerous other applications can also make advantageous use of a system which is able to classify natural language text into one of a number of “relevant” categories, and many of these applications also suffer from these limitations.
For the reasons described above, and regardless of the particular application (e.g., call routing or document retrieval) to which it is applied, it would be desirable for an automated natural language based classification system to be more immune to the problem of ambiguous classifications than has been the case with prior art systems of this type. That is, it would be desirable for such a system to have an improved ability to discriminate between alternative classifications that would otherwise be likely to be confused.