1. Field
The disclosure relates generally to a computer implemented method, a data processing system, and computer implemented program code. More specifically, a method of processing limited natural language data to automatically develop an optimal feature set, bypassing the standard Wizard of Oz (WOZ) approach, is provided. The method provides for building natural language understanding models, or for processing existing data from other domains, such as the Internet, for domain-specific adaptation using maximum length semantic tokens. Consequently, when the optimal maximum length semantic tokens are passed to an engine, the maximum length semantic tokens produce robust models that can be used for natural language call routing.
2. Description of the Related Art
Call centers are increasingly choosing to develop a natural language call routing solution to replace a traditional tree-structure based touch-tone interactive voice response (IVR) application when the application contains a large number of menu options. Natural language call routing refers to applications that have an initial open-ended prompt where users are not directed in terms of what they can or cannot say. A typical example of this opening prompt is, “Hi, I am an automated assistant here to direct your call. How may I help you?” In response to the prompt, users may freely describe their requests in their own words. This approach provides a natural human-machine interaction and spares users from navigating a lengthy tree structure of menu options, especially if the menu options contain over a hundred choices.
The design and development of a natural language call routing system ideally involves a number of different individuals, all contributing to the call routing system. User interface experts, business metric experts, speech scientists and domain experts all collaborate to create a systematic procedure for defining classes, collecting data through Wizard of Oz sessions, writing specifications of how to label the data, and hiring and supervising user interface experts to label the data.
To design and develop a call routing system, the following three steps are involved: 1) design the classification classes, including clear and vague target classes; 2) develop a disambiguation module, including a prompt and a disambiguation grammar for each vague target class; and 3) develop the language model and the call routing model. However, in practice, call center operators are often unwilling to invest in this laborious and expensive process. It is not unusual for a call center operator to provide a page or two describing the types of routing targets desired, and to expect the engineer to build an initial system based solely on that short specification.
When designing a natural language call routing system, the classes, both clear and vague, are defined first. Clear target classes are typically the terminal nodes of the tree describing the menu options of the interactive voice response system. They are commonly given by business requirements. A vague target class is associated with multiple clear target classes. It can be, but is not limited to, an intermediate node in the menu tree of the interactive voice response system. Vague target classes are artificial classes that arise from the overlap of clear targets in semantic space. They are commonly designed and maintained by the business analyst, Voice User Interface (VUI) designer, and speech scientist.
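The relationship between menu tree nodes and target classes described above can be illustrated with a minimal sketch. The menu contents, the `MenuNode` type, and the helper names are hypothetical and chosen only for illustration; the sketch also assumes that the root node is not itself treated as a vague target class, and it covers only vague classes that coincide with intermediate nodes, although the text notes that vague classes are not limited to these.

```python
from dataclasses import dataclass, field


@dataclass
class MenuNode:
    """One node of a hypothetical IVR menu tree."""
    name: str
    children: list = field(default_factory=list)


def clear_targets(node):
    """Collect the terminal nodes under `node`; these play the role of clear target classes."""
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(clear_targets(child))
    return found


def vague_targets(root):
    """Map each intermediate node (assumption: root excluded) to the clear classes beneath it."""
    result = {}

    def visit(node, is_root):
        if node.children and not is_root:
            result[node.name] = clear_targets(node)
        for child in node.children:
            visit(child, False)

    visit(root, True)
    return result


# Hypothetical menu used only to exercise the helpers.
menu = MenuNode("main", [
    MenuNode("billing", [MenuNode("pay_bill"), MenuNode("dispute_charge")]),
    MenuNode("support", [MenuNode("reset_password"), MenuNode("report_outage")]),
    MenuNode("store_hours"),
])
```

In this sketch, a caller request that matches "billing" but neither of its terminal nodes would land in a vague target class, which is where a disambiguation prompt would be needed.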
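In theory, if there are n clear target classes, each vague target class corresponds to a subset of two or more, but not all, of the clear target classes. Excluding the empty set, the n single-class subsets, and the set of all classes, the maximum number of possible vague target classes is given by Equation 1:

$$2^{n} - n - 2 \qquad \text{(Equation 1)}$$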
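As a quick check of Equation 1, read here as 2^n − n − 2, the following sketch compares the closed form against a brute-force enumeration of candidate subsets. The function names are hypothetical, and returning 0 for n < 2 is an assumption added for illustration, since the closed form is negative there.

```python
from itertools import combinations


def max_vague_classes(n):
    """Closed form of Equation 1: 2^n - n - 2 (assumption: clamp to 0 when n < 2)."""
    if n < 2:
        return 0
    return 2 ** n - n - 2


def enumerate_vague_candidates(clear_classes):
    """List every subset with at least two members that is not the full set."""
    n = len(clear_classes)
    return [
        frozenset(subset)
        for size in range(2, n)  # sizes 2 .. n-1 skip empty, singleton, and full sets
        for subset in combinations(clear_classes, size)
    ]


# The enumeration and the closed form agree for small n.
for n in range(2, 8):
    labels = [f"class_{i}" for i in range(n)]
    assert len(enumerate_vague_candidates(labels)) == max_vague_classes(n)
```

The count grows exponentially: with only ten clear target classes the bound is already 1012, which suggests why vague classes are designed by hand rather than enumerated exhaustively.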