Speech recognition systems are specialized computers configured to process and recognize human speech; they may also take action or carry out further processes in response. Developments in speech recognition technologies support “natural language” interactions between automated systems and users. A natural language interaction allows a person to speak naturally, and a speech recognition system can respond directly to the spoken request. One application of natural language processing is speech recognition with automatic call routing (ACR). A goal of an ACR application is to determine why a customer is calling a service center and to route the customer to an appropriate agent or destination for servicing the customer's request. Speech recognition technology generally allows an ACR application to recognize natural language statements so that the caller does not have to rely on a menu system. Natural language systems allow customers to state the purpose of their call “in their own words.”
To route calls properly, an ACR system attempts to interpret the intent of the customer and selects a routing destination. When a speech recognition system partially understands or misunderstands the caller's intent, significant problems can result. Further, even in touch-tone ACR systems, the caller can press the wrong button and have the call routed to the wrong location. When a caller is routed to an undesired destination and realizes that there has been a mistake, the caller often hangs up and retries the call. Another common problem occurs when a caller gets “caught” or “trapped” in a menu that provides no acceptable selection for exiting it. Trapping a caller or routing the caller to an undesired location leads to abandoned calls. Most call routing systems handle a huge volume of calls, so even if only a small percentage of calls are abandoned, the associated costs are significant.
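As an illustration only, the intent-interpretation and routing step described above might be sketched as follows. This is a minimal keyword-matching sketch, not how any particular ACR product works; the intent names, keyword lists, queue names, and the `route_call` function are all hypothetical. The fallback to a live agent when nothing matches is likewise an assumed design choice, included to show one way a caller could avoid being trapped.

```python
# Hypothetical keyword lists for each caller intent (illustrative only).
INTENT_KEYWORDS = {
    "billing": ("bill", "payment", "charge"),
    "repair": ("broken", "repair", "not working"),
    "new_service": ("new service", "sign up", "order"),
}

# Hypothetical routing destinations for each intent.
DESTINATIONS = {
    "billing": "billing_queue",
    "repair": "repair_queue",
    "new_service": "sales_queue",
}

# Assumed fallback so that an unrecognized request still reaches a person.
FALLBACK_DESTINATION = "live_agent"


def route_call(utterance: str) -> str:
    """Pick a routing destination from the recognized text of a caller's request."""
    text = utterance.lower()
    # Count how many keywords of each intent appear in the utterance.
    scores = {
        intent: sum(keyword in text for keyword in keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best_intent, best_score = max(scores.items(), key=lambda item: item[1])
    if best_score == 0:
        # No intent matched: hand off rather than guess a destination.
        return FALLBACK_DESTINATION
    return DESTINATIONS[best_intent]


if __name__ == "__main__":
    for request in (
        "I have a question about my bill",
        "My phone is broken and not working",
        "hello there",
    ):
        print(request, "->", route_call(request))
```

A misrecognized utterance in this sketch simply changes which keywords match, which is how a partial misunderstanding of the caller's words can send the call to the wrong destination.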
Current speech recognition systems, such as those sold by Speechworks™, use a dynamic semantic model. The semantic model recognizes human speech and creates multiple word strings based on the phonemes that it can recognize. It then assigns a probability to each of the word strings using rules and other criteria. However, the semantic model relies on extensive tables and business rules, many of which are “learned” by the speech recognition system. The learning portion of the system is difficult to set up and modify. Further, changing the word string tables in the semantic model can be an inefficient process. For example, when a call center moves or is assigned a different area code, the semantic model is retrained using an iterative process.
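To make the idea of assigning probabilities to candidate word strings concrete, the following is a simplified sketch under stated assumptions. It does not reflect the internals of any commercial recognizer: the `Candidate` type, the raw scores, the phrase rules, and the multiplicative weighting are all hypothetical, chosen only to show how rules could reweight several candidate strings before one is selected.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate word string produced from the recognized phonemes."""
    words: str
    raw_score: float  # hypothetical recognizer score, higher is better


# Hypothetical business rules: phrases that boost a candidate's weight.
PHRASE_RULES = {
    "pay my bill": 2.0,
    "area code": 1.5,
}


def rank_candidates(candidates, rules=PHRASE_RULES):
    """Return (word string, probability) pairs, most probable first."""
    weighted = []
    for candidate in candidates:
        weight = candidate.raw_score
        # Apply every rule whose phrase appears in the candidate string.
        for phrase, boost in rules.items():
            if phrase in candidate.words:
                weight *= boost
        weighted.append((candidate.words, weight))

    total = sum(weight for _, weight in weighted)
    if total <= 0:
        raise ValueError("candidate weights must sum to a positive value")

    # Normalize the weights so they sum to 1, then sort by probability.
    ranked = [(words, weight / total) for words, weight in weighted]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


if __name__ == "__main__":
    hypotheses = [
        Candidate("play my bill", 0.5),
        Candidate("pay my bill", 0.4),
        Candidate("pay my pill", 0.1),
    ]
    for words, probability in rank_candidates(hypotheses):
        print(f"{probability:.3f}  {words}")
```

In this sketch the rule table is a plain dictionary that can be edited directly. The difficulty described above arises when such tables and rules are learned rather than edited, so that a change such as a new area code requires retraining instead of a single table update.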