Automated systems for recognizing spoken natural language sequences require varying amounts of processing capacity depending upon the nature of the spoken message. It is well understood that it takes relatively less processor attention to recognize a string of spoken digits than to recognize the spoken name of an individual from among a list of thousands or even hundreds of thousands of names.
To appreciate this artifact of speech recognition, consider the number of potential words necessary to recite the numbers from zero to 250,000. There are the ten words for single digit numbers: `one`, `two`, `three`, . . . , the ten words for the teen numbers: `eleven`, `twelve`, `thirteen`, . . . , the ten words for the decades: `ten`, `twenty`, `thirty`, . . . , and the two words for larger place identification: `hundred` and `thousand`. This relatively limited list of words for speaking a numerical string can result in relatively simple and efficient processing of such spoken strings. In contrast, consider a telephone directory having 250,000 names of individuals. Each such name is potentially quite different from the others, and the directory can include names such as `Smith`, `Jones`, `Yamasaki` and `Van Rysselberghe`. Recognizing such a diverse collection of audible sounds is clearly more difficult than recognizing a string of numerical digits.
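The size of the number vocabulary can be checked with a short illustrative sketch. The spelling conventions used here are assumptions and not part of the original description: `zero` is counted as a word, `ten` is grouped with the teen numbers, and the optional word `and` is omitted. The helper names `words_below_thousand` and `spell` are hypothetical.

```python
# Illustrative only: collect the distinct English words needed to speak
# every integer from 0 to 250,000 under the assumed conventions above.
ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"]


def words_below_thousand(n):
    """Return the words for 1 <= n <= 999 (an empty list for 0)."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        words.append(TENS[n // 10])
        n %= 10
    elif n >= 10:
        words.append(TEENS[n - 10])
        n = 0
    if n > 0:
        words.append(ONES[n])
    return words


def spell(n):
    """Spell 0 <= n <= 999,999 as a list of words, without 'and'."""
    if n == 0:
        return ["zero"]
    words = []
    if n >= 1000:
        words += words_below_thousand(n // 1000) + ["thousand"]
    words += words_below_thousand(n % 1000)
    return words


vocabulary = set()
for number in range(250_001):
    vocabulary.update(spell(number))

print(len(vocabulary))
```

Under these assumed conventions the sketch finds 30 distinct words, the same order of magnitude as the tally above; the exact figure depends on details such as whether `ten` is counted with the teens or with the decades. A directory of 250,000 names, by contrast, may require a distinct vocabulary entry for nearly every listing.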
Natural language speech recognition systems are currently in use for conducting various forms of commerce via a telephone network. One example of such a system is utilized in conjunction with a stock brokerage. According to this system, a caller can provide an account number, obtain a quotation for the price of a particular stock issue, and purchase or sell a particular number of shares at market price or at a predetermined target price, among other types of transactions. Natural language systems can also be used to respond to such things as requests for telephone directory assistance.
One conventional approach to handling requests for responses to natural language speech is to establish a first-in, first-out (FIFO) queue. As new requests for service enter the system, each new request is placed into the queue in the order in which it was received. As a server completes a task and becomes available for receiving a new task, the oldest pending request is assigned to that server. This approach does not take into account the capabilities of particular servers.
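The FIFO approach just described can be sketched as follows. This is a minimal illustration rather than an implementation taken from any particular system; the class name `FifoDispatcher`, its method names, and the request and server labels are all hypothetical.

```python
from collections import deque


class FifoDispatcher:
    """Hypothetical sketch of a first-in, first-out request queue."""

    def __init__(self):
        # Pending requests, oldest at the left end.
        self.pending = deque()

    def submit(self, request):
        """Place a new request at the back of the queue."""
        self.pending.append(request)

    def server_available(self, server):
        """Give the oldest pending request to the server that just freed up.

        Returns a (server, request) pair, or None if nothing is pending.
        The server's capabilities are never consulted.
        """
        if not self.pending:
            return None
        return server, self.pending.popleft()


dispatcher = FifoDispatcher()
dispatcher.submit("digit string")
dispatcher.submit("directory name")

first = dispatcher.server_available("server_a")
second = dispatcher.server_available("server_b")
print(first)
print(second)
```

Because `server_available` inspects neither the request nor the server, a server poorly suited to large-vocabulary name recognition is as likely to receive a name request as any other server, which is the shortcoming noted above.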
FIG. 1 shows a conventional system for handling speech utterances received via incoming telephone lines 48. One or more voice processing modules 50 each include a plurality of Clients 52. Each voice processing module 50 also includes a voice processing server 54. The voice processing server 54 for each voice processing module 50 is directly connected to all the Clients 52 in that voice processing module 50. As calls arrive in a system such as that shown in FIG. 1, they are assigned in a round-robin fashion among the various voice processing modules 50 and also in a round-robin fashion to the Clients 52 within the voice processing modules 50. This prior art system does not account for any variance in use dependent upon system loading or message type. Such a system can result in a loss of efficiency owing to ineffective work flow assignment.
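The two-level round-robin assignment described for FIG. 1 can be sketched as follows. This is an illustrative approximation only; the class name `RoundRobinAssigner` and the module and client labels are hypothetical and are not drawn from the figure.

```python
from itertools import cycle


class RoundRobinAssigner:
    """Hypothetical sketch: rotate over modules, then over each module's clients."""

    def __init__(self, clients_per_module):
        # clients_per_module maps a module label to its list of client labels.
        self._modules = cycle(list(clients_per_module))
        self._clients = {
            module: cycle(clients)
            for module, clients in clients_per_module.items()
        }

    def assign(self, call):
        """Return the (module, client) pair that receives the call.

        The call itself is ignored: neither message type nor current
        load influences the choice.
        """
        module = next(self._modules)
        client = next(self._clients[module])
        return module, client


assigner = RoundRobinAssigner({
    "module_1": ["client_1a", "client_1b"],
    "module_2": ["client_2a", "client_2b"],
})
assignments = [assigner.assign(f"call_{i}") for i in range(5)]
for pair in assignments:
    print(pair)
```

The `call` argument is deliberately unused, which mirrors the limitation noted above: a demanding name-recognition request and a simple digit string are routed identically regardless of how busy each module is.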