In telephonic communications, it is well known for the recipient of a call, particularly in a commercial setting, to utilize an automated call routing system which initially presents the calling party with a menu of routing choices. The caller is asked to select from the menu by pressing particular numbers on the keypad of the caller's telephone, and the routing system recognizes the tone associated with each key pressed. It is also known to offer such callers a choice between pressing a number on the keypad to select a menu choice and saying that number, e.g., “press or say one for customer service”. In the particular context of telephone services, it is also known to use an automatic routing system for selecting among billing options for a call placed over such a service. For example, in the case of a long distance telephone call to be billed to a number other than the originating number, a menu may be presented to the calling party in the form, “please say ‘collect’, ‘calling card’, ‘third number’ or ‘operator’”.
While such routing systems work reasonably well in cases where the number of routing choices is small, multi-tiered menus generally become necessary if the number of selections exceeds about four or five. Such multi-tiered menus are very unpopular with callers: from the perspective of a typical caller, the time and effort required to navigate through several menu layers to reach a desired objective can seem interminable. Equally important, from the perspective of both the caller and the recipient, the percentage of successful routings through such a multi-tiered menu structure can be quite low, in some cases less than 40 percent. Stated differently, in such circumstances, more than 60 percent of the calls accessing such a multi-tiered menu structure might be either terminated without the caller having reached the desired objective or else defaulted to an operator (or other manned default station).
To address these limitations in the prior art, it would be desirable to provide a system which can understand and act upon verbal and non-verbal input from people. Traditionally, in such speech understanding systems, meaningful words, phrases and structures have been manually constructed, involving much labor and leading to fragile systems which are not robust in real environments. A major objective, therefore, would be a speech understanding system which is trainable, adaptive and robust—i.e., a system for automatically learning the language for its task.