Computers have become ubiquitous in our daily lives. Today, computers do much more than simply compute: supermarket scanners calculate our grocery bill while tracking store inventory; computerized telephone switching centers direct millions of calls; automatic teller machines (ATMs) allow people to conduct banking transactions from almost anywhere—the list goes on and on. For most people, it is hard to imagine a single day in which they will not interact with a computer in some way.
Formerly, computer users were forced to interact with computers on the computer's terms—by keyboard or mouse or more recently, by touch-tones on a telephone (called DTMF—for dual tone multi-frequency). More and more, however, the trend is to make interactions between computers easier and more user-friendly. One way to make interactions between computers and humans friendlier is to allow humans and computers to interact by spoken words.
To enable a dialogue between human and computer, the computer first needs a speech recognition capability to detect the spoken words and convert them into some form of computer readable data, such as simple text. Next the computer needs some way to analyze the computer-readable data and determine what those words, as they were used, meant. A high-level speech-activated, voice-activated, or natural language understanding application typically operates by conducting a step-by-step spoken dialogue between the user and the computer system hosting the application. Using conventional methods, the developer of such high-level applications specifies the source code implementing each possible dialogue, and each step of each dialogue. To implement a robust application, the developer anticipates and handles in software each possible user response to each possible prompt, whether such responses are expected or unexpected. The burden on the high-level developer to handle such low-level details is considerable.
As the demand for speech-enabled applications has increased, so has the demand on development resources. Presently, the demand for speech-enabled applications exceeds the development resources available to code the applications. Also, the demand for developers with the necessary expertise to write the applications exceeds the capacity of developers with that expertise. Hence, a need exists to simplify and expedite the process of developing interactive speech applications.
In addition to the length of time it takes to develop speech-enabled applications and the level of skill required to develop these systems, a further disadvantage of the present mode of speech-enabled application development is that it is vendor specific, significantly inhibiting reuse of the code if the vendor changes, and application specific, meaning that already written code can not be re-used for another application. Thus a need also exists to be able to create a system that is vendor-independent and code that is re-useable.
Additional background on IVR systems can be found in U.S. Pat. No. 6,094,635, Jul. 25, 2000, titled “System and Method for Speech Enabled Application”; in U.S. Pat. No. 5,995,918, Nov. 30, 1999, “System and Method for Creating a Language Grammar using a Spreadsheet or Table Interface” and in U.S. patent application Ser. No. 09/430,315 filed Oct. 29, 1999, “Task Oriented Dialog Model, and Manager.” These patents are commonly assigned to Unisys Corporation.