Many systems are currently available for automatically providing services or information to callers over the telephone. Often referred to as voice response units (VRUs), such systems have normally relied upon dual-tone multi-frequency (DTMF) signals from push-button telephones to obtain input from a caller, and have responded largely with pre-recorded voice segments. However, such systems are comparatively limited in the sort of dialogue that can be maintained with the caller, given the restricted range of acceptable inputs and the need to pre-record every possible response.
In order to make such communications more natural, and to greatly enhance the flexibility of such systems, it is desirable to equip VRUs with other voice processing technologies, such as voice recognition (to replace the DTMF input), and text to speech (TTS) (to replace pre-recorded voice). There are many varieties of voice recognition that might be considered: for example, voice recognition may operate for discrete words or for continuous speech, and may be speaker dependent, or speaker independent. Recognition vocabularies can range from perhaps 12 words (typically ten digits plus a couple of control words) to many thousands. Likewise, there is a considerable range of TTS technologies available.
Voice recognition and TTS applications tend to be computationally very intensive; for example, full speaker-independent, large vocabulary voice recognition typically requires 100 Mips of digital signal processing power. The requirement becomes even more acute when it is remembered that a VRU may handle perhaps 100 telephone lines simultaneously, leading to a potential maximum processing requirement of 10 Gips. For this reason most commercial systems use specially designed hardware to increase processing speed. Such hardware is typically available as PC adapter cards to be fitted into the VRU.
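The arithmetic behind the 10 Gips figure can be sketched as follows. This is only a back-of-envelope illustration: it assumes that the 100 Mips figure applies to each line independently and that every line runs recognition at the same moment. The function and variable names are hypothetical and do not come from any particular system.

```python
# Back-of-envelope estimate of the peak recognition load on a VRU.
# Assumption: the quoted 100 Mips applies per telephone line, and all
# lines are active at once (worst case).

MIPS_PER_LINE = 100  # figure quoted in the text
LINES = 100          # figure quoted in the text


def peak_load_mips(mips_per_line: int, lines: int) -> int:
    """Return the worst-case load in Mips when every line is busy."""
    return mips_per_line * lines


total_mips = peak_load_mips(MIPS_PER_LINE, LINES)
total_gips = total_mips / 1000  # 1 Gips = 1000 Mips

print(f"{total_mips} Mips = {total_gips:g} Gips")
```

In practice not every line would be performing recognition at the same instant, so this is an upper bound rather than a typical load.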
However, such cards in general must be designed for a particular system, depending for example on the operating system (DOS or OS/2), the computer architecture (ISA or Microchannel), and so on. This greatly restricts the options available to a customer who wishes to incorporate such functions into a VRU, since the preferred adapter card may not be compatible with the customer's VRU. Likewise, it is difficult to optimize the different components of the system individually. The same problems may occur even if only software components are involved: for example, a preferred voice mail product may not run under the same operating system as the preferred VRU.