Speech recognition systems are generally known in the art, particularly in relation to telephony systems. U.S. Pat. Nos. 4,914,692; 5,475,791; 5,708,704; and 5,765,130 illustrate exemplary telephone networks that incorporate speech recognition systems. A common feature of such systems is that the speech recognition element (i.e., the device or devices performing speech recognition) is typically centrally located within the fabric of the telephone network, as opposed to at the subscriber's communication device (i.e., the user's telephone). In a typical application, a combination of speech synthesis and speech recognition elements is deployed within a telephone network or infrastructure. Callers may access the system and, via the speech synthesis element, be presented with informational prompts or queries in the form of synthesized or recorded speech. A caller will typically provide a spoken response to such a prompt, and the speech recognition element will process the caller's spoken response in order to provide further service to the caller.
A particular application of these types of systems has been the creation of “electronic assistants,” sometimes referred to as “virtual assistants” or “automated attendants.” For example, U.S. Pat. No. 5,652,789 (hereinafter referred to as “the '789 patent”) describes a service that allows a subscriber to manage personal communications through the use of an electronic assistant. Using speech recognition technology, the subscriber can issue voice-based commands to manage incoming and outgoing calls and messages. As in typical telephony-based systems, the speech recognition element described in the '789 patent is located entirely within the fabric of the telephone infrastructure. One feature particularly described in the '789 patent is the ability of the speech recognition element, in providing the electronic assistant service, to enter a “background mode” while the subscriber is engaged in a voice communication with another party. While in this background mode, the electronic assistant monitors the subscriber's voice communication for the occurrence of a given set of voice-based commands, particularly a “summoning command” that causes the electronic assistant to enter a “foreground mode.” In the foreground mode, the electronic assistant continues to monitor for a larger set of voice-based commands. In this manner, the electronic assistant is effectively “on call” to service the needs of the subscriber and is invoked through the detection of a particular summoning or “wake up” command.
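By way of illustration only, the background and foreground modes described above may be modeled as a simple two-state machine. The following sketch is not taken from the '789 patent. The class name, the command phrases, and the “dismiss” command that returns the assistant to the background mode are all hypothetical and are assumed solely for the purposes of the example. The sketch further assumes that a speech recognition element has already converted the subscriber's speech into text phrases.

```python
from enum import Enum


class Mode(Enum):
    BACKGROUND = "background"
    FOREGROUND = "foreground"


# Hypothetical vocabularies; the source text names no specific phrases.
SUMMONING_COMMANDS = {"wake up"}
FOREGROUND_COMMANDS = {"call", "hang up", "play messages", "dismiss"}


class ElectronicAssistant:
    """Two-state model: a small command set in background, a larger one in foreground."""

    def __init__(self):
        # The assistant starts out passively monitoring the call.
        self.mode = Mode.BACKGROUND

    def on_recognized(self, phrase):
        """Handle one recognized phrase; return the action taken, or None if ignored."""
        phrase = phrase.strip().lower()
        if self.mode is Mode.BACKGROUND:
            # Only the summoning command is acted on in the background mode.
            if phrase in SUMMONING_COMMANDS:
                self.mode = Mode.FOREGROUND
                return "summoned"
            return None
        # Foreground mode: the larger command set is active.
        if phrase == "dismiss":  # assumed way of returning to the background mode
            self.mode = Mode.BACKGROUND
            return "dismissed"
        if phrase in FOREGROUND_COMMANDS:
            return f"execute: {phrase}"
        return None
```

In this sketch, a phrase such as “call” is ignored while the assistant is in the background mode and is acted upon only after the summoning command has been detected.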
As noted above, the ability to invoke the electronic assistant described in the '789 patent is enabled through the speech recognition element deployed within the telephone network. Various other systems implementing similar electronic assistant services are currently available to the public. These systems are likewise enabled through the use of network-based speech recognition elements. They generally offer acceptable performance due, in part, to the nature of the telephone network. Because latencies or delays are typically small (on the order of a few milliseconds) within most telephone networks, the use of infrastructure-based speech recognition elements is practicable, particularly as applied to “wake up” commands for electronic assistants. However, current systems have generally failed to address wireless systems. Given the fluctuating nature of wireless communication channels (i.e., time-varying degradation factors and throughput delays), and the differences in speech processing applied, for example, in different cellular systems, the use of purely infrastructure-based speech recognition elements is likely to be problematic. Current solutions also utilize a full speech channel and dedicated network resources to provide a “wake up” capability for speech recognition functionality. These methods make inefficient use of “airtime” and network resources, which is a significant factor in the cost of providing network-based speech recognition enabled services. Thus, it would be advantageous to provide a more efficient technique that allows subscribers to electronic assistant services, or other speech-based services, to “wake up” speech recognition functionality in a wireless communication environment.