This application claims priority of UK application No. 0120672.1 filed Aug. 24, 2001. That application is incorporated herein by reference.
The present invention relates in general to voice recognition systems for telephony, and more specifically to a method and apparatus for providing voice and tone detection prior to allocation of a speech recognition engine to a call.
The integration of speech recognition into modern-day PBX systems provides new user interface capabilities to augment traditional telephone device DTMF tones and 'feature' keys for call control. Speech recognition capabilities may be provided through the allocation of speech recognition engines (SREs) to a call in progress. For example, PBX systems manufactured by Mitel Networks Corporation may be configured with a number of ports for allocating Speak@Ease™ SRE resources. Each SRE resource is a general purpose "device" which provides all speech recognition and related capabilities (which may be composed of one or more processes). These capabilities include, but are not limited to, voice detection, DTMF detection, voice recognition, and application processing.
As speech recognition becomes more common, it is anticipated that a much larger number of SRE resources will be required to accommodate increased utilization. The provisioning of additional SRE resources to meet anticipated usage increases the overall cost of a PBX installation. As a result, the potential penetration of speech recognition applications is subject to cost considerations and is limited to cases where the cost can be justified.
According to the existing state of the art, SRE resources are associated with a call whenever there is a potential need for speech recognition, regardless of whether speech recognition is actually invoked during the call. Consequently, PBX systems are now configured with a plurality of SRE resources that are dedicated to servicing one or more speech recognition applications in a PBX network. When all of the SRE resources are in use, subsequent requests for the supported speech recognition applications are denied or deferred until an SRE resource becomes available. When an SRE is servicing a user, all capabilities are provided, regardless of utilization. For example, if a user initiates a request for which an SRE is allocated and simply dials digits at the telephone device (i.e., dials the destination number rather than speaking the name), then the full capabilities of the SRE are underutilized. However, as indicated above, call control allocates the SRE resource whenever speech recognition may be required, regardless of actual utilization.
According to the present invention, a voice and DTMF detector resource (VDD) is allocated to a call prior to allocating an SRE resource. The SRE resource is only allocated when speech recognition capabilities are required. The VDD is a limited capability digital signal processor that can be provided in volume at relatively low cost (using existing Digital Signal Processing (DSP) technology). The presence or absence of the VDD does not impact the SRE resource.
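The allocation order described above may be illustrated by the following minimal sketch. It is not the claimed implementation. The names ResourcePool, Call, CallControl, start_call and on_vdd_event are hypothetical, as is the assumption that a "voice" event reported by the VDD is what indicates that speech recognition capabilities are required. DTMF digits are simply collected without involving an SRE.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class ResourcePool:
    """Hypothetical fixed-size pool of interchangeable resources."""

    def __init__(self, prefix: str, size: int) -> None:
        self._free = [f"{prefix}-{i}" for i in range(size)]

    def acquire(self) -> Optional[str]:
        # Returns None when the pool is exhausted (request denied or deferred).
        return self._free.pop(0) if self._free else None

    @property
    def available(self) -> int:
        return len(self._free)


@dataclass
class Call:
    call_id: int
    vdd: Optional[str] = None
    sre: Optional[str] = None
    digits: List[str] = field(default_factory=list)


class CallControl:
    """Sketch of call control: many low-cost VDD ports, few SRE ports."""

    def __init__(self, vdd_ports: int, sre_ports: int) -> None:
        self.vdds = ResourcePool("VDD", vdd_ports)
        self.sres = ResourcePool("SRE", sre_ports)

    def start_call(self, call_id: int) -> Call:
        # A VDD is allocated first; no SRE is allocated at this point.
        return Call(call_id, vdd=self.vdds.acquire())

    def on_vdd_event(self, call: Call, kind: str, value: str = "") -> Optional[str]:
        if kind == "dtmf":
            # Dialed digits are handled without an SRE.
            call.digits.append(value)
        elif kind == "voice" and call.sre is None:
            # Assumption: detected voice means speech recognition is required.
            call.sre = self.sres.acquire()
        return call.sre


control = CallControl(vdd_ports=4, sre_ports=1)
dialing_call = control.start_call(1)
speaking_call = control.start_call(2)
control.on_vdd_event(dialing_call, "dtmf", "5")
control.on_vdd_event(speaking_call, "voice")
```

In this sketch the call on which the user dials digits never consumes an SRE port, so the single SRE remains available for the call on which voice is detected.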