1. Field of the Invention
The present invention relates generally to telecommunication systems and methods, and, more particularly, to telephone call management systems and methods for detecting sound commands within sound streams of a telephone conference.
2. Background
Even with the relatively recent proliferation of email, instant messaging, and similar communication technologies, telephone services remain important to the average person or business. Indeed, the number of individual telephone lines in use appears to be constantly increasing. The number and sophistication of feature functions available from both telephone systems and telephone service providers also continue to increase. Call answering, voice messaging, and automated attendant (“auto attendant”) are some of the more popular feature functions commonly offered by telephone systems and service providers.
An auto attendant system typically answers the incoming calls, greets the callers, and transfers the calls to selected extensions. Some auto attendant systems interact with the callers using, for example, dual tone multi-frequency (DTMF or touch-tone) key input. Other auto attendant systems accept voice input, which they process using automatic speech recognition capabilities. Still other systems can receive and process both DTMF and voice input.
A telephone system, such as an auto attendant system, may also enable participants in a telephone conference to issue DTMF or voice commands to the system during a telephone conference. For example, a participant may be able to issue voice or DTMF commands to disconnect from the conference or put the conference on hold. The system would attempt to recognize the keyword or keywords (keyphrase) of the command and the channel of the speaker who uttered the command, and, upon recognition, invoke appropriate application code to perform the command.
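By way of illustration only, the recognize-then-invoke flow described above can be sketched as a lookup from a recognized keyphrase to application code that acts on the channel where the keyphrase was detected. The names Conference, COMMANDS, and handle_detection are hypothetical and do not come from any particular telephone system, and modeling "hold" as a per-channel state is an assumption made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Conference:
    """Hypothetical conference state: channel numbers only."""
    participants: set = field(default_factory=set)
    on_hold: set = field(default_factory=set)


def disconnect(conf, channel):
    # Remove the issuing channel from the conference.
    conf.participants.discard(channel)
    conf.on_hold.discard(channel)


def hold(conf, channel):
    # Assumption: "hold" is tracked per channel.
    if channel in conf.participants:
        conf.on_hold.add(channel)


# Hypothetical keyphrase-to-handler table.
COMMANDS = {"disconnect": disconnect, "hold": hold}


def handle_detection(conf, channel, keyphrase):
    """Invoke the application code for a keyphrase recognized on `channel`.

    Returns True if a command was performed, False if the keyphrase
    is not a known command.
    """
    handler = COMMANDS.get(keyphrase.strip().lower())
    if handler is None:
        return False
    handler(conf, channel)
    return True
```

In such a sketch the channel argument is supplied by whichever per-channel recognizer reported the keyphrase, so the command is applied to that channel regardless of where the sound actually originated.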
Telephone networks are typically terminated by analog or circuit switched trunks that produce audio reflections, also known as echoes. For example, when two or more channels CHN0, ..., CHNK are patched via a computer telephony (CT) bus, a small echo signal may be transmitted from each channel to all the other channels involved. On channels CHN1-CHNK, for example, echo from channel CHN0 may be present. As a result, channel-independent signal analysis algorithms, such as automatic speech recognition or tone detection algorithms, each of which analyzes real-time audio data from a single channel, often produce positive-detect results on several channels near-simultaneously. A telephone system may interpret an echo of a keyword uttered or DTMF input sent by a participant at a telephone connected to the channel CHN0 as a keyword uttered or DTMF signal sent by one or more other conference participants connected to the conference via the channels CHN1-CHNK. As a consequence, the system may invoke additional instances of the application code to perform the command corresponding to the echoes of the keyword or DTMF signal. The command may thus be performed multiple times and may affect conference participants who did not issue the command, disrupting the conference.
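The effect described above can be reproduced in a minimal simulation, offered only as a sketch under stated assumptions. A DTMF "5" key (770 Hz and 1336 Hz) is generated on one channel, an attenuated copy is placed on every other channel to stand in for the echo, and an independent tone detector is run on each channel. The Goertzel recurrence is used here as one common tone detection technique; the 8000 Hz sample rate, the 400-sample frame, the echo gain, and the detection threshold are illustrative assumptions, and the function names are hypothetical.

```python
import math

SAMPLE_RATE = 8000          # Hz; assumed narrowband telephony rate
DTMF_5 = (770.0, 1336.0)    # row and column frequencies of the "5" key
FRAME = 400                 # samples per analysis frame (50 ms), assumed


def tone(freqs, n_samples, amplitude):
    """Sum of sinusoids, each with the given amplitude."""
    return [
        amplitude * sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f in freqs)
        for i in range(n_samples)
    ]


def goertzel_power(samples, freq):
    """Squared spectral magnitude at `freq` via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detects_dtmf_5(samples, threshold):
    """Channel-independent detector: both key frequencies exceed threshold."""
    return all(goertzel_power(samples, f) > threshold for f in DTMF_5)


def detections(source_channel, n_channels, echo_gain, threshold=100.0):
    """Return the channels on which the key is detected.

    The source channel carries the key at full amplitude; every other
    channel carries only an echo scaled by `echo_gain` (an assumption
    standing in for the reflection introduced by the trunks).
    """
    key = tone(DTMF_5, FRAME, 1.0)
    detected = []
    for ch in range(n_channels):
        gain = 1.0 if ch == source_channel else echo_gain
        samples = [gain * x for x in key]
        if detects_dtmf_5(samples, threshold):
            detected.append(ch)
    return detected
```

With these assumed values, calling detections(0, 4, 0.1) reports the key on all four channels although only channel 0 sent it, whereas detections(0, 4, 0.0), with no echo, reports it on channel 0 alone. Because each detector sees only its own channel, none of them can tell the attenuated echo from a key actually pressed by its own participant.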
Mistaking echoes for actual commands is particularly troublesome when the command that produces the echo is a disconnect signal. Disconnect signals are used in some telephone systems that do not generate loop current drop when a caller hangs up. Instead, these systems typically generate a dial tone-type disconnect signal that is detected by a terminating device, such as an auto attendant system. When a person connected to a multi-party conference from such a system disconnects from the conference, an echo of the disconnect signal may be detected on a channel of another conferee. As a result, the other conferee may also be disconnected from the conference.
It would be desirable to avoid or reduce instances in which telephone conference echoes are processed as actual commands.