Generally, speech detection techniques are used to differentiate between background noise and speech. Speech can be divided into two main categories, referred to as voiced speech and unvoiced speech. Voiced speech includes vowels and other phonemes; these are harsh tones that generate a quasi-periodic signal with a relatively high magnitude when compared to unvoiced speech. Unvoiced speech includes "S" sounds and other soft components of speech; it generally contains high frequency, non-periodic signals with a lower magnitude than voiced speech. There are two basic types of algorithms used to distinguish between background noise and speech. The first type simply uses the magnitude of the signal to decide the type of data on the signal. The second type is more complicated: it filters the signal into several different frequency ranges and then compares the magnitudes of the signals in those ranges to decide whether speech is contained in the signal.
The first technique uses only the magnitude of the signal and compares it against a predetermined threshold created for the background noise. This technique is extremely simple and is widely used in applications including radiotelephones. After the background noise has been characterized, the magnitude of each subsequent section of the signal is compared to the background noise threshold. If the magnitude of the section exceeds the background noise threshold, then the signal is said to contain speech. The problem with this simple method is that it is not accurate when analyzing unvoiced speech. Unvoiced speech contains relatively low energy signals; consequently, some of the unvoiced speech is characterized as background noise.
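The magnitude-only technique can be illustrated with a minimal sketch. The function names, the use of mean absolute amplitude as the magnitude measure, and the `margin` factor of 2.0 are assumptions made for illustration; they are not taken from any particular radiotelephone implementation.

```python
def frame_magnitude(frame):
    """Mean absolute amplitude of one section (frame) of samples."""
    return sum(abs(s) for s in frame) / len(frame)


def noise_threshold(noise_frames, margin=2.0):
    """Characterize the background noise from frames known to hold no speech.

    The threshold is the average noise magnitude scaled by `margin`
    (a hypothetical safety factor, not a value from the source).
    """
    mags = [frame_magnitude(f) for f in noise_frames]
    return margin * sum(mags) / len(mags)


def contains_speech(frame, threshold):
    """Declare speech when the frame magnitude exceeds the noise threshold."""
    return frame_magnitude(frame) > threshold


def square_wave(amplitude, n=160):
    """Synthetic test frame alternating between +amplitude and -amplitude."""
    return [amplitude if t % 2 == 0 else -amplitude for t in range(n)]


# Synthetic illustration: noise at 0.01, voiced at 0.5, unvoiced at 0.015.
threshold = noise_threshold([square_wave(0.01) for _ in range(10)])
voiced_detected = contains_speech(square_wave(0.5), threshold)
unvoiced_detected = contains_speech(square_wave(0.015), threshold)
```

With these synthetic frames, the high magnitude voiced frame exceeds the threshold, while the low magnitude unvoiced frame falls below it and is treated as background noise, which is the inaccuracy described above.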
A second method of speech detection, as detailed in U.S. Pat. No. 4,811,404, is far more complex than the first method. Here, the incoming signal being analyzed is divided into several frequency ranges using bandpass filters or the like. The magnitude of the signal in each frequency range is then compared with the background noise characterization. This technique is far more accurate than the first method because it can differentiate between high frequency, low magnitude, unvoiced speech and lower frequency, low magnitude, background noise. Thus, this additional differentiation allows for a more accurate detection of speech in a signal containing background noise. However, the relatively large amount of hardware and software necessary to analyze these signals limits its application.
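The per-band comparison can be sketched as follows. This is not the method of U.S. Pat. No. 4,811,404; it is an illustrative stand-in that replaces the bandpass filters with a naive discrete Fourier transform grouped into equal-width bands. The four-band split, the `margin` factor, and the absolute `floor` added to each band threshold are all assumptions.

```python
import cmath
import math


def band_magnitudes(frame, num_bands=4):
    """Split a frame into equal-width frequency bands via a naive DFT.

    Returns one summed spectral magnitude per band (bins 0 .. n/2 - 1).
    """
    n = len(frame)
    half = n // 2
    spectrum = []
    for k in range(half):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) / n)
    width = half // num_bands
    return [sum(spectrum[b * width:(b + 1) * width]) for b in range(num_bands)]


def band_thresholds(noise_frames, margin=2.0, floor=1e-3):
    """Characterize background noise separately in each band.

    `margin` and `floor` are hypothetical tuning values; the floor keeps
    bands with almost no noise from getting a near-zero threshold.
    """
    per_frame = [band_magnitudes(f) for f in noise_frames]
    count = len(per_frame)
    return [margin * sum(m[b] for m in per_frame) / count + floor
            for b in range(len(per_frame[0]))]


def contains_speech(frame, thresholds):
    """Declare speech if any band exceeds its own noise threshold."""
    return any(m > t for m, t in zip(band_magnitudes(frame), thresholds))


def tone(bin_index, amplitude, n=64):
    """Synthetic sine wave centered exactly on one DFT bin."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / n)
            for t in range(n)]


# Low frequency noise (bin 2) plus a weaker high frequency component (bin 28).
noise = tone(2, 0.02)
unvoiced = [a + b for a, b in zip(noise, tone(28, 0.015))]
thresholds = band_thresholds([noise] * 5)
noise_detected = contains_speech(noise, thresholds)
unvoiced_detected = contains_speech(unvoiced, thresholds)
```

In this synthetic case the unvoiced component is weaker than the noise overall, yet it is detected because it falls in a band where the noise characterization is close to zero. The nested loop of the naive transform also hints at the computational cost that limits the application of this approach.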
In radiotelephones today, there is tremendous pressure to increase battery life and reduce the size and weight of the radiotelephone. One method of reducing the power consumption of the radiotelephone is to turn off the transmitter when there are pauses in the speech. However, these power savings must not be consumed by the technique used to shut down the transmitter during pauses in the speech. The technique must also be able to turn the transmitter back on before any of the signal containing speech is lost. Therefore, a need exists for an accurate speech detection method which is computationally simple, can be completed in real time, has small physical size and does not consume a large amount of power.