A communication system can determine when communication parties start to talk and when they stop talking by using Voice Activity Detection (VAD) technology. When the communication parties stop talking, the communication system can refrain from transmitting signals, thus saving channel bandwidth. Existing VAD technology is not limited to detecting the voice of the communication parties, and may also detect signals such as a Ring Back Tone (RBT).
A VAD method generally includes: extracting classification parameters from the signals to be detected; and inputting the extracted classification parameters into a binary judgment criterion, which outputs a judgment result indicating either that the input signals are foreground signals or that the input signals are background noise.
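The two steps described above can be illustrated with a minimal sketch. It is not taken from any particular VAD method: the function names, the choice of frame log energy as the classification parameter, and the -40 dB threshold are all assumptions made only for illustration.

```python
import math


def extract_parameters(frame):
    """Step 1: extract a classification parameter from one frame of samples.

    Frame log energy (in dB) is used here purely as an illustrative choice.
    """
    energy = sum(s * s for s in frame) / len(frame)
    # The small offset avoids log10(0) for an all-zero frame.
    return {"log_energy": 10.0 * math.log10(energy + 1e-12)}


def binary_judgment(params, threshold_db=-40.0):
    """Step 2: binary judgment criterion.

    Returns True for a foreground signal and False for background noise.
    The threshold value is a hypothetical constant.
    """
    return params["log_energy"] > threshold_db


def vad(frame):
    """Run both steps on one frame and return the judgment result."""
    return binary_judgment(extract_parameters(frame))


# Usage: an 80-sample frame of a 440 Hz tone sampled at 8 kHz.
loud = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
quiet = [0.001 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
```

In this sketch, `vad(loud)` reports a foreground signal and `vad(quiet)` reports background noise, because only the louder frame exceeds the assumed energy threshold.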
Existing VAD methods are generally based on a single classification parameter. A VAD method based on four classification parameters also exists at present. The four classification parameters involved in this method are Spectral Distortion (DS), full-band Energy Distance (DEf), low-band Energy Distance (DEl), and Differential Zero-Crossing rate (DZC), and the judgment criterion of this method involves 14 judgment conditions.
In the implementation of the present invention, the inventor finds that the prior art has at least the following problems:
False judgments easily occur when a VAD method based on a single classification parameter is used. In the VAD method based on four classification parameters, the coefficients in the 14 judgment conditions are all constants, so the judgment criterion cannot adapt to the input signal, which results in undesirable performance of the method.
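The lack of adaptivity in a constant-coefficient criterion can be sketched as follows. This is only an illustration under stated assumptions: the two conditions, their linear form, and every coefficient value are hypothetical and are not the 14 conditions of the four-parameter method; the parameter names `d_s`, `d_ef`, and `d_zc` are placeholders for DS, DEf, and DZC.

```python
# Each hypothetical condition has the form: params[x] > a * params[y] + b.
# The coefficients a and b are fixed constants chosen for illustration only.
FIXED_CONDITIONS = [
    ("d_s", "d_zc", 0.5, 1.0),
    ("d_ef", "d_zc", 2.0, 3.0),
]


def is_foreground(params, conditions=FIXED_CONDITIONS):
    """Return True if any fixed-coefficient condition is satisfied.

    The same constants are applied to every frame, so the decision
    boundaries never move in response to the input signal.
    """
    return any(params[x] > a * params[y] + b for x, y, a, b in conditions)


# Usage with made-up parameter values.
frame_a = {"d_s": 2.0, "d_ef": 0.0, "d_zc": 0.0}
frame_b = {"d_s": 0.5, "d_ef": 0.0, "d_zc": 0.0}
```

Here `is_foreground(frame_a)` is true and `is_foreground(frame_b)` is false. Because `FIXED_CONDITIONS` never changes, a frame with the values of `frame_b` is always judged as background noise regardless of the characteristics of the surrounding signal.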