A communication system can use Voice Activity Detection (VAD) technology to determine when the communication parties start talking and when they stop. While the communication parties are not talking, the communication system may refrain from transmitting signals, thus saving channel bandwidth. Existing VAD technology is not limited to detecting the voice of the communication parties, and may also detect other signals, such as a Ring Back Tone (RBT).
A VAD method generally includes: extracting classification parameters from the signals to be detected; and inputting the extracted classification parameters into a binary determination criterion. The binary determination criterion outputs a determination result, which indicates either that the input signals are foreground signals or that the input signals are background noise.
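The two steps described above can be illustrated with a minimal sketch. It assumes a single classification parameter (frame energy in dB) and a fixed threshold as the binary determination criterion; the function names, the frame length of 80 samples, and the threshold of -40 dB are hypothetical choices made for illustration and are not taken from any particular VAD method.

```python
import math


def frame_energy_db(frame):
    """Classification parameter (assumed): mean-square energy of one frame, in dB."""
    mean_square = sum(x * x for x in frame) / len(frame)
    # The small offset avoids log10(0) for an all-zero frame.
    return 10.0 * math.log10(mean_square + 1e-12)


def is_foreground(frame, threshold_db=-40.0):
    """Binary determination criterion (assumed): True means foreground signal,
    False means background noise. The threshold value is hypothetical."""
    return frame_energy_db(frame) > threshold_db


def classify(signal, frame_len=80):
    """Split the signal into non-overlapping frames and classify each one."""
    return [
        is_foreground(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]
```

For example, passing a low-level noise segment followed by a tone segment to `classify` yields one Boolean per frame, which a communication system could use to decide whether to transmit that frame.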
Most existing VAD methods are based on a single classification parameter. A VAD method based on four classification parameters also exists at present. The four classification parameters involved in this method are Spectral Distortion (DS), full-band Energy Distance (DEf), low-band Energy Distance (DEl), and Differential Zero-Crossing rate (DZC), and the determination criterion of this method involves 14 determination conditions.
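As a rough sketch of how a criterion built on four classification parameters might be organized, the following code computes DS, DEf, DEl, and DZC as differences between the current frame and a running background estimate, and then tests them against a list of conditions with constant coefficients. The parameter definitions, the `FrameFeatures` type, and all three conditions and their coefficients are assumptions made for illustration; they are not the 14 determination conditions of the method mentioned above.

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    spectrum: list          # spectral coefficients of the frame (assumed representation)
    full_energy: float      # full-band energy, in dB
    low_energy: float       # low-band energy, in dB
    zero_crossings: float   # zero-crossing rate


def difference_parameters(frame, background):
    """Return (DS, DEf, DEl, DZC) relative to a background estimate.
    These definitions are one plausible reading, not a normative one."""
    ds = sum((f - b) ** 2 for f, b in zip(frame.spectrum, background.spectrum))
    de_f = background.full_energy - frame.full_energy
    de_l = background.low_energy - frame.low_energy
    dzc = background.zero_crossings - frame.zero_crossings
    return ds, de_f, de_l, dzc


# Hypothetical conditions; every coefficient is a constant, so the criterion
# cannot adjust itself to the input signal.
FIXED_CONDITIONS = [
    lambda ds, de_f, de_l, dzc: ds > 0.5 * dzc + 0.1,
    lambda ds, de_f, de_l, dzc: de_f < -2.0 * dzc - 6.0,
    lambda ds, de_f, de_l, dzc: de_l < -8.0,
]


def is_foreground_multi(frame, background):
    """Declare a foreground signal if any fixed condition is satisfied."""
    params = difference_parameters(frame, background)
    return any(condition(*params) for condition in FIXED_CONDITIONS)
```

In this sketch, each condition is a fixed boundary in the four-dimensional parameter space, so the same boundaries are applied regardless of the noise level or character of the input.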
In the implementation of the present disclosure, the inventor finds that the prior art has at least the following problems:
False determinations occur easily when a VAD method based on a single classification parameter is used. In the VAD method based on four classification parameters, the coefficients in the 14 determination conditions are all constants, so the determination criterion cannot adaptively adjust to the input signal, which degrades the performance of the method.