During transmission of an audio signal, noise may be introduced for various reasons. When severe noise occurs in an audio signal, the user's normal use is affected. Therefore, noise in an audio signal needs to be detected in time, so that noise affecting normal use can be eliminated.
In an existing noise detection method, a time-domain signal of an audio signal is analyzed, with the analysis focusing on parameters related to time-domain energy variations of the audio signal. However, the time-domain energy variations of some noise signals are normal, which makes these noise signals difficult to detect using the existing noise detection method.
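For illustration only, a detector driven by time-domain energy variation can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm of any particular existing method: the frame length, the 12 dB threshold, the rise-only rule, and the names `frame_energies`, `flag_energy_jumps`, and `tone` are all hypothetical.

```python
import math

def frame_energies(samples, frame_len=160):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def flag_energy_jumps(samples, frame_len=160, jump_db=12.0, floor=1e-10):
    """Return indices of frames whose energy rises by more than jump_db
    relative to the previous frame (assumed, illustrative rule)."""
    energies = frame_energies(samples, frame_len)
    flagged = []
    for i in range(1, len(energies)):
        prev_db = 10.0 * math.log10(max(energies[i - 1], floor))
        curr_db = 10.0 * math.log10(max(energies[i], floor))
        if curr_db - prev_db > jump_db:
            flagged.append(i)
    return flagged

def tone(n, amplitude, period=40):
    """Synthetic test signal: n samples of a sine wave (hypothetical helper)."""
    return [amplitude * math.sin(2.0 * math.pi * k / period) for k in range(n)]

# A signal with a stable energy envelope, and one containing a short loud burst.
steady = tone(1600, 0.1)
burst = tone(640, 0.1) + tone(160, 0.9) + tone(800, 0.1)

steady_flags = flag_energy_jumps(steady)
burst_flags = flag_energy_jumps(burst)
```

In this sketch the frame containing the burst is flagged, while the signal with a stable energy envelope produces no flags regardless of what it contains. A rule of this kind therefore cannot separate noise with normal energy variations from normal speech.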
FIG. 1 is a time-domain waveform graph of a speech signal, where the horizontal axis represents sample points and the vertical axis represents a normalized amplitude. In the speech signal shown in FIG. 1, speech-grade noise is on the left side of a dashed line 11, a first section of normal speech is between the dashed line 11 and a dashed line 12, a metallic sound is between the dashed line 12 and a dashed line 13, a second section of normal speech is between the dashed line 13 and a dashed line 14, and background noise is on the right side of the dashed line 14. The speech-grade noise is a special type of noise, and a normal speech signal may be indistinguishable or may sound unnatural when speech-grade noise occurs. The metallic sound is noise that sounds metallic and is relatively high-pitched. The speech-grade noise, the metallic sound, and the background noise are all noise signals. However, it can be learned from FIG. 1 that only the metallic sound has a relatively large amplitude variation, and the waveforms of the speech-grade noise and the background noise are relatively similar to the waveform of a normal speech signal. Therefore, based on the time-domain waveform of a speech signal alone, it is difficult to distinguish noise whose waveform is similar to that of a normal speech signal from the normal speech signal itself.
It can be seen that the existing noise detection method is applicable only to detection of a signal that has a short duration, a relatively large energy variation, and a sudden variation, and that the method has low accuracy in detecting noise whose time-domain signal characteristics are similar to those of a normal speech signal.