The present disclosure relates generally to technologies for suppressing residual noise in preprocessed audio signals. More specifically, for a preprocessed audio signal that includes portions of speech, the disclosed technologies suppress residual noise in the portions of the preprocessed audio signal that lie between the portions of speech, without distorting the portions of speech.
A microphone of an audio receiver, e.g., of a mobile device, can receive (i) a speech signal (or simply speech) that arrives at the audio receiver along a “speech direction”, from which a user of the mobile device is expected to speak, and (ii) ambient noise that arrives along other directions that are, in large part, different from the speech direction. Typically, the speech includes utterances separated by silence. As such, the microphone provides to the audio receiver an audio signal that includes portions of noisy speech (corresponding to a combination of the utterances and ambient noise) separated by portions of ambient noise (corresponding only to the ambient noise that “fills” the silence between the utterances). The audio receiver can use conventional technologies to suppress the ambient noise in the audio signal without distorting the speech, thus forming a “speech beam” that appears to have been received at the audio receiver along the speech direction. The speech beam, referred to here as a preprocessed audio signal, includes portions of speech (corresponding to a combination of the utterances and suppressed ambient noise) separated by portions of residual noise (corresponding only to the suppressed ambient noise). Although the speech included in the input audio signal can be reproduced in the portions of speech of the preprocessed audio signal with only minor distortion, such that the distortion is hardly noticeable when a user listens to the preprocessed audio signal, the portions of residual noise of the preprocessed audio signal may still sound too loud to the user.