In voice communication, speech signals are typically processed in units of frames. Each frame of the speech signal is generally 10 milliseconds (ms) to 30 ms long. For each frame, the basic processing flow is as follows:
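As a rough illustration of frame-based processing, the sketch below splits a sample sequence into fixed-length frames. The 8 kHz sampling rate, the 20 ms frame length, and the function names are assumptions chosen for the example; they are not taken from any particular CODEC.

```python
def samples_per_frame(sample_rate_hz: int, frame_ms: int) -> int:
    """Number of samples in one frame of the given duration."""
    return sample_rate_hz * frame_ms // 1000


def split_into_frames(samples, sample_rate_hz=8000, frame_ms=20):
    """Split a sample sequence into consecutive, non-overlapping frames.

    A trailing partial frame is dropped for simplicity (an assumption of
    this sketch; a real system would buffer or pad it).
    """
    n = samples_per_frame(sample_rate_hz, frame_ms)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


one_second = [0] * 8000               # one second of silence at 8 kHz
frames = split_into_frames(one_second)
print(len(frames), len(frames[0]))    # 50 160
```

Under these assumptions, one second of speech yields 50 frames of 160 samples each.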
At a transmitter, each frame of speech signals is encoded by a speech encoder, and the encoded bits are packaged into a speech data frame; the speech data frame is transmitted via a communication channel from the transmitter to a receiver; at the receiver, the received speech data frame is decoded by a speech decoder, and the speech signal is recovered.
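The transmit, channel, and receive steps described above can be sketched with a toy model. The "encoder" here only packs raw 16-bit samples behind a sequence number; it stands in for a real speech encoder, which would compress the signal. The names `encode_frame`, `decode_frame`, and `channel` are hypothetical, and the lossy channel is simulated by replacing chosen packets with `None`.

```python
import struct


def encode_frame(seq, samples):
    """Toy 'encoder': a 16-bit sequence number followed by raw 16-bit PCM."""
    return struct.pack(f"<H{len(samples)}h", seq, *samples)


def decode_frame(data):
    """Inverse of encode_frame: recover the sequence number and samples."""
    n = (len(data) - 2) // 2          # 2-byte header, 2 bytes per sample
    seq, *samples = struct.unpack(f"<H{n}h", data)
    return seq, samples


def channel(packets, lost):
    """Deliver packets in order, replacing the lost indices with None."""
    return [None if i in lost else p for i, p in enumerate(packets)]


frames = [[i] * 4 for i in range(3)]                  # three tiny frames
sent = [encode_frame(i, f) for i, f in enumerate(frames)]
received = channel(sent, lost={1})                    # frame 1 is lost
decoded = [decode_frame(p) if p is not None else None for p in received]
print(decoded)
```

The middle entry of `decoded` is `None`: the decoder has nothing to recover for that frame, which is the situation erasure concealment has to handle.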
For a speech decoder, recovery of the speech signal depends on accurate reception of the speech data frame transmitted from the transmitter, and accurate reception of the speech data frame in turn depends on the communication channel. If communication channel resources are insufficient, speech data frames may be lost or corrupted. Currently, the impact of such frame loss or frame errors on communication quality can be effectively mitigated by Frame Erasure Concealment (FEC) technology, which is widely used in speech coder-decoders (CODECs).
The FEC technologies adopted by different speech CODECs may differ, but they generally include an operation that attenuates the amplitude of the recovered speech signal.
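A minimal sketch of amplitude attenuation during concealment follows. It assumes the simplest possible scheme: the last correctly received frame is repeated for each lost frame, and the gain is multiplied by a fixed decay factor each time. Both the repetition strategy and the factor of 0.5 are illustrative assumptions; actual CODECs define their own recovery and attenuation rules.

```python
def conceal_lost_frames(last_good, num_lost, decay=0.5):
    """Generate substitute frames for a run of lost frames.

    Each substitute is the last good frame scaled by a gain that shrinks
    by `decay` for every consecutive lost frame (hypothetical scheme).
    """
    concealed = []
    gain = 1.0
    for _ in range(num_lost):
        gain *= decay
        concealed.append([s * gain for s in last_good])
    return concealed


print(conceal_lost_frames([1000, -1000], 3))
# [[500.0, -500.0], [250.0, -250.0], [125.0, -125.0]]
```

The longer the run of lost frames, the quieter the substituted signal becomes, so a long erasure fades toward silence rather than repeating the same frame at full level.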
The speech CODEC applies FEC processing to the speech data frame (corresponding to the erasure concealment frame). However, not all speech signals are vocal signals produced purely by the human voice; a speech signal may also include background noise in intervals when the speaker is inactive (relative to the vocal signal, the background noise signal is a non-speech signal). Because of this background noise signal (corresponding to the background noise frame produced by the speech encoder), an energy jump may occur in the recovered signal after erasure concealment, which may be uncomfortable for the listener. The discomfort caused by this kind of energy jump is especially severe when a background noise frame is lost.
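One way to picture such an energy jump is to compare the energy of a concealed frame with that of the normally decoded frame that follows it. The sketch below assumes steady background noise, a concealed frame attenuated to 25% of its amplitude, and a mean-square energy measure in decibels; all three are illustrative assumptions, and the function names are hypothetical.

```python
import math


def frame_energy_db(frame, floor=1e-12):
    """Mean-square energy of a frame in dB (floored to avoid log of zero)."""
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(mean_square, floor))


def energy_jump_db(prev_frame, next_frame):
    """Energy change, in dB, from one frame to the next."""
    return frame_energy_db(next_frame) - frame_energy_db(prev_frame)


noise = [100, -100] * 80                  # steady background noise frame
concealed = [s * 0.25 for s in noise]     # same frame after attenuation
jump = energy_jump_db(concealed, noise)
print(round(jump, 2))                     # 12.04
```

In this example the background level rises by about 12 dB from the concealed frame to the next received frame, even though the actual noise never changed. During speech such a step may be masked, but against steady background noise it is the kind of discontinuity a listener can notice.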