The Internet Protocol (IP) network was originally designed to transmit data streams made up of large packets. Today, voice data is also transmitted over IP networks. Voice transmission requires small voice packets to be delivered reliably and in real time. When a voice packet is discarded during transmission, there is no time to retransmit it. Likewise, a voice packet that takes a long route and fails to arrive at the destination address by the time it needs to be played is of no use. Therefore, in a Voice over Internet Protocol (VoIP) system, a voice packet is regarded as lost if it arrives at the destination address too late or does not arrive at all.
Packet loss is the main cause of degraded service quality when voice data is transmitted over a network. Packet loss concealment (PLC) technology compensates for a lost packet with a synthetic packet, reducing the impact of packet loss on voice quality. Without efficient voice PLC technology, an IP network cannot provide toll-quality communication, even if the network is designed and managed to the highest standard. With a well-designed solution to the packet loss problem, the quality of voice transmission can be greatly improved. The existing technology therefore uses various mechanisms to reduce the impact of packet loss. For example, pitch waveform substitution serves as a basic PLC method.
Pitch waveform substitution is a processing technology implemented at the receiving end. It compensates for a lost data frame on the basis of the characteristics of the voice signal. The principle, implementation process, and disadvantages of pitch waveform substitution are described below.
In a voice signal, the unvoiced (surd) waveform is disordered, whereas the voiced (sonant) waveform is periodic. The principle of pitch waveform substitution is as follows. First, the information in the frame before the lost frame, that is, the signal immediately preceding the gap (notch) in the waveform, is used to estimate the pitch period (P) of the signal waveform before the gap. Then, a waveform of length P taken from immediately before the gap is used to fill the gap.
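The substitution step described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular codec: the function name `fill_gap_with_pitch`, the use of plain Python lists of samples, and the assumption that P is already known (in samples) are all hypothetical choices made for the example.

```python
def fill_gap_with_pitch(history, pitch_period, gap_length):
    """Fill a gap by cyclically repeating the last pitch period of the history.

    history      -- samples received before the gap (most recent sample last)
    pitch_period -- assumed pitch period P, in samples
    gap_length   -- number of missing samples to synthesize
    """
    if pitch_period <= 0 or pitch_period > len(history):
        raise ValueError("pitch_period must be between 1 and len(history)")
    # The waveform of length P that immediately precedes the gap.
    last_cycle = history[-pitch_period:]
    # Repeat that cycle as many times as needed to cover the gap.
    return [last_cycle[i % pitch_period] for i in range(gap_length)]


if __name__ == "__main__":
    # A toy "voiced" signal with a period of 4 samples.
    received = [0, 1, 2, 3, 0, 1, 2, 3]
    print(fill_gap_with_pitch(received, pitch_period=4, gap_length=6))
```

When the gap is longer than P, the sketch simply repeats the same cycle, so the synthesized segment is exactly periodic even if the real signal was drifting.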
In the existing technology, autocorrelation analysis is generally used to obtain the pitch period (P) for pitch waveform substitution. Autocorrelation analysis is a common method of analyzing the time-domain waveform of a voice signal and is defined in terms of a correlation function. The correlation function measures the similarity between signals in the time domain. When the two signals being compared are different, the value of the correlation function approaches zero; when their waveforms are the same, a peak appears at the lead or lag at which the waveforms align. The autocorrelation function is therefore used to study properties of a signal itself, such as the synchronism and periodicity of its waveform.
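As a rough illustration of autocorrelation-based pitch estimation, the sketch below computes the short-time autocorrelation R(k) = Σ x(n)·x(n+k) over a window and picks the lag with the largest value. The function names, the 8 kHz sampling rate, and the search range of 20 to 160 samples (2.5 ms to 20 ms at that rate) are assumptions made for the example; only the 2.5 ms minimum pitch period and the 22.5 ms window appear in the text.

```python
import math


def autocorrelation(signal, lag):
    """Short-time autocorrelation R(lag) = sum of x(n) * x(n + lag) over the window."""
    return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))


def estimate_pitch_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] that maximizes the autocorrelation."""
    if not 0 < min_lag <= max_lag < len(signal):
        raise ValueError("need 0 < min_lag <= max_lag < len(signal)")
    best_lag = min_lag
    best_value = autocorrelation(signal, min_lag)
    for lag in range(min_lag + 1, max_lag + 1):
        value = autocorrelation(signal, lag)
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


if __name__ == "__main__":
    # Assumed setup: 8 kHz sampling, so 180 samples span 22.5 ms.
    # The test tone repeats every 40 samples (5 ms).
    window = [math.sin(2 * math.pi * n / 40) for n in range(180)]
    print(estimate_pitch_period(window, min_lag=20, max_lag=160))
```

Because this unnormalized form sums fewer terms at longer lags, it favors the shortest matching period on a clean periodic signal. On real speech the peaks at multiples of the true period can be nearly as high, which is one source of the estimation errors discussed next.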
However, existing methods for compensating a lost frame with a pitch waveform have the following disadvantages:
1) The pitch period (P) of a voiced (sonant) signal estimated by autocorrelation analysis is not accurate. The method takes the pitch period corresponding to the extreme value of the autocorrelation function as the final pitch period, and the frequency of this estimate may be 1/N (N is an integer greater than 1) of the frequency corresponding to the actual pitch period. In addition, the goal of pitch estimation is to obtain the pitch period of the data closest to the lost frame. However, the autocorrelation method must use at least 22.5 ms of signal before the gap (notch) (the corresponding pitch period is the minimum pitch period, that is, 2.5 ms). These factors introduce an error into the calculated pitch period. When pitch data containing this error is used to fill in the lost frame, the phase changes abruptly at the junction.
2) In the existing technology, only the data before the lost frame, that is, the history data, is used to fill in the lost frame. The pitch period of an audio signal changes gradually, so the farther the data is from the lost frame, the weaker its correlation with the lost frame. When only the data before the lost frame is used for compensation, the phase may be discontinuous at the junction between the lost frame and the frame that follows it.
3) When a frame is lost while the voice is changing gradually, the amplitude is discontinuous if only the data of the pitch period preceding the lost frame is used for recovery.
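The junction problem named in the disadvantages above can be made concrete with a small experiment. The sketch below is an illustration under stated assumptions, not a measurement method from the source: the function names, the clean sinusoidal test signal, and the mismatch measure (the difference between the sample the repeated waveform would produce next and the first sample actually received after the gap) are all hypothetical.

```python
import math


def periodic_extension(history, pitch_period, count):
    """Repeat the last pitch_period samples of history to produce count samples."""
    last_cycle = history[-pitch_period:]
    return [last_cycle[i % pitch_period] for i in range(count)]


def junction_mismatch(signal, gap_start, gap_length, pitch_period):
    """Difference between where the substituted waveform is heading and
    the first real sample after the gap.

    A value near zero means the fill joins the next frame smoothly;
    a large value indicates a jump at the junction.
    """
    history = signal[:gap_start]
    # One extra sample: the value the fill would take just after the gap.
    extension = periodic_extension(history, pitch_period, gap_length + 1)
    predicted_next = extension[gap_length]
    actual_next = signal[gap_start + gap_length]
    return abs(predicted_next - actual_next)


if __name__ == "__main__":
    # Test tone with a true period of 40 samples; samples 200..279 are "lost".
    tone = [math.sin(2 * math.pi * n / 40) for n in range(400)]
    print("correct period:", junction_mismatch(tone, 200, 80, 40))
    print("wrong period:  ", junction_mismatch(tone, 200, 80, 37))
```

With the correct period the fill lines up with the following frame, while an error of only a few samples in P accumulates over the gap and leaves a visible jump. A signal whose pitch or amplitude drifts during the gap would show a mismatch even with a correct estimate, since the fill uses history data only.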