ITU-T Recommendation G.711 describes pulse code modulation (PCM) of voice (i.e., speech) sampled at 8000 Hz. To handle the packet loss inherent in the design of a voice-over-IP network, ITU-T adopted G.711 Appendix I (also known as “G.711 PLC”), which standardizes a high-quality, low-complexity algorithm for packet loss concealment with G.711. The G.711 PLC algorithm can be summarized as follows:
(a) During good frames (i.e., those properly received), a copy of the decoded output is saved in a circular buffer (known as a “pitch buffer”) and the output is delayed by 3.75 ms (i.e., 30 samples) before being sent to a playout buffer. Each frame is assumed to be 10 ms (i.e., 80 samples).
(b) If a frame is lost, the pitch period of the speech in the previous good frame is estimated from the normalized cross-correlation of the most recent 20 ms of speech in the pitch buffer. The pitch search range extends from 220 Hz down to 66 Hz.
(c) For the first 10 ms of erasure, the pitch period is repeated using a triangular overlap-add window at the boundary between the previously received material and the generated replacement material. For the next 10 ms of erasure, the last two pitch periods in the pitch buffer are alternately repeated, and at 20 ms of erasure, a third pitch period is added. This portion of the algorithm serves two purposes: to minimize distortions at packet boundaries, which produce clicking noises, and to disrupt the correlation between frames, which would otherwise produce an echo-like or robotic sound.
(d) For long erasures, the amplitude is attenuated at the rate of 20% per 10 ms. After 60 ms, the synthesized signal is zero (which may optionally be later replaced by a comfort noise as specified by ITU-T G.711 Appendix II).
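Steps (b) and (d) can be illustrated with a minimal Python sketch. It is not the reference implementation of G.711 Appendix I. The function names, the lag bounds derived from the 220 Hz and 66 Hz limits, and the shape of the gain schedule (no attenuation during the first 10 ms, then a linear ramp of 20% per 10 ms that reaches zero at 60 ms) are assumptions chosen to agree with the figures quoted above.

```python
import math

SAMPLE_RATE_HZ = 8000
WINDOW = 160                            # 20 ms of speech at 8000 Hz
MIN_LAG = round(SAMPLE_RATE_HZ / 220)   # assumed shortest pitch period, in samples
MAX_LAG = round(SAMPLE_RATE_HZ / 66)    # assumed longest pitch period, in samples


def estimate_pitch_lag(history, min_lag=MIN_LAG, max_lag=MAX_LAG, window=WINDOW):
    """Return the lag (in samples) that maximizes the normalized
    cross-correlation between the newest `window` samples and the
    same-length segment located `lag` samples earlier."""
    if len(history) < window + max_lag:
        raise ValueError("history is too short for the requested search range")
    recent = history[-window:]
    recent_energy = sum(x * x for x in recent)
    best_lag, best_score = min_lag, -math.inf
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - window - lag
        delayed = history[start:start + window]
        cross = sum(a * b for a, b in zip(recent, delayed))
        energy = recent_energy * sum(b * b for b in delayed)
        score = cross / math.sqrt(energy) if energy > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def attenuation_gain(erasure_ms):
    """Assumed gain schedule: unity for the first 10 ms of erasure,
    then 20% less per additional 10 ms, clamped at zero."""
    if erasure_ms <= 10:
        return 1.0
    return max(0.0, 1.0 - 0.2 * (erasure_ms - 10) / 10)
```

For example, a pure tone with a period of 80 samples (100 Hz) stored in `history` should yield a lag of 80, and `attenuation_gain` falls to zero once the erasure reaches 60 ms.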
The algorithmic complexity of G.711 PLC is approximately 0.5 DSP (digital signal processor) MIPS (million instructions per second) per channel, i.e., 500,000 instructions per second per channel. Although G.711 PLC is considered a “low complexity” approach to the packet loss concealment problem, its complexity may nonetheless be prohibitive in terminals where very few MIPS are available, and expensive in larger switches that must, for example, dedicate a 100 MHz DSP chip to every 200 channels of capacity for concealment alone.
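The capacity figure follows from simple arithmetic, sketched below. The helper name is hypothetical, and the sketch assumes that a 100 MHz DSP delivers roughly 100 MIPS (one instruction per clock cycle), which the text implies but does not state.

```python
def channels_per_dsp(dsp_mips, mips_per_channel):
    """Number of whole channels one DSP can serve for concealment alone."""
    return int(dsp_mips // mips_per_channel)


# Assumption: 100 MHz clock, one instruction per cycle, so about 100 MIPS.
DSP_MIPS = 100
PLC_MIPS_PER_CHANNEL = 0.5

capacity = channels_per_dsp(DSP_MIPS, PLC_MIPS_PER_CHANNEL)
instructions_per_second = int(PLC_MIPS_PER_CHANNEL * 1_000_000)
```

Under these assumptions `capacity` is 200 channels per chip and `instructions_per_second` is 500,000 per channel, matching the figures given above.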
By contrast, an alternative “packet repetition” approach (familiar to those skilled in the art), in which previously received packets are simply repeated to fill the gap left by lost packets, is far less complex, requiring only several hundred instructions per second (i.e., <0.001 MIPS). However, the resultant voice quality of the “packet repetition” approach is generally not equal to that of G.711 PLC.
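The packet repetition approach is simple enough to sketch in a few lines. The following is an illustrative sketch only: the function name, the use of `None` to mark a lost frame, and the choice to emit silence when no frame has yet been received are assumptions, not details taken from any standard.

```python
def conceal_by_repetition(frames, frame_len=80):
    """Replace each lost frame (marked as None) with a copy of the most
    recently received frame. Frames are lists of decoded samples."""
    output = []
    last_good = [0] * frame_len   # assumption: silence before any good frame
    for frame in frames:
        if frame is not None:
            last_good = list(frame)
        output.append(list(last_good))
    return output
```

Because the only work per lost frame is a copy, the cost stays negligible, but nothing is done to smooth frame boundaries or to break up the repetition, which is consistent with the lower voice quality noted above.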