1. Field of the Invention
The present invention relates to a speech restoration system and method for concealing packet losses, and more particularly, to a speech restoration system and method for concealing packet losses when decoding a signal coded by a conventional speech coder.
2. Description of the Related Art
Conventional speech receiving apparatuses use the relationship between a received packet and an adjacent voice signal to conceal packet losses. In general, when packet losses occur, standard speech coders use either an extrapolation-based algorithm, which extrapolates the coding parameters of the last-received valid frame before a lost frame, or a repetition-based algorithm, which reuses the last-received valid frame before a lost frame. However, a lost packet not only lowers the voice quality in the section that includes the lost packet but also corrupts the data held in a long-period prediction memory. As a result, the error caused by the lost packet may propagate to the next frame. Therefore, even if a speech receiving apparatus receives valid packets after the packet losses, the apparatus uses the damaged data stored in the long-period prediction memory during decoding, which degrades the voice quality. Accordingly, the conventional algorithms adopted by conventional speech decoders suffer from reduced voice quality and from the propagation of errors to the next frame.
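The propagation effect described above can be illustrated with a toy model. The following is a minimal sketch, not a real speech decoder: it assumes a one-tap long-term predictor with a hypothetical fixed gain and lag, and conceals a lost frame by repeating the previous frame's excitation. All names (`ltp_decode`, `decode_stream`, `FRAME`, `LAG`) and all numeric values are assumptions made for illustration only.

```python
FRAME = 80   # samples per frame (assumed for illustration)
LAG = 40     # long-term prediction lag in samples (assumed)


def ltp_decode(excitation, memory, gain=0.8, lag=LAG):
    """Toy long-term predictor: out[n] = excitation[n] + gain * out[n - lag].

    `memory` holds the most recent past output samples (at least `lag`).
    Returns the decoded frame and the updated memory.
    """
    history = list(memory)
    out = []
    for e in excitation:
        sample = e + gain * history[-lag]
        history.append(sample)
        out.append(sample)
    return out, history[-len(memory):]


def decode_stream(frames, lost=()):
    """Decode frames in order; frames whose index is in `lost` are concealed
    by repeating the previous frame's excitation."""
    memory = [0.0] * LAG
    prev = [0.0] * FRAME
    decoded = []
    for i, frame in enumerate(frames):
        if i in lost:
            frame = prev  # repetition-based concealment
        out, memory = ltp_decode(frame, memory)
        decoded.append(out)
        prev = frame
    return decoded


# Three frames of synthetic excitation; frame 1 is lost in the second run.
frames = [[float((i * 7 + n) % 5 - 2) for n in range(FRAME)] for i in range(3)]
clean = decode_stream(frames)
lossy = decode_stream(frames, lost={1})

# Frame 2 is received intact in both runs, yet its output differs because
# the prediction memory was filled with concealed data from frame 1.
err = max(abs(a - b) for a, b in zip(clean[2], lossy[2]))
print("max error in frame 2:", err)
```

In this sketch the frame after the loss is decoded from correctly received data, yet its output still deviates from the loss-free result, because the predictor reads from memory that was written during the concealed frame.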
The ITU-T G.729 and G.723.1 speech coders are both commonly used in Voice over Internet Protocol (VoIP) applications. G.729 compresses and decompresses input voice at a rate of 8 kbit/s and provides toll-quality speech. More specifically, G.729 quantizes spectrum information and excitation signal information using a Code Excited Linear Prediction (CELP) algorithm, which is based on a linear prediction (LP) speech production model. When lost packets are detected, the packet loss concealing algorithm used in G.729 estimates the speech coding parameters of a lost frame from the excitation signal and the spectrum information of the last-received valid frame. During this estimation, the energy of the excitation signal corresponding to the lost frame is gradually decreased to minimize the effects of the packet loss.
If an nth frame is determined to be a lost frame, the spectrum parameter of the (n−1)th frame, which is the last-received valid frame before the lost frame, is used in place of that of the lost frame. In other words, G.729 estimates the linear prediction coefficients of the lost frame by repeating those of the previous valid frame, and then replaces the adaptive codebook gain and the fixed codebook gain with the gains of the last-received valid frame reduced by a predetermined factor. Also, to prevent excessive periodicity in the concealed voice, the adaptive codebook delay is obtained by increasing the delay of the previous frame by 1. However, attenuating the parameters or using them repeatedly destabilizes the feedback of the energy of the decoded voice, and markedly lowers the voice quality when frame losses occur consecutively.
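The parameter substitution described above can be sketched as follows. This is a simplified illustration, not the G.729 reference implementation: the `FrameParams` structure, the function names, the attenuation factors, and the upper bound on the delay are all assumptions chosen for the example and are not taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FrameParams:
    """Hypothetical container for the decoded parameters of one frame."""
    lpc: Tuple[float, ...]   # linear prediction coefficients
    pitch_delay: int         # adaptive codebook delay, in samples
    adaptive_gain: float     # adaptive codebook gain
    fixed_gain: float        # fixed codebook gain


# Illustrative constants (assumptions, stand-ins for the "predetermined factor").
ADAPTIVE_GAIN_DECAY = 0.9
FIXED_GAIN_DECAY = 0.98
MAX_PITCH_DELAY = 143  # assumed upper bound so the delay cannot grow without limit


def conceal(last: FrameParams) -> FrameParams:
    """Estimate the parameters of a lost frame from the previous frame."""
    return FrameParams(
        lpc=last.lpc,                                            # repeat spectrum
        pitch_delay=min(last.pitch_delay + 1, MAX_PITCH_DELAY),  # delay + 1
        adaptive_gain=last.adaptive_gain * ADAPTIVE_GAIN_DECAY,  # attenuate
        fixed_gain=last.fixed_gain * FIXED_GAIN_DECAY,           # attenuate
    )


def decode_parameters(received: List[Optional[FrameParams]]) -> List[FrameParams]:
    """Replace each lost frame (None) with parameters derived from the
    frame before it, which may itself be a concealed frame."""
    out: List[FrameParams] = []
    last: Optional[FrameParams] = None
    for params in received:
        if params is None:
            if last is None:
                raise ValueError("no valid frame received before the loss")
            params = conceal(last)
        out.append(params)
        last = params
    return out
```

When several frames are lost in a row, each concealed frame in this sketch is derived from the previous concealed frame, so the gains keep shrinking and the delay keeps drifting, which is consistent with the quality loss noted above for consecutive frame losses.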