1. Field of the Invention
The present invention relates to voice decoding and, more particularly, to an apparatus and method for concealing frame erasure in which, when voice decoding is performed, a voice signal is restored with the frame erasure concealed by using regression analysis, and to a voice decoding apparatus and method using the same.
2. Description of Related Art
To enable data transmission even in a transmission environment in which bandwidth is limited, recent voice encoding apparatuses do not transmit a voice signal directly. Instead, they extract parameters representing the voice signal, encode the extracted parameters, and generate a bitstream including the encoded parameters. A voice decoding apparatus decodes the parameters included in the received bitstream and generates a restored voice signal by using the decoded parameters.
To conceal an erased frame that occurs in a received packet, the conventional voice decoding apparatus uses a method based on the correlation of the voice signal adjacent to the erased frame. The algorithms mainly used are based on an extrapolation method, in which parameters of a previous good frame are used to obtain the parameters of the erased frame, and an interpolation method, in which parameters of a next good frame are used to obtain the parameters of the erased frame. However, the erased frame not only degrades the sound quality over the erased interval but also damages the data in the long interval prediction memory, so that errors are propagated even to the following frames. As a result, even when the voice reception apparatus again receives valid packets after losing packets, the sound degradation continues because the damaged data stored in the long interval prediction memory is used. Accordingly, the conventional algorithms are limited in their ability to solve these sound quality degradation and error propagation problems.
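For illustration only, the two conventional approaches described above can be sketched as follows. This is a minimal sketch and not the implementation of any particular decoder: the function name conceal_params, the use_next flag, and the representation of an erased frame as None are all hypothetical. The sketch also assumes that, when a next good frame is available, it is averaged with the previous frame; the description above does not specify how the next good frame is combined.

```python
from typing import List, Optional, Sequence


def conceal_params(
    frames: Sequence[Optional[Sequence[float]]],
    use_next: bool = False,
) -> List[List[float]]:
    """Fill erased frames (None) with estimated parameter vectors.

    use_next=False: extrapolation, repeat the previous (good or concealed) frame.
    use_next=True:  interpolation, average the previous frame with the next
                    good frame (the averaging rule is an assumption).
    """
    out: List[List[float]] = []
    for i, frame in enumerate(frames):
        if frame is not None:
            out.append(list(frame))  # good frame: keep as received
            continue

        prev = out[-1] if out else None
        nxt = None
        if use_next:
            # Look ahead for the first good frame after the erasure.
            nxt = next((list(g) for g in frames[i + 1:] if g is not None), None)

        if prev is not None and nxt is not None:
            out.append([(a + b) / 2.0 for a, b in zip(prev, nxt)])
        elif prev is not None:
            out.append(list(prev))
        elif nxt is not None:
            out.append(nxt)
        else:
            raise ValueError("no good frame available to conceal the erasure")
    return out


# Hypothetical two-parameter frames; the middle frame is erased.
received = [[1.0, 2.0], None, [3.0, 4.0]]
extrapolated = conceal_params(received)
interpolated = conceal_params(received, use_next=True)
```

Note that interpolation requires the decoder to wait for a later frame, whereas extrapolation does not. Neither approach repairs prediction memory that has already been damaged, which is the error propagation problem noted above.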
Meanwhile, the concealment algorithm of ITU-T G.729, which is widely used together with G.723.1 in voice over Internet protocol (VoIP) applications, obtains spectrum information and excitation signal information of voice by using a code excited linear prediction (CELP) algorithm based on a spoken voice model. When the CELP algorithm is applied, the voice encoding parameters of an erased frame are estimated by using the excitation signal and spectrum information of the most recent good frame. In this process, the energy of the excitation signal corresponding to the erased frame is gradually reduced so that the effect of the packet loss can be minimized. However, reducing the energy of the excitation signal degrades the sound quality.
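The gradual energy reduction described above can be illustrated with the following minimal sketch. It is not the G.729 procedure itself: the function names, the per-frame attenuation factor of 0.5, and the simple repetition of the last good frame's samples are assumptions chosen only to show how the energy decays as consecutive frames are erased.

```python
from typing import List, Sequence


def conceal_excitation(
    last_good: Sequence[float],
    num_erased: int,
    attenuation: float = 0.5,  # hypothetical factor, not taken from G.729
) -> List[List[float]]:
    """Return one concealed frame per erased frame.

    Each concealed frame repeats the samples of the most recent good frame,
    scaled by a gain that shrinks by `attenuation` for every erased frame.
    """
    concealed: List[List[float]] = []
    gain = 1.0
    for _ in range(num_erased):
        gain *= attenuation
        concealed.append([gain * s for s in last_good])
    return concealed


def frame_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples of one frame."""
    return sum(s * s for s in frame)


# Hypothetical two-sample frame followed by three consecutive erasures.
last_good_frame = [1.0, -1.0]
concealed_frames = conceal_excitation(last_good_frame, num_erased=3)
energies = [frame_energy(f) for f in concealed_frames]
```

In this sketch the energy of each concealed frame falls by the square of the attenuation factor, so a long run of erased frames is quickly driven toward silence. This limits audible artifacts but also illustrates the sound quality degradation noted above.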