1. Field of the Invention
The present invention relates to a voice coding and decoding method and device. More particularly, it relates to a renewal code-excited linear prediction coding and decoding method and a device suitable for the method.
2. Description of the Related Art
FIG. 1 illustrates a typical code-excited linear prediction coding method.
Referring to FIG. 1, one frame of N consecutive digitized samples, covering a predetermined term of the voice to be analyzed, is captured in step 101. Here, one frame is generally 20 to 30 ms long, which corresponds to 160 to 240 samples when the voice is sampled at 8 kHz. In the preemphasis step 102, high-pass filtering is performed to remove direct current (DC) components from the collected one-frame voice data. In step 103, linear prediction coefficients (LPC) are calculated as (a.sub.1, a.sub.2, . . . , a.sub.p). These coefficients are convolved with the sampled frame of speech s(n), n=0,1, . . . , N, together with the last p values of the preceding frame, to predict each sampled speech value such that the residual error can ideally be represented by a stochastic excitation function from a codebook. To avoid large residual errors due to truncation at the edges of the frame, the frame of points s(n) is multiplied by a Hamming window w(n), n=0,1, . . . , N, to obtain the windowed speech frame s.sub.w (n), n=0,1, . . . , N. EQU s.sub.w (n)=s.sub.p (n)w(n) (1)
where the weighting function w(n) is given by: ##EQU1##
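As a minimal sketch of the windowing of equation 1, the following Python fragment multiplies one frame by a Hamming window. The 0.54/0.46 coefficients and the (N-1) denominator are the conventional textbook form and are an assumption here, since they are not quoted from the equation above; the function names are illustrative only.

```python
import math


def hamming_window(length):
    """Return a Hamming window of the given length (conventional 0.54/0.46 form, assumed)."""
    if length < 2:
        return [1.0] * length
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def apply_window(frame):
    """Multiply a frame sample by sample with the window, as in equation 1."""
    window = hamming_window(len(frame))
    return [x * w for x, w in zip(frame, window)]


# A 20 ms frame at 8 kHz holds 160 samples; a constant frame makes the taper visible.
frame = [1.0] * 160
windowed = apply_window(frame)
```

The window tapers both edges of the frame toward a small value, which is what limits the truncation error mentioned above.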
The LPC coefficients are calculated such that they minimize the value of equation 2. ##EQU2## where the predicted sample is EQU s(n)=a.sub.1 s.sub.w (n-1)+a.sub.2 s.sub.w (n-2)+ . . . +a.sub.p s.sub.w (n-p).
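One common way to obtain coefficients that minimize a squared prediction error of this kind is the autocorrelation method solved by the Levinson-Durbin recursion. The sketch below assumes that method, which the text above does not name, and uses the same sign convention as the predictor above (the prediction is a.sub.1 s(n-1)+ . . . +a.sub.p s(n-p)). The function names are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], with r[k] = sum over n of x[n] * x[n - k]."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a_1..a_p.

    Returns (coefficients, residual_energy), where the prediction is
    s_hat(n) = a_1 * s(n-1) + ... + a_p * s(n-p).
    """
    a = [0.0] * (order + 1)  # a[0] is unused so that indices match a_1..a_p
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) input; keep remaining coefficients at zero
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient of stage i
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= (1.0 - k * k)
    return a[1:], err


# Demo signal: the impulse response of a first-order predictor with a_1 = 0.5.
signal = [0.5 ** n for n in range(200)]
coeffs, residual = levinson_durbin(autocorrelation(signal, 2), 2)
```

For this demo signal the recursion recovers a first coefficient close to 0.5 and a second coefficient close to zero, since one past sample already predicts the signal.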
Before the obtained LPC coefficients a.sub.i are quantized and transmitted, they are converted, in an LPC/LSP converting step 104, into line spectrum pair (hereinafter referred to as LSP) coefficients w.sub.i, which increase the transmission efficiency and have excellent subframe interpolation characteristics. The LSP coefficients are quantized in step 105. The quantized LSP coefficients are inverse-quantized in step 106 to synchronize the coder with a decoder.
In step 107, a voice term is divided into S subframes to remove the periodicity of the voice from the analyzed voice parameters and to model the voice parameters to a noise codebook. Here, for convenience of explanation, the number of subframes S is restricted to 4. An i-th voice parameter (s=0,1,2,3; i=1,2, . . . , p) with respect to an s-th subframe can be obtained by the following equation 3. ##EQU3## where w.sub.i (n-1) and w.sub.i (n) denote the i-th LSP coefficients of the immediately preceding frame and the current frame, respectively.
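The subframe interpolation can be sketched as follows. The interpolation weights of equation 3 are not reproduced in this text, so the sketch assumes plain linear weights (s+1)/S on the current frame; the actual weights may differ. The function name is hypothetical.

```python
def interpolate_lsp(prev_lsp, curr_lsp, num_subframes=4):
    """Return one interpolated LSP vector per subframe.

    Assumption: linear weights, so the last subframe uses the current-frame
    LSP coefficients unchanged. This is not taken from equation 3.
    """
    subframe_lsps = []
    for s in range(num_subframes):
        alpha = (s + 1) / num_subframes  # assumed weight on the current frame
        subframe_lsps.append([(1.0 - alpha) * p + alpha * c
                              for p, c in zip(prev_lsp, curr_lsp)])
    return subframe_lsps


# Two-coefficient toy vectors for the previous and the current frame.
lsp_per_subframe = interpolate_lsp([0.0, 1.0], [1.0, 2.0])
```

Interpolating in the LSP domain rather than on the LPC coefficients directly is what the conversion of step 104 makes possible.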
In step 108, the interpolated LSP coefficients are converted back into LPC coefficients. These subframe LPC coefficients are used to constitute a voice synthesis filter 1/A(z) and an error weighting filter A(z)/A(z/.gamma.) to be used in the subsequent steps 109, 110 and 112.
The voice synthesis filter 1/A(z) and the error weighting filter A(z)/A(z/.gamma.) are expressed by the following equations 4 and 5. ##EQU4## ##EQU5##
In step 109, the influence of the synthesis filter of the immediately preceding frame is removed. A zero-input response (hereinafter called ZIR) s.sub.ZIR (n) can be obtained by the following equation 6. Here, .sub.s (n) represents a signal synthesized in a previous subframe. The ZIR is subtracted from the original voice signal s(n), and the result of the subtraction is called s.sub.d (n). ##EQU6##
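The zero-input response step can be illustrated with the sketch below. It assumes that A(z) = 1 - a.sub.1 z.sup.-1 - . . . - a.sub.p z.sup.-p, so that the synthesis filter 1/A(z), fed with zero input, simply keeps predicting from its own past output. That sign convention and the function names are assumptions and are not quoted from equations 4 to 6.

```python
def zero_input_response(lpc, history, length):
    """Run the all-pole synthesis filter with zero input.

    lpc     : predictor coefficients a_1..a_p (assumed sign convention)
    history : previously synthesized samples, oldest first
    length  : number of ZIR samples to produce
    """
    p = len(lpc)
    # state[0] is the most recent past sample, state[1] the one before, etc.
    recent = list(reversed(history))[:p]
    state = recent + [0.0] * (p - len(recent))
    zir = []
    for _ in range(length):
        y = sum(a * s for a, s in zip(lpc, state))
        zir.append(y)
        state = [y] + state[:-1]
    return zir


def remove_zir(speech, zir):
    """Subtract the ZIR from the speech samples to obtain the signal s_d(n)."""
    return [s - z for s, z in zip(speech, zir)]


# Order-2 toy filter; the last two synthesized samples were 1.0 and then 2.0.
zir = zero_input_response([1.0, -0.5], [1.0, 2.0], 3)
s_d = remove_zir([2.0, 1.0, 0.0], zir)
```

The past samples supplied in `history` play the role of the negatively indexed values of the ZIR.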
The negative indices in equation 6, s.sub.ZIR (-n), address the end values of the preceding subframe. A codebook is searched, with the candidates filtered by the error-weighted LPC filter 202, to find an excitation signal producing a synthetic signal closest to s.sub.dw (n), in an adaptive codebook search 113 and a noise codebook search 114. The adaptive and noise codebook search processes will be described with reference to FIGS. 2 and 3.
FIG. 2 shows the adaptive codebook search process, in which the error weighting filter A(z)/A(z/.gamma.) corresponding to equation 5 is applied, at step 201, to the signal s.sub.d (n) and to the voice synthesis filter. Assume that the signal resulting from applying the error weighting filter to s.sub.d (n) is s.sub.dw (n), that the excitation signal formed with a delay of L by using the adaptive codebook 203 is p.sub.L (n), and that the signal filtered through step 202 is g.sub.a .cndot.p.sub.L '(n). Then L* and g.sub.a, which minimize the difference between the two signals at step 204, are calculated by the following equations 7 to 9. ##EQU7##
When the error signal resulting from the thus-obtained L* and g.sub.a is denoted s.sub.ew (n), it is expressed by the following equation 10. EQU s.sub.ew (n)=s.sub.dw (n)-g.sub.a .multidot.p'.sub.L (n) (10)
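A minimal sketch of such an adaptive codebook search is given below. Equations 7 to 9 are not reproduced in this text, so the sketch assumes the usual least-squares form: for each delay L the gain is the correlation of the target with the filtered candidate divided by the candidate energy, and the chosen delay maximizes the squared correlation over the energy. The impulse response `h` of the weighted synthesis filter, the periodic repetition for delays shorter than the subframe, and all function names are assumptions.

```python
def convolve_truncated(x, h):
    """Zero-state filtering of x by impulse response h, truncated to len(x)."""
    return [sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(x))]


def adaptive_codebook_search(target, past_excitation, h, min_lag, max_lag):
    """Return (best_lag, best_gain) for the weighted target signal.

    Requires 1 <= min_lag <= max_lag <= len(past_excitation).
    """
    n_sub = len(target)
    best_lag, best_gain, best_score = None, 0.0, -1.0
    for lag in range(min_lag, max_lag + 1):
        start = len(past_excitation) - lag
        # Repeat the segment periodically when the lag is shorter than the subframe.
        candidate = [past_excitation[start + (n % lag)] for n in range(n_sub)]
        filtered = convolve_truncated(candidate, h)
        energy = sum(v * v for v in filtered)
        if energy <= 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, filtered))
        score = corr * corr / energy  # error reduction achieved by this lag
        if score > best_score:
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain


def residual_after_adaptive(target, filtered_candidate, gain):
    """Equation 10: s_ew(n) = s_dw(n) - g_a * p'_L(n)."""
    return [t - gain * v for t, v in zip(target, filtered_candidate)]
```

As a usage example, a target built from the excitation segment at delay 7, scaled by 0.8, should be matched by that delay and gain, leaving a residual close to zero.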
FIG. 3 shows the noise codebook search process. Typically, the noise codebook consists of M predetermined codewords. If an i-th codeword c.sub.i (n) among the noise codewords is selected, the codeword is filtered in step 301 to become g.sub.r .cndot.c.sub.i '(n). An optimal codeword and the gain for the codebook 302 are obtained by the following equations 11 to 13. EQU e(n)=s.sub.ew (n)-g.sub.r .multidot.c'.sub.i (n) (11) ##EQU8##
The finally obtained excitation signal of the voice synthesis filter is given by: ##EQU9##
The result of equation 14 is utilized to renew the adaptive codebook for analyzing the next subframe.
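The noise codebook search and the renewal of the adaptive codebook can be sketched as follows. Equations 12 to 14 are not reproduced in this text, so the sketch assumes the usual least-squares search over the M codewords and assumes that the final excitation is the sum of the gain-scaled adaptive and noise contributions. The impulse response `h` and all function names are hypothetical.

```python
def convolve_truncated(x, h):
    """Zero-state filtering of x by impulse response h, truncated to len(x)."""
    return [sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(x))]


def noise_codebook_search(residual_target, codebook, h):
    """Return (index, gain) of the codeword minimizing the energy of e(n) in equation 11."""
    best_index, best_gain, best_score = None, 0.0, -1.0
    for index, codeword in enumerate(codebook):
        filtered = convolve_truncated(codeword, h)
        energy = sum(v * v for v in filtered)
        if energy <= 0.0:
            continue
        corr = sum(t * v for t, v in zip(residual_target, filtered))
        score = corr * corr / energy
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain


def renew_adaptive_codebook(past_excitation, adaptive_vec, g_a, codeword, g_r):
    """Form the subframe excitation (assumed sum form) and shift it into the history."""
    excitation = [g_a * p + g_r * c for p, c in zip(adaptive_vec, codeword)]
    renewed = past_excitation[len(excitation):] + excitation
    return excitation, renewed


# Toy codebook of M = 3 codewords and a short weighted impulse response.
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, -1.0, 1.0, -1.0]]
h = [1.0, 0.5]
target = [-0.3 * v for v in convolve_truncated(codebook[2], h)]
index, g_r = noise_codebook_search(target, codebook, h)
excitation, renewed = renew_adaptive_codebook([0.0] * 6, [1.0] * 4, 0.5,
                                              codebook[index], g_r)
```

The renewed history keeps a fixed length: the oldest samples are dropped as the new subframe excitation is appended, so that the next subframe's adaptive codebook search can reference it.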
The general performance of a voice coder depends on the time taken until a synthesized sound is produced after an analyzed sound is coded and decoded (the processing delay or codec delay, in ms), the calculation amount (in MIPS, million instructions per second), and the transmission rate (in kbit/s). The codec delay in turn depends on the frame length, that is, the length of the input sound analyzed at one time during the coding process. When the frame length is long, the codec delay increases. Thus, coders operating at the same transmission rate differ in performance according to their codec delay, frame length and calculation amount.