The present invention relates to a CELP (Code Excited Linear Prediction) type speech coding apparatus that encodes a speech signal for transmission in, for example, a mobile communication system.
In the fields of digital mobile communications and speech storage, speech coding apparatuses are used that compress speech information and encode it with high efficiency, in order to make efficient use of radio signals and recording media. Among them, systems based on CELP (Code Excited Linear Prediction) are widely put into practice for apparatuses operating at medium to low bit rates. The CELP technology is described in “Code-Excited Linear Prediction (CELP): High-quality Speech at Very Low Bit Rates” by M. R. Schroeder and B. S. Atal, Proc. ICASSP-85, 25.1.1., pp. 937-940, 1985.
In the CELP type speech coding system, speech signals are divided into frames of a predetermined length (about 5 ms to 50 ms), linear prediction of the speech signals is performed for each frame, and the prediction residual (excitation vector signal) obtained by the linear prediction for each frame is coded using an adaptive code vector and a random code vector comprised of known waveforms.
The adaptive code vector is selected for use from an adaptive codebook storing previously generated excitation vectors, and the random code vector is selected for use from a random codebook storing a predetermined number of pre-prepared vectors with predetermined shapes.
Used as the random code vectors stored in the random codebook are, for example, random noise sequence vectors and vectors generated by arranging a few pulses at different positions. A representative example of the latter is CS-ACELP (Conjugate Structure and Algebraic CELP), recommended as an international standard by ITU-T in 1996. The CS-ACELP technology is described in “Recommendation G.729: Coding of Speech at 8 kbit/s using Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP)”, March 1996.
The CS-ACELP uses an algebraic codebook as the random codebook. The random code vector generated from the algebraic codebook in the CS-ACELP is a vector in which four impulses, each with an amplitude of −1 or +1, are placed in the 40 samples (5 ms) of each subframe (all samples other than the four pulse positions are basically 0). Since the absolute value of the amplitude is fixed to 1, it is enough to represent only the position and polarity (positive or negative) of each pulse to represent an excitation vector. Therefore it is not necessary to store 40-dimensional (subframe length) vectors in a codebook, and no memory for codebook storage is required. Further, since only four pulses with amplitudes of 1 are present in the vector, this method has the feature that the computation amount for the codebook search is greatly reduced.
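The following is a minimal sketch of why no codebook storage is needed: the 40-sample code vector can be rebuilt from four (position, polarity) pairs alone. The function name `build_algebraic_vector` and the example pulse positions are hypothetical, and the actual pulse-position constraints and bit packing of G.729 are not reproduced here.

```python
SUBFRAME_LEN = 40  # samples per subframe (5 ms), as stated in the text


def build_algebraic_vector(pulses, length=SUBFRAME_LEN):
    """Rebuild a code vector from (position, sign) pairs.

    `pulses` is a list of (position, sign) tuples with sign in {+1, -1}.
    Every sample that carries no pulse stays 0.
    """
    vector = [0] * length
    for position, sign in pulses:
        if sign not in (1, -1):
            raise ValueError("pulse amplitude must be +1 or -1")
        if not 0 <= position < length:
            raise ValueError("pulse position outside the subframe")
        vector[position] += sign
    return vector


# Illustrative pulse set; the positions are chosen arbitrarily.
example = build_algebraic_vector([(0, 1), (11, -1), (22, 1), (33, -1)])
```

Only the four positions and four signs would need to be transmitted in such a scheme; the vector itself is never stored.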
In the CS-ACELP, adaptive code vector information is coded efficiently by representing the pitch of the second subframe as a quantized differential value relative to the pitch of the first subframe. Further, the pitch search adopts a configuration in which one pitch candidate is selected by open loop pitch search for each frame, and closed loop pitch search for each subframe is performed around that candidate, which also reduces the computation amount required for the search.
Herein a conventional CS-ACELP coding apparatus is specifically explained with reference to FIG. 1. FIG. 1 illustrates a basic configuration of the conventional CS-ACELP speech coding apparatus. In FIG. 1, input buffer 1 performs buffering of data with a required length while updating an input digital speech signal for each frame, and outputs required data to subframe divider 2, LPC analyzer 3, and weighted synthesis filter 4.
Subframe divider 2 divides a frame of the input digital signal, input from input buffer 1, into two subframes, outputs a first subframe signal to first target calculator 5, and further outputs a second subframe signal to second target calculator 6. LPC analyzer 3 receives a digital speech signal required for analysis input from input buffer 1 to perform LPC analysis, and outputs linear predictive coefficients to LPC quantizer 7 and second LPC interpolator 8. Weighted synthesis filter 4 receives as inputs the frame of the digital speech signal input from input buffer 1 and linear predictive coefficients a1 and a2 output from second LPC interpolator 8, and performs perceptual weighting on the input speech signal to output to open loop pitch searcher 9.
LPC quantizer 7 performs quantization on the linear predictive coefficients output from LPC analyzer 3, outputs quantized LPC to first LPC interpolator 10, and at the same time outputs coding data L of the quantized LPC to a decoder. Second LPC interpolator 8 receives as inputs the LPC output from LPC analyzer 3, performs interpolation on LPC of the first subframe, and outputs unquantized LPC of the first and second subframes respectively as a1 and a2. First LPC interpolator 10 receives as inputs the quantized LPC output from LPC quantizer 7, performs interpolation on the quantized LPC of the first subframe, and outputs quantized LPC of the first and second subframes respectively as qa1 and qa2.
First target calculator 5 receives as inputs the first subframe of the digital speech signal divided in subframe divider 2, filter state st1 output from second filter state updator 11 on the last second subframe, and qa1 and a1 that are respectively the quantized LPC and unquantized LPC of the first subframe, and calculates a target vector to output to first closed loop pitch searcher 12, first target updator 13, first gain codebook searcher 14, and first filter state updator 15. Second target calculator 6 receives as inputs the second subframe of the digital speech signal output from subframe divider 2, filter state st2 output from first filter state updator 15 on the first subframe of a current frame, and qa2 and a2 that are respectively the quantized LPC and unquantized LPC of the second subframe, and calculates a target vector to output to second closed loop pitch searcher 16, second target updator 17, second gain codebook searcher 18, and second filter state updator 11.
Open loop pitch searcher 9 receives as an input a weighted input speech signal output from weighted synthesis filter 4 to extract a pitch periodicity, and outputs an open loop pitch period to first closed loop pitch searcher 12. First closed loop pitch searcher 12 receives a first target vector, open loop pitch, adaptive code vector candidates, and an impulse response vector respectively input from first target calculator 5, open loop pitch searcher 9, adaptive codebook 19, and first impulse response calculator 20, performs closed loop pitch search around the open loop pitch, outputs closed loop pitch P1 to second closed loop pitch searcher 16, first pitch period processing filter 21 and the decoder, outputs an adaptive code vector to first excitation generator 22, and further outputs a synthetic vector obtained by performing convolution of the first impulse response and the adaptive code vector to first target updator 13, first gain codebook searcher 14, and first filter state updator 15.
First target updator 13 receives the first target vector and a first adaptive code synthetic vector respectively input from first target calculator 5 and first closed loop pitch searcher 12, and calculates a target vector for the random codebook to output to first random codebook searcher 23. First gain codebook searcher 14 receives the first target vector, the first adaptive code synthetic vector, and a first random code synthetic vector respectively input from first target calculator 5, first closed loop pitch searcher 12 and first random codebook searcher 23, and selects an optimum quantized gain from gain codebook 29 to output to first excitation generator 22 and first filter state updator 15.
First filter state updator 15 receives the first target vector, first adaptive code synthetic vector, first random code synthetic vector, and a first quantized gain respectively input from first target calculator 5, first closed loop pitch searcher 12, first random codebook searcher 23 and first gain codebook searcher 14, updates a state of a synthesis filter, and outputs filter state st2. First impulse response calculator 20 receives as inputs a1 and qa1 that are respectively unquantized LPC and quantized LPC of the first subframe, and calculates an impulse response of a filter constructed by connecting a perceptual weighting filter and the synthesis filter, to output to first closed loop pitch searcher 12 and first pitch period processing filter 21.
First pitch period processing filter 21 receives a first closed loop pitch and first impulse response vector respectively input from first closed loop pitch searcher 12 and first impulse response calculator 20, and performs pitch period processing on the first impulse response vector to output to first random codebook searcher 23. First random codebook searcher 23 receives as inputs an updated first target vector output from first target updator 13, a period processed first impulse response vector output from first pitch period processing filter 21, and random code vector candidates output from random codebook 24, selects an optimum random code vector from random codebook 24, outputs a vector obtained by performing period processing on the selected random code vector to first excitation generator 22, outputs a synthetic vector obtained by performing convolution of the period processed first impulse response vector and the selected random code vector to first gain codebook searcher 14 and first filter state updator 15, and outputs code S1 representative of the selected random code vector to the decoder.
Random codebook 24 stores a predetermined number of random code vectors with the predetermined shapes, and outputs a random code vector to first random codebook searcher 23 and second random codebook searcher 25.
First excitation generator 22 receives the adaptive code vector, random code vector, and quantized gains respectively input from first closed loop pitch searcher 12, first random codebook searcher 23 and first gain codebook searcher 14, generates an excitation vector, and outputs the generated excitation vector to adaptive codebook 19. Adaptive codebook 19 receives as an input the excitation vector alternately output from first excitation generator 22 and second excitation generator 26 to update the adaptive codebook, and outputs an adaptive codebook candidate alternately to first closed loop pitch searcher 12 and second closed loop pitch searcher 16. Gain codebook 29 stores pre-prepared quantized gains (adaptive code vector component and random code vector component) to output to first gain codebook searcher 14 and second gain codebook searcher 18.
Second closed loop pitch searcher 16 receives a second target vector, pitch of the first subframe, adaptive code vector candidates, and impulse response vector respectively input from second target calculator 6, first closed loop pitch searcher 12, adaptive codebook 19, and second impulse response calculator 27, performs the closed loop pitch search around the pitch of the first subframe, outputs closed loop pitch P2 to second pitch period processing filter 28 and the decoder, outputs the adaptive code vector to second excitation generator 26, and outputs a synthetic vector obtained by performing convolution of the second impulse response and the adaptive code vector to second target updator 17, second gain codebook searcher 18 and second filter state updator 11.
Second target updator 17 receives the second target vector and second adaptive code synthetic vector respectively input from second target calculator 6 and second closed loop pitch searcher 16, and calculates the target vector for the random codebook to output to second random codebook searcher 25. Second gain codebook searcher 18 receives the second target vector, second adaptive code synthetic vector and second random code synthetic vector respectively input from second target calculator 6, second closed loop pitch searcher 16 and second random codebook searcher 25, and selects an optimum quantized gain from gain codebook 29 to output to second excitation generator 26 and second filter state updator 11.
Second filter state updator 11 receives the second target vector, second adaptive code synthetic vector, second random code synthetic vector, and second quantized gain respectively input from second target calculator 6, second closed loop pitch searcher 16, second random codebook searcher 25, and second gain codebook searcher 18, updates the state of the synthesis filter, and outputs filter state st1.
Second impulse response calculator 27 receives as inputs a2 and qa2 that are respectively unquantized LPC and quantized LPC of the second subframe, and calculates the impulse response of the filter constructed by connecting the perceptual weighting filter and the synthesis filter, to output to second closed loop pitch searcher 16 and second pitch period processing filter 28. Second pitch period processing filter 28 receives a second closed loop pitch and second impulse response vector respectively input from second closed loop pitch searcher 16 and second impulse response calculator 27, and performs pitch period processing on the second impulse response vector to output to second random codebook searcher 25.
Second random codebook searcher 25 receives as inputs an updated second target vector output from second target updator 17, a period processed second impulse response vector output from second pitch period processing filter 28, and the random code vector candidates output from random codebook 24, selects an optimum random code vector from random codebook 24, outputs a vector obtained by performing the period processing on the selected random code vector to second excitation generator 26, outputs a synthetic vector obtained by performing convolution of the period processed second impulse response vector and the selected random code vector to second gain codebook searcher 18 and second filter state updator 11, and outputs code S2 representative of the selected random code vector to the decoder. Second excitation generator 26 receives the adaptive code vector, random code vector, and quantized gains respectively input from second closed loop pitch searcher 16, second random codebook searcher 25 and second gain codebook searcher 18, generates an excitation vector, and outputs the generated excitation vector to adaptive codebook 19.
In addition, LPC data L, pitches P1 and P2, random code vector data S1 and S2, and gain data G1 and G2 are coded into bit streams, transmitted through the transmission path, and then output to the decoder. LPC data L is output from LPC quantizer 7. Pitch P1 is output from first closed loop pitch searcher 12. Random code vector data S1 is output from first random codebook searcher 23. Gain data G1 is output from first gain codebook searcher 14. Pitch P2 is output from second closed loop pitch searcher 16. Random code vector data S2 is output from second random codebook searcher 25. Gain data G2 is output from second gain codebook searcher 18. The processing on the second subframe is performed after all the processing on the first subframe is finished. The pitch of the second subframe is quantized as a differential value relative to the pitch of the first subframe.
The following explains the operation of the CS-ACELP speech coding apparatus with the above-mentioned configuration with reference to FIG. 1. First, in FIG. 1, a speech signal is input to input buffer 1. Input buffer 1 updates an input digital speech signal to be coded per frame (10 ms) basis, and provides required buffering data to subframe divider 2, LPC analyzer 3 and weighted synthesis filter 4.
LPC analyzer 3 performs linear predictive analysis using data provided from input buffer 1, and calculates linear predictive coefficients (LPC) to output to LPC quantizer 7 and second LPC interpolator 8. LPC quantizer 7 converts the LPC into LSP to perform quantization, and outputs quantized LSP to first LPC interpolator 10. First LPC interpolator 10 adopts input quantized LSP as quantized LSP of the second subframe, and interpolates quantized LSP of the first subframe with linear interpolation using the quantized LSP of the second subframe of a last frame.
Obtained quantized LSP of the first and second subframes are converted into LPC, and respectively output as quantized LPC qa1 and qa2. Second LPC interpolator 8 converts input unquantized LPC into LSP, interpolates LSP of the first subframe in the same way as in first LPC interpolator 10, determines LSP of the first and second subframes to convert to LPC, and outputs a1 and a2 as unquantized LPC.
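The LSP interpolation performed by the two LPC interpolators can be sketched as follows. This is only an illustration: the text says "linear interpolation" without giving weights, so the equal weighting (`weight=0.5`) is an assumption, and the function name `interpolate_lsp` is hypothetical. Conversion between LPC and LSP is omitted.

```python
def interpolate_lsp(prev_frame_lsp, current_lsp, weight=0.5):
    """Return (first_subframe_lsp, second_subframe_lsp).

    The second subframe uses the current frame's LSP unchanged. The first
    subframe is a linear blend of the previous frame's second-subframe LSP
    and the current LSP. `weight` = 0.5 is an assumed value.
    """
    if len(prev_frame_lsp) != len(current_lsp):
        raise ValueError("LSP vectors must have the same order")
    first = [(1.0 - weight) * p + weight * c
             for p, c in zip(prev_frame_lsp, current_lsp)]
    second = list(current_lsp)
    return first, second
```

The same routine would apply to both the quantized LSP (giving qa1 and qa2 after conversion) and the unquantized LSP (giving a1 and a2).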
Weighted synthesis filter 4 receives a frame (10 ms) of a digital data sequence to be quantized from input buffer 1. Weighted synthesis filter 4, constructed with unquantized LPC a1 and a2, performs filtering on the frame data, and thereby calculates a weighted input speech signal to output to open loop pitch searcher 9.
Open loop pitch searcher 9 buffers previously generated weighted input speech signals, obtains a normalized auto-correlation function from a data sequence to which a newly generated weighted input speech signal is added, and based on the function, extracts a period of the weighted input speech signal. The extracted period is output to first closed loop pitch searcher 12.
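A minimal sketch of an open loop pitch search based on a normalized correlation is given below. The normalization by the square root of the lagged-signal energy, the lag range, and the window length are assumptions made for the example; the text only states that a normalized auto-correlation function is used.

```python
import math


def open_loop_pitch(signal, min_lag, max_lag, window):
    """Return the lag in [min_lag, max_lag] with the largest normalized
    correlation, measured over the last `window` samples of `signal`.
    """
    if len(signal) < window + max_lag:
        raise ValueError("signal too short for the requested lag range")
    start = len(signal) - window
    best_lag, best_score = min_lag, -math.inf
    for lag in range(min_lag, max_lag + 1):
        cross = sum(signal[n] * signal[n - lag]
                    for n in range(start, len(signal)))
        energy = sum(signal[n - lag] ** 2
                     for n in range(start, len(signal)))
        score = cross / math.sqrt(energy) if energy > 0.0 else -math.inf
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Because this sketch returns exactly one lag, it also illustrates the limitation of a single-candidate search: any other lag with a nearly equal score is discarded before the closed loop search.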
Subframe divider 2 receives a frame of the digital signal sequence to be coded input from input buffer 1, divides the frame into two subframes, provides a first subframe (former subframe in time) to first target calculator 5, and further provides a second subframe (latter subframe in time) to second target calculator 6.
First target calculator 5 constructs a quantized synthesis filter and weighted synthesis filter using quantized LPC qa1 and unquantized LPC a1 of the first subframe, calculates a weighted input speech signal (target vector) from which a zero input response of the quantized synthesis filter is removed using filter state st1 obtained in second filter state updator 11 on the second subframe of the last frame, and outputs the target vector to first closed loop pitch searcher 12, first target updator 13, first gain codebook searcher 14 and first filter state updator 15.
First impulse response calculator 20 obtains an impulse response of the filter obtained by connecting the quantized synthesis filter constructed with quantized LPC qa1 and the weighted synthesis filter constructed with unquantized LPC a1 to output to first closed loop pitch searcher 12 and first pitch period processing filter 21. First closed loop pitch searcher 12 performs convolution of the first impulse response and the adaptive code vector retrieved from adaptive codebook 19, thereby calculates a weighted synthetic speech vector (adaptive codebook component), and extracts a pitch that generates such an adaptive code vector that minimizes an error between the calculated vector and the first target vector. The pitch search at this point is performed around the open loop pitch input from open loop pitch searcher 9.
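The closed loop pitch search described above can be sketched as follows, restricted to integer lags for simplicity (an assumption; fractional lags are not modeled). Minimizing the error between the target and the gain-scaled synthetic vector is equivalent to maximizing the squared correlation divided by the synthetic-vector energy, which is the criterion used here. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def convolve(vector, impulse_response):
    """Zero-state filtering, truncated to the length of `vector`.

    Assumes len(impulse_response) >= len(vector).
    """
    return [sum(vector[k] * impulse_response[n - k] for k in range(n + 1))
            for n in range(len(vector))]


def adaptive_vector(past_excitation, lag, length):
    """Read `length` samples starting `lag` samples back in the buffer,
    repeating the segment when the lag is shorter than `length`."""
    out = []
    for n in range(length):
        if n < lag:
            out.append(past_excitation[len(past_excitation) - lag + n])
        else:
            out.append(out[n - lag])
    return out


def closed_loop_pitch(target, past_excitation, impulse_response,
                      center, radius):
    """Return the integer lag within `radius` of `center` whose filtered
    adaptive code vector best matches `target`."""
    best_lag, best_score = center, 0.0
    low = max(1, center - radius)
    high = min(len(past_excitation), center + radius)
    for lag in range(low, high + 1):
        code = adaptive_vector(past_excitation, lag, len(target))
        synthetic = convolve(code, impulse_response)
        energy = dot(synthetic, synthetic)
        if energy <= 0.0:
            continue  # an all-zero candidate cannot reduce the error
        cross = dot(target, synthetic)
        score = cross * cross / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Here `center` plays the role of the open loop pitch, and `radius` is an assumed search width.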
The adaptive code vector generated with the obtained pitch is output to first excitation generator 22 to be used to generate an excitation vector, and a first adaptive code synthetic vector generated by performing the convolution of the impulse response and the adaptive code vector is output to first target updator 13, first gain codebook searcher 14, and first filter state updator 15. First target updator 13 subtracts the product, obtained by multiplying the first adaptive code synthetic vector output from first closed loop pitch searcher 12 by an optimum gain, from the first target vector output from first target calculator 5, thereby calculates a target vector for the first random codebook search, and outputs the calculated target vector to first random codebook searcher 23.
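The target update can be illustrated with a short sketch. The text does not define the "optimum gain"; the unconstrained least-squares gain used below is an assumption, and the function name `update_target` is hypothetical.

```python
def update_target(target, adaptive_synthetic):
    """Remove the adaptive codebook contribution from the target.

    Returns (new_target, gain), where gain is the least-squares gain
    (target . y) / (y . y) for synthetic vector y (an assumed definition
    of the "optimum gain").
    """
    energy = sum(y * y for y in adaptive_synthetic)
    if energy > 0.0:
        gain = sum(x * y for x, y in zip(target, adaptive_synthetic)) / energy
    else:
        gain = 0.0
    new_target = [x - gain * y for x, y in zip(target, adaptive_synthetic)]
    return new_target, gain
```

With this choice of gain, the new target is orthogonal to the adaptive code synthetic vector, so the random codebook search only has to model what the adaptive codebook could not.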
First random codebook searcher 23 performs convolution of the pitch period processed first impulse response, input from first pitch period processing filter 21, and the random code vector retrieved from random codebook 24, thereby calculates a weighted synthetic speech vector (random codebook component), and selects a random code vector that minimizes an error between the calculated vector and the target vector for the first random codebook. The selected random code vector is subjected to period processing by the pitch period processing filter, and output to first excitation generator 22 to be used in generating an excitation vector. Further the first random code synthetic vector generated by performing the convolution of the pitch period processed impulse response and the random code vector is output to first gain codebook searcher 14 and first filter state updator 15.
First pitch period processing filter 21 performs filtering on the impulse response input from first impulse response calculator 20 according to the following equation 1, and outputs the resultant to first random codebook searcher 23:
x(n) = x(n) + β × x(n − T), n ≥ T    (eq. 1)
where x(n) is input data, n = 0, 1, . . . , 39 (subframe length − 1), T is the pitch period, and β is the pitch predictor gain.
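Equation 1 can be sketched directly in code. The sketch reads the equation as an in-place update processed in increasing n, which is an interpretation; for pitch periods of at least half the subframe length the in-place and non-recursive readings give the same result. The function name `pitch_period_filter` is hypothetical.

```python
def pitch_period_filter(x, pitch, beta):
    """Apply x(n) = x(n) + beta * x(n - T) for n >= T, in place on a copy.

    `pitch` is the pitch period T in samples and `beta` is the pitch
    predictor gain. Samples with n < T are left unchanged.
    """
    y = list(x)
    for n in range(pitch, len(y)):
        y[n] += beta * y[n - pitch]
    return y
```

When T is not smaller than the subframe length, no sample satisfies n ≥ T and the input passes through unchanged.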
Pitch period T used in this filter is P1 input from first closed loop pitch searcher 12. First gain codebook searcher 14 receives the first target vector, first adaptive code synthetic vector, and first random code synthetic vector respectively input from first target calculator 5, first closed loop pitch searcher 12 and first random codebook searcher 23, and selects a combination of a quantized adaptive code gain and quantized random code gain, which minimizes the square error between the first target vector and a vector of the sum of the first adaptive code synthetic vector multiplied by the quantized adaptive code gain and the first random code synthetic vector multiplied by the quantized random code gain, from gain codebook 29.
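The gain codebook search described above amounts to an exhaustive minimization of a squared error over gain pairs. The sketch below assumes a flat list of (adaptive gain, random gain) pairs; the actual structure of gain codebook 29 is not reproduced, and the names are hypothetical.

```python
def search_gain_codebook(target, adaptive_syn, random_syn, gain_codebook):
    """Return (index, error) of the gain pair minimizing the squared error
    between `target` and ga * adaptive_syn + gr * random_syn.

    `gain_codebook` is a list of (ga, gr) pairs (an assumed layout).
    """
    best_index, best_error = -1, float("inf")
    for index, (ga, gr) in enumerate(gain_codebook):
        error = sum((x - ga * a - gr * r) ** 2
                    for x, a, r in zip(target, adaptive_syn, random_syn))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error
```

The returned index corresponds to the gain data (G1 or G2) that would be sent to the decoder.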
Selected quantized gains are output to first excitation generator 22 and first filter state updator 15 to be used in generation of the excitation vector and state update of the synthesis filter. First excitation generator 22 multiplies the adaptive code vector input from first closed loop pitch searcher 12, and the pitch period processed random code vector input from first random codebook searcher 23, respectively by the quantized gain (adaptive codebook component) and another quantized gain (random codebook component) input from first gain codebook searcher 14, and adds the adaptive code vector and random code vector each multiplied by the respective quantized gain to generate the excitation vector for the first subframe.
The generated first subframe excitation vector is output to the adaptive codebook to be used in update of the adaptive codebook. First filter state updator 15 updates the state of the filter constructed by connecting the quantized synthesis filter and weighted synthesis filter. The filter state is obtained by subtracting the sum of the adaptive code synthetic vector multiplied by the quantized gain (adaptive codebook component) and the random code synthetic vector multiplied by the another quantized gain (random codebook component) from the target vector input from first target calculator 5. The obtained filter state is output as st2, used as the filter state for the second subframe, and used in second target calculator 6.
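As a rough sketch of the two operations just described, the excitation vector is a gain-weighted sum of the two code vectors, and the quantity used for the filter state update is the target minus the gain-weighted sum of the two synthetic vectors. The function names are hypothetical, and how the resulting error vector is loaded into the filter memory is not shown.

```python
def generate_excitation(adaptive_vec, random_vec, gain_adaptive, gain_random):
    """Excitation = ga * adaptive code vector + gr * random code vector."""
    return [gain_adaptive * a + gain_random * r
            for a, r in zip(adaptive_vec, random_vec)]


def filter_state_error(target, adaptive_syn, random_syn,
                       gain_adaptive, gain_random):
    """Target minus the gain-weighted sum of the two synthetic vectors."""
    return [x - (gain_adaptive * a + gain_random * r)
            for x, a, r in zip(target, adaptive_syn, random_syn)]
```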
Second target calculator 6 constructs the quantized synthesis filter and weighted synthesis filter using qa2 and a2 that are respectively the quantized LPC and unquantized LPC of the second subframe, calculates the weighted input speech signal (target vector) from which the zero input response of the quantized synthesis filter is removed using filter state st2 obtained in first filter state updator 15 on the first subframe, and outputs the second target vector to second closed loop pitch searcher 16, second target updator 17, second gain codebook searcher 18 and second filter state updator 11.
Second impulse response calculator 27 obtains the impulse response of the filter obtained by connecting the quantized synthesis filter constructed with quantized LPC qa2 and the weighted synthesis filter constructed with unquantized LPC a2 to output to second closed loop pitch searcher 16 and second pitch period processing filter 28. Second closed loop pitch searcher 16 performs the convolution of the second impulse response and the adaptive code vector retrieved from adaptive codebook 19, thereby calculates a weighted synthetic speech vector (adaptive codebook component), and extracts a pitch that generates such an adaptive code vector that minimizes an error between the calculated vector and the second target vector. The pitch search at this point is performed around pitch P1 of the first subframe input from first closed loop pitch searcher 12.
The adaptive code vector generated with the obtained pitch is output to second excitation generator 26 to be used to generate the excitation vector, and the second adaptive code synthetic vector generated by performing the convolution of the impulse response and the adaptive code vector is output to second target updator 17, second gain codebook searcher 18, and second filter state updator 11. Second target updator 17 subtracts the product, obtained by multiplying the second adaptive code synthetic vector output from second closed loop pitch searcher 16 by an optimum gain, from the second target vector output from second target calculator 6, thereby calculates the target vector for the second random codebook search, and outputs the calculated target vector to second random codebook searcher 25.
Second random codebook searcher 25 performs the convolution of the pitch period processed second impulse response input from second pitch period processing filter 28 and the random code vector retrieved from random codebook 24, thereby calculates a weighted synthetic speech vector (random codebook component), and selects a random code vector that minimizes an error between the calculated vector and the target vector for the second random codebook. The selected random code vector is subjected to period processing by the second pitch period processing filter, and output to second excitation generator 26 to be used in generating an excitation vector.
Further the second random code synthetic vector generated by performing the convolution of the pitch period processed impulse response and the random code vector is output to second gain codebook searcher 18 and second filter state updator 11. Second pitch period processing filter 28 performs filtering on the impulse response input from second impulse response calculator 27 according to the previously mentioned equation 1, where x(n) is input data, n = 0, 1, . . . , 39 (subframe length − 1), T is the pitch period, and β is the pitch predictor gain, and outputs the result to second random codebook searcher 25.
Pitch period T used in this filter is P2 input from second closed loop pitch searcher 16. Second gain codebook searcher 18 receives the second target vector, second adaptive code synthetic vector, and second random code synthetic vector respectively input from second target calculator 6, second closed loop pitch searcher 16 and second random codebook searcher 25, and selects a combination of a quantized adaptive code gain and quantized random code gain, which minimizes the square error between the second target vector and a vector of the sum of the second adaptive code synthetic vector multiplied by the quantized adaptive code gain and the second random code synthetic vector multiplied by the quantized random code gain, from gain codebook 29.
Selected quantized gains are output to second excitation generator 26 and second filter state updator 11 to be used in generation of the excitation vector and state update of the synthesis filter. Second excitation generator 26 multiplies the adaptive code vector input from second closed loop pitch searcher 16, and the pitch period processed random code vector input from second random codebook searcher 25, respectively by the quantized gain (adaptive codebook component) and another quantized gain (random codebook component) output from second gain codebook searcher 18, and adds the adaptive code vector and random code vector each multiplied by the respective quantized gain to generate the excitation vector for the second subframe. The generated second subframe excitation vector is output to adaptive codebook 19 to be used in update of the adaptive codebook.
Second filter state updator 11 updates the state of the filter constructed by connecting the quantized synthesis filter and weighted synthesis filter. The filter state is obtained by subtracting the sum of the adaptive code synthetic vector multiplied by the quantized gain (adaptive codebook component) and the random code synthetic vector multiplied by the another quantized gain (random codebook component) from the target vector output from second target calculator 6. The obtained filter state is output as st1, used as the filter state for the first subframe of a next frame, and used in first target calculator 5. In addition adaptive codebook 19 buffers excitation signals, generated in first excitation generator 22 and second excitation generator 26, sequentially in time, and stores the excitation signals generated previously with lengths required for the closed loop pitch search.
The update of the adaptive codebook is performed once for each subframe, while shifting a buffer corresponding to a subframe in the adaptive codebook, and then copying a newly generated excitation signal at the last portion of the buffer. In addition among the two signals divided in subframe divider 2 to be quantized, coding processing on the first subframe is first performed, and after the coding processing on the first subframe is completely finished, the coding processing on the second subframe is performed. Pitch P2 output on the second subframe is subjected to the quantization of the pitch differential value using pitch P1 output on the first subframe, and transmitted to a decoder side.
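The buffer update and the differential coding of P2 can be sketched as below. The differential range `max_delta` and the integer pitch resolution are hypothetical values chosen for illustration; the text does not state the actual quantization range. The function names are also hypothetical.

```python
def update_adaptive_codebook(buffer, excitation):
    """Shift the buffer by one subframe and copy the new excitation to
    the last portion of the buffer. The buffer length is preserved."""
    n = len(excitation)
    if n > len(buffer):
        raise ValueError("excitation longer than the adaptive codebook")
    return list(buffer[n:]) + list(excitation)


def encode_pitch_delta(p1, p2, max_delta=5):
    """Code P2 as an offset from P1, limited to +/- max_delta.

    Returns (code, decoded_p2). `max_delta` is an assumed range.
    """
    delta = max(-max_delta, min(max_delta, p2 - p1))
    return delta + max_delta, p1 + delta
```

For example, with the assumed range, `encode_pitch_delta(40, 80)` can only represent a second-subframe pitch of 45, which shows how the pitch chosen for the first subframe constrains the second.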
After the processing on one frame is finished, LPC data L, pitches P1 and P2, random code vector data S1 and S2, and gain data G1 and G2 are coded to be bit streams, transmitted through the transmission path, and then output to the decoder. LPC data L is output from LPC quantizer 7. Pitch P1 is output from first closed loop pitch searcher 12. Random code vector data S1 is output from first random codebook searcher 23. Gain data G1 is output from first gain codebook searcher 14. Pitch P2 is output from second closed loop pitch searcher 16. Random code vector data S2 is output from second random codebook searcher 25. Gain data G2 is output from second gain codebook searcher 18.
However, in the above-mentioned conventional speech coding apparatus, since only a single pitch candidate is selected by the open loop pitch search, there is a problem that the pitch finally determined is not always the optimum one. To solve this problem, it is conceivable to output two or more pitch candidates and perform the closed loop pitch search on each of them. However, since the above-mentioned coding apparatus quantizes the pitch differential value between subframes, there is another problem that a pitch that is optimum only for the first subframe may be selected.
It is an object of the present invention to improve the accuracy of the pitch search (adaptive codebook search) in a speech coding apparatus that quantizes a differential value of pitch information between subframes, without adversely affecting the quantization of the pitch differential value.
It is a subject of the present invention to output a plurality of pitch candidates when a plurality of effective pitch candidates are present in the frame pitch search. That is, the present invention provides a CELP type speech coding apparatus provided with a pitch candidate selection section that, for a subframe on which the pitch differential value for the adaptive codebook is not quantized, among the subframes obtained by dividing a unit frame, performs preliminary selection of the pitch for the adaptive codebook and adaptively selects at least one pitch candidate.