The present invention relates to analysis-by-synthesis speech coding.
The applicant company has described such speech coders, which it has developed, notably in its European patent applications 0 195 487, 0 347 307 and 0 469 997.
In an analysis-by-synthesis speech coder, linear prediction of the speech signal is performed in order to obtain the coefficients of a short-term synthesis filter modelling the transfer function of the vocal tract. These coefficients are passed to the decoder, as well as parameters characterising an excitation to be applied to the short-term synthesis filter. In the majority of present-day coders, the longer-term correlations of the speech signal are also sought in order to characterise a long-term synthesis filter taking account of the pitch of the speech. When the signal is voiced, the excitation in fact includes a predictable component which can be represented by the past excitation, delayed by TP samples of the speech signal and subjected to a gain $g_p$. The long-term synthesis filter, also reconstituted at the decoder, then has a transfer function of the form $1/B(z)$ with $B(z) = 1 - g_p z^{-TP}$. The remaining, unpredictable part of the excitation is called stochastic excitation. In the coders known as CELP ("Code Excited Linear Prediction") coders, the stochastic excitation consists of a vector looked up in a predetermined dictionary. In the coders known as MPLPC ("Multi-Pulse Linear Prediction Coding") coders, the stochastic excitation includes a certain number of pulses the positions of which are sought by the coder. In general, CELP coders are preferred for low data transmission rates, but they are more complex to implement than MPLPC coders.
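By way of illustration only, the long-term synthesis filter described above can be sketched as a one-tap recursion in which each excitation sample is the stochastic sample plus the gain times the excitation TP samples earlier. The function name `long_term_synthesis`, the test input and the numerical values of the gain and delay are assumptions chosen for the example and do not come from the coders cited.

```python
def long_term_synthesis(stochastic, gain, delay):
    """Apply 1/B(z), with B(z) = 1 - gain * z^-delay, to a stochastic excitation.

    Each output sample is the input sample plus `gain` times the output
    sample `delay` positions earlier (zero before the start of the signal).
    """
    excitation = []
    for n, x in enumerate(stochastic):
        past = excitation[n - delay] if n >= delay else 0.0
        excitation.append(x + gain * past)
    return excitation


# Hypothetical input: a single unit pulse followed by silence.
pulse = [1.0] + [0.0] * 9
out = long_term_synthesis(pulse, gain=0.5, delay=4)
print(out)
```

With these illustrative values, the single pulse reappears every 4 samples, scaled by 0.5 each time (1.0 at index 0, 0.5 at index 4, 0.25 at index 8), which is the periodic, predictable component that the long-term filter contributes for voiced speech.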
In order to determine the long-term prediction delay, a closed-loop analysis is frequently used, contributing directly to minimising the perceptually weighted difference between the speech signal and the synthetic signal. The drawback of this closed-loop analysis is its computational cost: selecting a delay requires evaluating a number of candidate delays, and each evaluation requires computing the convolution of the delayed excitation with the impulse response of the perceptually weighted synthesis filter. The same drawback applies to the search for the stochastic excitation, which is also a closed-loop process involving convolutions with this impulse response. The excitation varies more rapidly than the spectral parameters characterising the short-term synthesis filter. The excitation (predictable and stochastic) is typically determined once per 5 ms sub-frame, whereas the spectral parameters are determined once per 20 ms frame. The complexity and the frequency of the closed-loop search for the excitation make this stage the most critical one in a speech coder as regards computation speed.
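The cost of such a closed-loop delay search can be made concrete with a minimal sketch. It assumes a sampling rate of 8 kHz (so that a 5 ms sub-frame holds 40 samples), candidate delays no shorter than the sub-frame, an arbitrary three-tap impulse response, and the usual normalised-correlation criterion; none of these values, nor the function names, are taken from the text. One truncated convolution is computed per candidate delay, which is where the calculation load comes from.

```python
import random


def convolve_truncated(signal, impulse_response):
    """Convolve `signal` with `impulse_response`, keeping len(signal) samples."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(n + 1, len(impulse_response))):
            acc += impulse_response[k] * signal[n - k]
        out.append(acc)
    return out


def closed_loop_delay_search(target, past_excitation, impulse_response, delays):
    """Return the (delay, gain) pair that best matches the weighted target.

    Assumes every candidate delay is at least len(target) and at most
    len(past_excitation), so each candidate lies entirely in the past.
    """
    n_sub = len(target)
    best_delay, best_gain, best_score = None, 0.0, -1.0
    for delay in delays:
        start = len(past_excitation) - delay
        candidate = past_excitation[start:start + n_sub]
        # One convolution per candidate delay: the costly step.
        filtered = convolve_truncated(candidate, impulse_response)
        corr = sum(t * y for t, y in zip(target, filtered))
        energy = sum(y * y for y in filtered)
        if energy <= 0.0:
            continue
        score = corr * corr / energy  # error reduction for the optimal gain
        if score > best_score:
            best_delay, best_gain, best_score = delay, corr / energy, score
    return best_delay, best_gain


# Hypothetical data: 120 past samples, a 40-sample sub-frame, and a target
# built from the excitation delayed by 57 samples with a gain of 0.8.
rng = random.Random(1)
past = [rng.uniform(-1.0, 1.0) for _ in range(120)]
h = [1.0, 0.6, 0.3]
true_candidate = past[120 - 57:120 - 57 + 40]
target = [0.8 * y for y in convolve_truncated(true_candidate, h)]

best_delay, best_gain = closed_loop_delay_search(target, past, h, range(40, 101))
print(best_delay, best_gain)
```

In this sketch, 61 candidate delays each require a 40-sample convolution, and the whole search is repeated for every 5 ms sub-frame, which indicates why this stage tends to dominate the calculation budget of the coder.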
A main object of the invention is to propose a speech coding method that reduces the complexity of the closed-loop analysis or analyses.