The present invention generally relates to a digital speech encoder having a long term filter in which delay (lag) is a parameter. This invention is particularly, but not exclusively, suited for use in a code-excited linear prediction (CELP) speech encoder.
In a CELP encoder, long term and short term filters are excited by an excitation vector selected from a table of such vectors. The speech is represented in a CELP encoder by an excitation vector, lag and gain parameters associated with the long term filter, and a set of parameters associated with the short term filter. These parameters are transmitted to the receiver, which produces a representation of the original speech based upon them.
The long term filter lag L can be determined by either an open loop or a closed loop method. In the open loop method, the lag is determined directly from the input signal in the transmitter. The lag can be chosen as the delay that yields the greatest value of a normalized autocorrelation function. The autocorrelation function must be calculated for each lag that is tested.
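By way of illustration only, the open loop search described above can be sketched in Python as follows. The function names, the lag range arguments, and the particular normalization (division by the square root of the product of the two segment energies) are assumptions made for this sketch and are not taken from any specific encoder.

```python
import math

def normalized_autocorrelation(s, lag):
    """Correlate s[n] with s[n - lag], normalized by both segment energies."""
    idx = range(lag, len(s))
    num = sum(s[n] * s[n - lag] for n in idx)
    energy_now = sum(s[n] ** 2 for n in idx)
    energy_past = sum(s[n - lag] ** 2 for n in idx)
    denom = math.sqrt(energy_now * energy_past)
    return num / denom if denom > 0.0 else 0.0

def open_loop_lag(s, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the greatest normalized autocorrelation.

    The autocorrelation is recomputed for every lag tested, which is the
    source of the computational load noted in the text.
    """
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: normalized_autocorrelation(s, lag))
```

For a signal that repeats every 40 samples, a search over lags 20 through 70 would be expected to select a lag of 40, since the correlation at that delay approaches one.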
A variation of the open loop method which requires less computational loading comprises finding the maximum normalized autocorrelation of a decimated speech signal. Since fewer samples are tested, fewer computations are required. The delay of the decimated signal is multiplied by the decimation factor to obtain a delay value that corresponds to the undecimated signal. The lag found by this method has coarser resolution since it is based on a decimated signal. Greater resolution can be obtained by testing lags adjacent to the computed undecimated lag. See Juin-Hwey Chen and Allen Gersho, "Real-Time Vector APC Speech Coding at 4800 BPS with Adaptive Postfiltering", Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 4, pp. 2185-2188, April 1987.
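The decimated variation can be sketched in the same manner. The sketch below is a simplification under stated assumptions: decimation is performed by simply keeping every `factor`-th sample (a practical encoder would normally low-pass filter first), the refinement tests only the lags within `factor - 1` samples of the scaled coarse lag, and all names are hypothetical. It is not presented as the procedure of the cited reference.

```python
import math

def norm_autocorr(s, lag):
    """Normalized correlation between s[n] and s[n - lag]."""
    idx = range(lag, len(s))
    num = sum(s[n] * s[n - lag] for n in idx)
    denom = math.sqrt(sum(s[n] ** 2 for n in idx) *
                      sum(s[n - lag] ** 2 for n in idx))
    return num / denom if denom > 0.0 else 0.0

def best_lag(s, lags):
    """Lag among `lags` with the greatest normalized autocorrelation."""
    return max(lags, key=lambda lag: norm_autocorr(s, lag))

def decimated_open_loop_lag(s, min_lag, max_lag, factor=2):
    # Coarse search on the decimated signal (assumption: no anti-alias filter).
    decimated = s[::factor]
    coarse_lags = range(math.ceil(min_lag / factor), max_lag // factor + 1)
    # Scale the decimated delay back to the undecimated time base.
    coarse = best_lag(decimated, coarse_lags) * factor
    # Recover resolution by testing lags adjacent to the scaled lag.
    lo = max(min_lag, coarse - (factor - 1))
    hi = min(max_lag, coarse + (factor - 1))
    return best_lag(s, range(lo, hi + 1))
```

With a decimation factor of 2, the coarse search evaluates about half as many lags, each over about half as many samples, and the refinement adds only three full-rate evaluations.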
In a closed loop method of determining the lag, trial lags and gains of the long term filter are tested to minimize the mean square of the weighted error between the speech signal and the output of the cascaded long term and short term filters. This approach attempts to find a match between the coded data in the delay line of the long term filter and the input signal. The long term lag and gain determination is based on the actual long term filter state that will exist at the receiver where speech is synthesized. Hence, the closed loop method achieves better resolution than the open loop method but at the cost of significantly more computations.
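A minimal sketch of a closed loop search is given below for illustration. It rests on several assumptions not stated in the text above: the target is taken to be the weighted speech for one subframe with any filter ringing already removed, the weighted short term filter is modeled as a simple all-pole filter with hypothetical coefficients `a` and zero initial state, and trial lags are restricted to be at least one subframe long so that each candidate lies wholly within the past excitation held in the long term filter delay line.

```python
def synthesize(excitation, a):
    """All-pole filter y[n] = x[n] + sum_k a[k] * y[n - 1 - k], zero initial state."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, coeff in enumerate(a):
            if n - 1 - k >= 0:
                acc += coeff * y[n - 1 - k]
        y.append(acc)
    return y

def closed_loop_lag(target, past_excitation, a, min_lag, max_lag):
    """Return (lag, gain, error) minimizing the squared error against `target`.

    Each trial lag selects a segment of the long term filter state, filters it,
    and scores it with the gain that is optimal for that lag.
    """
    n = len(target)
    best = (None, 0.0, float("inf"))
    for lag in range(max(min_lag, n), max_lag + 1):
        start = len(past_excitation) - lag
        candidate = past_excitation[start:start + n]
        y = synthesize(candidate, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # a silent candidate cannot match the target
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if error < best[2]:
            best = (lag, gain, error)
    return best
```

Every trial lag requires a pass through the synthesis filter in addition to the correlation and energy sums, which illustrates why the closed loop method is considerably more demanding than the open loop method.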