The term "wideband" is used in the speech coding field to indicate that the signal to be coded has a bandwidth greater than the roughly 3 kHz bandwidth of the conventional telephone band, in particular a band extending from about 50 Hz to about 7 kHz. The use of a wider band than the conventional telephone band allows a higher quality of the coded signals to be obtained, as required or desired for certain services offered by integrated services digital networks, such as audioconference, videophone, commentary channels, etc., and also for cordless telephones.
In cases in which the coded signal must be transmitted at relatively low bit rates (for example 16-32 kbits/s), the use of the analysis-by-synthesis coding technique has already been suggested. This technique gives the highest coding gains at these rates. In particular, the paper "Experiments on 7 kHz audio coding at 16 kbits/s", presented by R. Drogo de Iacovo et al. at ICASSP '89, Glasgow (UK), 23-26 May 1989, paper S4.19, and European Patent Application EP-A-0 396 121, disclose a system in which the signal to be coded is divided into two sub-bands whose signals are coded at the same time, and examples are supplied of coders in which a multipulse excitation or an excitation consisting of vectors selected in an appropriate codebook (CELP=Codebook Excited Linear Prediction technique) is exploited.
In this known system, the coders of the two sub-bands operate on sample groups or frames with a 15-20 ms duration, and this clearly implies a coding delay at least equal to the duration of the frames themselves. For certain applications, such as cordless telephones, audiographic conferences, etc., it is essential to have a low coding delay, so as to reduce the effects of acoustical and electrical echoes. To obtain a low delay in schemes such as that shown in the above-mentioned European Patent Application, one cannot simply resort to very short frames (a few ms), because this would necessitate frequent updating of the coding parameters, with a consequent increase in the information to be transmitted to the decoder and therefore in the bit rate.
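The trade-off between frame duration and bit rate can be illustrated with simple arithmetic. The following is only a sketch: the figure of 40 bits per parameter update is a hypothetical value chosen for illustration and is not taken from the cited documents, and the function name is likewise an assumption.

```python
def side_info_rate_kbps(bits_per_update, frame_ms):
    """Bit rate (kbit/s) spent on parameters that are sent once per frame.

    bits / ms is numerically equal to kbit/s.
    """
    if frame_ms <= 0:
        raise ValueError("frame duration must be positive")
    return bits_per_update / frame_ms


# Hypothetical side-information budget per parameter update (assumption).
BITS_PER_UPDATE = 40

if __name__ == "__main__":
    for frame_ms in (20.0, 5.0, 2.0):
        rate = side_info_rate_kbps(BITS_PER_UPDATE, frame_ms)
        print(f"frame {frame_ms:4.1f} ms: side information {rate:.1f} kbit/s")
```

Under this assumption, a 20 ms frame spends 2 kbit/s on parameter updates, whereas a 2 ms frame would spend 20 kbit/s, which by itself exceeds a 16 kbit/s budget.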
To realize low-delay coders using short-duration frames, without increasing the bit rate, it has been suggested to use CELP techniques in which the spectral parameters are computed starting from the signal reconstructed at the transmitter ("backward" CELP technique). According to these techniques, for each frame, the prediction units receive the set of parameters determined in the previous frame, estimate at each new sample a possible updated value of the parameters, and supply as actual values those estimated after receiving the last sample. An example of this type of low-delay coder is described in the CCITT draft Recommendation G.728 "Coding of Speech at 16 kbit/s Using Low-Delay Code Excited Linear Prediction" and in the paper "High-quality 16 kb/s speech coding with a one-way delay less than 2 ms", presented by J. H. Chen at ICASSP '90, Albuquerque (USA), April 3-6, paper S9.1. In this coder, designed for coding audio signals with the conventional telephone band, backward adaptation techniques are used to update the predictor coefficients in the synthesis filters (comprising only short-term predictors) and the gain with which the excitation vectors are scaled. In particular, the predictor coefficients of the synthesis filters are updated by means of an LPC analysis of the previously quantized speech; the coefficients of the weighting filters are updated by means of an LPC analysis of the input signal; and the vector gain is updated by using the gain information incorporated in the previously quantized excitation. In this way only the index of the word in the codebook (structured in excitation gain and shape) must be transmitted, since the predictor coefficients of the synthesis filter and the backward-adapted gain can be determined in the receiver by backward adaptation circuits similar to those used in the transmitter.
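The principle of backward adaptation of the predictor coefficients can be sketched as follows. This is a minimal illustration, not the procedure of the cited Recommendation: it assumes a plain rectangular analysis window and the textbook autocorrelation method with the Levinson-Durbin recursion, and all function names are hypothetical. The point it shows is that the coefficients depend only on previously reconstructed samples, so transmitter and receiver can derive identical values without any coefficients being transmitted.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x (rectangular window)."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err), where a[k-1] is the coefficient of x[n-k] in the
    prediction x_hat[n] = sum_k a[k-1] * x[n-k], and err is the residual
    prediction error energy.
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:               # silent or degenerate input: stop early
            break
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err              # reflection coefficient of stage m
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:], err


def backward_lpc(reconstructed_history, order):
    """Predictor coefficients derived only from past reconstructed samples."""
    r = autocorrelation(reconstructed_history, order)
    coeffs, _ = levinson_durbin(r, order)
    return coeffs


if __name__ == "__main__":
    # Toy "previously quantized speech": a decaying first-order sequence.
    history = [0.9 ** n for n in range(200)]
    tx_coeffs = backward_lpc(history, order=2)        # at the transmitter
    rx_coeffs = backward_lpc(list(history), order=2)  # at the receiver
    print(tx_coeffs == rx_coeffs)  # same history gives the same coefficients
```

Because the analysis uses no information that the receiver lacks, the frame can be made very short without adding side information to the bit stream.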
The quality loss which could occur as a result of dispensing with a long-term predictor is compensated for by the use of a relatively high prediction order for the short-term predictors, in particular a prediction order equal to 50. In any case, the short-term prediction order cannot be raised beyond a certain limit for reasons of computational complexity.
In the case of sub-band coding, the use of different prediction orders in the different sub-bands has been suggested. In particular, in the coder described in the said paper by R. Drogo de Iacovo et al. (in which long-term correlations are exploited), filters with prediction order 10 for the lower sub-band and order 4 for the upper sub-band are used. These prediction orders are fixed. Good results are obtained in this way for actual speech, but not for signals with highly variable characteristics, such as music.