The present invention concerns a process for quantizing the predictor filter of vocoders of very low bit rate.
It concerns more particularly linear prediction vocoders similar to those described, for example, in the Technical Review THOMSON-CSF, volume 14, No. 3, September 1982, pages 715 to 731, according to which the speech signal is identified at the output of a digital filter whose input receives either a periodic waveform, corresponding to voiced sounds such as vowels, or a variable waveform, corresponding to unvoiced sounds such as most consonants.
It is known that the auditory quality of linear prediction vocoders depends heavily on the precision with which their predictor filter is quantized, and that this quality decreases when the data rate between vocoders decreases, because the precision of filter quantization then becomes insufficient. Generally, the speech signal is segmented into independent frames of constant duration and the filter is renewed at each frame. Thus, to reach a rate of about 1820 bits per second, it is necessary, according to a standard embodiment, to represent the filter by a 41-bit packet transmitted every 22.5 milliseconds. For non-standard links of lower bit rate, of the order of 800 bits per second, fewer than 800 bits per second must be transmitted to represent the filter, in other words a data rate three times lower than in standard embodiments. Nevertheless, to obtain a satisfactory precision of the predictor filter, the classic approach is to implement the vector quantization method, which is intrinsically more efficient than the method used in standard systems, where the 41 bits enable scalar quantization of the P=10 coefficients of the predictor filter. The method is based on the use of a dictionary containing a known number of standard filters obtained by learning. It consists in transmitting only the page or index of the standard filter which is nearest to the ideal one. The advantage lies in the reduction of the bit rate which is obtained, only 10 to 15 bits per filter being transmitted instead of the 41 bits necessary in scalar quantization mode. However, this reduction in bit rate is obtained at the expense of a very large increase in the size of the memory needed to store the dictionary, and of much more computation, owing to the complexity of the algorithm used to search for filters in the dictionary.
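By way of illustration only, the figures and the dictionary search described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the toy dictionary and the use of a squared Euclidean distance on coefficient vectors are hypothetical choices made for the example, and the text does not specify which distance or which filter parametrization an actual vocoder would use.

```python
def filter_bit_rate(bits_per_frame, frame_ms):
    """Bit rate (bits per second) spent on the filter alone."""
    return bits_per_frame * 1000.0 / frame_ms


def index_bits(dictionary_size):
    """Number of bits needed to transmit an index into the dictionary."""
    return (dictionary_size - 1).bit_length()


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed, illustrative measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(dictionary, coeffs):
    """Exhaustive search: index of the standard filter nearest to coeffs."""
    return min(range(len(dictionary)),
               key=lambda i: squared_distance(dictionary[i], coeffs))


# Scalar quantization: 41 bits every 22.5 ms, about 1822 bits per second.
scalar_rate = filter_bit_rate(41, 22.5)

# Vector quantization: a 10-bit index addresses 1024 standard filters,
# a 15-bit index addresses 32768 of them.
bits_small = index_bits(1024)
bits_large = index_bits(32768)

# Toy dictionary of three 2-coefficient "filters" (hypothetical values).
toy_dictionary = [[0.0, 0.0], [1.0, 1.0], [0.5, -0.5]]
chosen = nearest_index(toy_dictionary, [0.9, 0.8])
```

The exhaustive search visits every entry, so its cost and the memory needed both grow in proportion to the dictionary size, which doubles with each additional index bit.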
Unfortunately, the dictionary which is created is never universal and in fact only allows filters which are close to the learning base to be quantized correctly. Consequently, it seems that the dictionary cannot both be of reasonable size and allow satisfactory quantization of the prediction filters resulting from speech analysis for all speakers, for all languages and for all sound recording conditions.
Finally, where standard quantizations are vectorial, they aim above all to minimize the spectral distance between the original filter and the transmitted quantized filter, and it is not guaranteed that this method is the best in view of the psychoacoustic properties of the ear, which cannot be considered to be simply those of a spectrum analyser.
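To make the notion of spectral distance concrete, the following minimal sketch computes a root-mean-square log-spectral distance, in decibels, between two all-pole filters 1/A(z). The text does not define the distance it has in mind; this measure, the function names and the sign convention A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ... are assumptions chosen for the example.

```python
import cmath
import math


def log_spectrum_db(a, n_points=128):
    """Log-magnitude spectrum in dB of 1/A(z), sampled on [0, pi).

    Assumed convention: A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ...
    The filter is assumed stable, so A is never zero on the unit circle.
    """
    spectrum = []
    for i in range(n_points):
        w = math.pi * i / n_points
        a_of_w = 1.0 + sum(c * cmath.exp(-1j * w * (k + 1))
                           for k, c in enumerate(a))
        spectrum.append(-20.0 * math.log10(abs(a_of_w)))
    return spectrum


def spectral_distance_db(a_ref, a_quant, n_points=128):
    """RMS difference between the two log spectra (illustrative measure)."""
    s_ref = log_spectrum_db(a_ref, n_points)
    s_quant = log_spectrum_db(a_quant, n_points)
    mean_sq = sum((x - y) ** 2 for x, y in zip(s_ref, s_quant)) / n_points
    return math.sqrt(mean_sq)


# Hypothetical first-order filters: an original and a coarsely quantized copy.
original = [-0.9]
quantized = [-0.8]
distance = spectral_distance_db(original, quantized)
```

A measure of this kind weights all frequencies equally, which is one way of seeing why it treats the ear as a simple spectrum analyser.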