In the digital coding of speech, a two-part model based on human speech production is often used. The model comprises first the formation of an excitation (in human beings, the vibration of the vocal cords or a constriction in the vocal tract) and then the shaping of that excitation in the vocal tract. The filtering operation used in a speech coder to model the shaping performed by the vocal tract is generally termed short-term filtering or short-term modelling. For the efficient coding of the excitation signal, various methods and models have been developed which lower the bit rate required to transmit the excitation signal without significantly impairing the quality of the speech signal. At present, the most effective speech coders are those that employ the analysis-by-synthesis method to search for a representation of the excitation signal that can be transmitted at the smallest possible bit rate, a notable example being Code Excited Linear Prediction (see, for example, U.S. Pat. No. 4,817,157). Effective methods have also been developed for coding the parameters of a short-term filtering model, such as transmission in the Line Spectrum Pair format (see F. K. Soong, B. H. Juang: "Optimal quantization of LSP parameters using delayed decisions", Proceedings of the 1990 International Conference on Acoustics, Speech and Signal Processing).
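The two-part model described above can be illustrated with a minimal sketch, assuming an all-pole short-term filter 1/A(z) with A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p. The function names, the second-order coefficient set and the pulse-train excitation are hypothetical values chosen only for illustration; they are not taken from any particular coder.

```python
def synthesize(excitation, a):
    """All-pole short-term filter: s[n] = e[n] - sum_{k=1..p} a[k] * s[n-k]."""
    speech = []
    for n, e in enumerate(excitation):
        acc = e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * speech[n - k]
        speech.append(acc)
    return speech


def inverse_filter(speech, a):
    """Analysis filter A(z): e[n] = s[n] + sum_{k=1..p} a[k] * s[n-k]."""
    residual = []
    for n in range(len(speech)):
        acc = speech[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc += a[k] * speech[n - k]
        residual.append(acc)
    return residual


# Hypothetical excitation: a pulse train standing in for vocal cord vibration.
excitation = [1.0 if n % 40 == 0 else 0.0 for n in range(160)]

# Hypothetical, stable second-order short-term model (a[0] is always 1).
a = [1.0, -1.2, 0.8]

speech = synthesize(excitation, a)      # shaping by the "vocal tract" filter
residual = inverse_filter(speech, a)    # removing the shaping again

print(max(abs(r - e) for r, e in zip(residual, excitation)))
```

Because the inverse filter undoes the shaping exactly in this sketch, the residual equals the excitation up to rounding error. This is why a coder can treat the filter parameters and the excitation signal as two separate quantities to be coded.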
Although efficient methods have been developed for transmitting both the excitation signal and the filtering model, the previously presented methods have not taken into account the fact that the shaping performed in the vocal tract differs in type for different types of sounds and can therefore be modelled in different ways in a short-term filter. For this reason, in order to achieve speech coding that is as efficient as possible, the order of the filtering should be adapted to the speech signal to be coded. In previously known methods, fixed-order filter modelling has meant that the order of modelling in use is needlessly large for unvoiced sounds (consonants), whose spectrum is relatively evenly distributed, and the resources spent on this order of modelling could be better used in coding the excitation signal or in error correction coding. For voiced sounds, on the other hand, a fixed order easily leads to an excessively low-order filtering model, even though the formant structure of the spectrum of voiced sounds could be modelled significantly more efficiently with a larger order of modelling.
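The dependence of the useful model order on the type of sound can be illustrated with a minimal sketch. It assumes linear-prediction analysis by the Levinson-Durbin recursion and a hypothetical selection rule: take the smallest order whose prediction-error energy lies within `tol_db` of that of the maximum order. The rule, the threshold, the function names and the test signals (white noise for a flat spectrum, noise passed through two resonances for a formant-like spectrum) are assumptions made for illustration, not a method described in the text.

```python
import math
import random


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_errors(r, max_order):
    """Prediction-error energies E[0..max_order] from the Levinson-Durbin recursion."""
    a = [1.0] + [0.0] * max_order
    err = [r[0]]
    for m in range(1, max_order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err[-1]                      # reflection coefficient
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + k_m * a[m - k]
        new_a[m] = k_m
        a = new_a
        err.append(err[-1] * (1.0 - k_m * k_m))
    return err


def choose_order(frame, max_order=12, tol_db=1.0):
    """Hypothetical rule: smallest order within tol_db of the max-order error."""
    err = levinson_errors(autocorr(frame, max_order), max_order)
    for p in range(max_order + 1):
        if 10.0 * math.log10(err[p] / err[max_order]) < tol_db:
            return p
    return max_order


def resonant_frame(noise, poles):
    """Pass noise through an all-pole filter built from (radius, angle) pole pairs."""
    a = [1.0]
    for radius, angle in poles:
        sec = [1.0, -2.0 * radius * math.cos(angle), radius * radius]
        a = [sum(a[i] * sec[j - i] for i in range(len(a)) if 0 <= j - i < 3)
             for j in range(len(a) + 2)]
    out = []
    for n, e in enumerate(noise):
        out.append(e - sum(a[k] * out[n - k]
                           for k in range(1, len(a)) if n - k >= 0))
    return out


random.seed(0)
flat_frame = [random.gauss(0.0, 1.0) for _ in range(800)]
formant_frame = resonant_frame([random.gauss(0.0, 1.0) for _ in range(800)],
                               [(0.95, 0.3), (0.95, 1.2)])

p_flat = choose_order(flat_frame)
p_formant = choose_order(formant_frame)
print("flat spectrum order:", p_flat, "formant-like spectrum order:", p_formant)
```

In this sketch the flat-spectrum frame gains almost nothing from additional coefficients, whereas the frame with two resonances needs at least four. This mirrors the observation that a single fixed order is too large for one class of sounds and may be too small for the other.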