The present invention relates to the field of audio coding, more specifically to the field of synthesizing an audio signal. Embodiments relate to speech coding, particularly to the speech coding technique called code excited linear predictive coding (CELP). Embodiments provide an approach for adaptive tilt compensation in shaping the codes of an innovative or fixed codebook of a CELP coder.
The CELP coding scheme is widely used in speech communications and is an efficient way of coding speech. CELP synthesizes an audio signal by feeding the sum of two excitations to a linear predictive filter (e.g., the LPC synthesis filter 1/A(z)). One excitation comes from the decoded past and is called the adaptive codebook; the other comes from a fixed or innovative codebook, which is populated by fixed codes. One problem with the CELP coding scheme is that at low bit-rates the innovative codebook is not populated densely enough to model the fine structure of speech efficiently, so that the perceptual quality is degraded and the synthesized output signal sounds noisy.
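The synthesis step described above can be illustrated with a minimal sketch. It assumes the convention A(z) = 1 + a[1]z^−1 + ... + a[p]z^−p, and it assumes that each excitation is scaled by a gain before summation; the gains and the names lpc_synthesis, celp_synthesize, gain_pitch and gain_code are illustrative and are not taken from the text.

```python
def lpc_synthesis(excitation, a):
    """Filter the excitation through the all-pole filter 1/A(z).

    Assumed convention: a[0] == 1 and A(z) = sum_k a[k] * z^-k.
    """
    out = []
    for n, x in enumerate(excitation):
        acc = x
        # Recursive part: subtract the weighted past output samples.
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out.append(acc)
    return out


def celp_synthesize(adaptive, innovative, gain_pitch, gain_code, a):
    """Sum the two excitations (with hypothetical gains) and synthesize."""
    excitation = [gain_pitch * v + gain_code * c
                  for v, c in zip(adaptive, innovative)]
    return lpc_synthesis(excitation, a)


# Illustrative first-order example: a single pulse from the innovative codebook.
frame = celp_synthesize([0.0] * 4, [1.0, 0.0, 0.0, 0.0], 1.0, 1.0, [1.0, -0.5])
```

In this example the single pulse decays geometrically through the one-pole filter, which shows how the synthesis filter imposes the spectral envelope on the excitation.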
To mitigate coding artifacts, different solutions have already been proposed and are described in reference [1] and in reference [2]. In these references, the codes of the innovative codebook are adaptively and spectrally shaped by enhancing the spectral regions corresponding to the formants of the current frame of the audio signal. The formant positions and shapes can be deduced directly from the LPC coefficients, which are available at both the encoder and the decoder. The formant enhancement of the codes c(n) of the innovative codebook is done by a simple filtering operation: c(n)*fe(n).
In this filtering process, fe(n) is the impulse response of the filter having the following transfer function:
Fe(z) = A(z/w1) / A(z/w2)
where w1 and w2 are two weighting constants emphasizing more or less the formantic structure of the transfer function Fe(z). The resulting shaped codes of the innovative codebook inherit one characteristic of the speech signal and the synthesized signal sounds less noisy.
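As a sketch of this filtering operation, the following assumes the transfer function has the form Fe(z) = A(z/w1)/A(z/w2), in which A(z/w) is obtained by scaling the k-th LPC coefficient by w^k. The function names weight_lpc and formant_enhance and the numeric values are hypothetical and serve only as an illustration.

```python
def weight_lpc(a, w):
    """Return the coefficients of A(z/w): a[k] is scaled by w**k."""
    return [coef * (w ** k) for k, coef in enumerate(a)]


def formant_enhance(code, a, w1, w2):
    """Filter a code c(n) with the assumed Fe(z) = A(z/w1) / A(z/w2)."""
    num = weight_lpc(a, w1)  # zeros, from A(z/w1)
    den = weight_lpc(a, w2)  # poles, from A(z/w2); den[0] == a[0] == 1
    out = []
    for n in range(len(code)):
        acc = 0.0
        # Feed-forward part (numerator).
        for k in range(len(num)):
            if n - k >= 0:
                acc += num[k] * code[n - k]
        # Feedback part (denominator).
        for k in range(1, len(den)):
            if n - k >= 0:
                acc -= den[k] * out[n - k]
        out.append(acc)
    return out


# Illustrative values only: a first-order A(z) and two weighting constants.
shaped = formant_enhance([1.0, 0.0, -1.0, 0.5], [1.0, -0.9], 0.75, 0.9)
```

When w1 equals w2 the numerator and denominator cancel and the code passes through unchanged, so the amount of shaping depends on how far apart the two constants are chosen.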
In the CELP coding scheme it is also usual to add a spectral tilt to the codes of the innovative codebook, which is done by filtering the codes from the innovative codebook as follows: Ft(z) = 1 − βz^−1.
The factor β is related to the voicing of the previous audio frame, and the voicing can be estimated from the energy contribution from the adaptive codebook. For example, if the previous frame is voiced, it is expected that the current frame will also be voiced and that the codes will have more energy in the low frequencies, i.e. the spectrum has a negative tilt.
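The tilt filter and a voicing estimate can be sketched as follows. The filter implements y(n) = c(n) − β·c(n−1), which corresponds to a first-order transfer function of the form 1 − βz^−1. The voicing measure shown here, the share of the adaptive codebook energy in the total excitation energy, is only one hypothetical way of expressing the "energy contribution from the adaptive codebook"; the text does not specify how β is derived from it, so no mapping to β is given.

```python
def voicing_ratio(adaptive_contrib, innovative_contrib):
    """Hypothetical voicing measure: adaptive energy over total energy (0..1)."""
    e_adaptive = sum(v * v for v in adaptive_contrib)
    e_innovative = sum(c * c for c in innovative_contrib)
    total = e_adaptive + e_innovative
    return e_adaptive / total if total > 0.0 else 0.0


def apply_tilt(code, beta):
    """Apply the first-order tilt filter: y(n) = c(n) - beta * c(n-1)."""
    out = []
    prev = 0.0  # c(-1) is assumed to be zero at the start of the frame
    for x in code:
        out.append(x - beta * prev)
        prev = x
    return out


# Illustrative values only.
ratio = voicing_ratio([3.0, 0.0], [0.0, 1.0])
tilted = apply_tilt([1.0, 1.0, 1.0], 0.5)
```

The gain of this filter is 1 − β at zero frequency and 1 + β at the Nyquist frequency, so the sign and magnitude of β determine the direction and strength of the tilt that is added to the codes.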