An autoregressive all-pole model is often used to model the short-term spectral envelope in speech and audio coding. An input signal is acquired in units of a specified length, called frames, and the parameters of the model are encoded and transmitted to a decoder, together with other parameters, as transmission information. The autoregressive all-pole model is generally estimated by linear prediction and represented as a linear prediction synthesis filter.
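As a minimal sketch of what a linear prediction synthesis filter does, the following Python function filters an excitation signal through 1/A(z). The function name `synthesize` and the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p are assumptions made for illustration only; they are not taken from the cited literature.

```python
def synthesize(excitation, a):
    """Run an excitation through the all-pole filter 1/A(z).

    Assumed convention: A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p, with a[0] == 1.
    """
    order = len(a) - 1
    out = []
    for n, x in enumerate(excitation):
        y = x
        # Subtract the contribution of past output samples (the filter memory).
        for i in range(1, order + 1):
            if n - i >= 0:
                y -= a[i] * out[n - i]
        out.append(y)
    return out


# A unit impulse through a first-order filter decays geometrically.
impulse_response = synthesize([1.0, 0.0, 0.0, 0.0], [1.0, -0.5])
```

The past output samples held in `out` correspond to the internal state of the filter, which is why the state depends on the sampling frequency at which the filter is defined.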
One of the latest typical speech and audio coding techniques is ITU-T Recommendation G.718. The Recommendation describes in detail a typical frame structure for coding using a linear prediction synthesis filter, as well as methods for estimating, encoding, interpolating, and using the linear prediction synthesis filter. Speech and audio coding based on linear prediction is also described in detail in Patent Literature 2.
In speech and audio coding that can handle various input/output sampling frequencies and operate over a wide range of bit rates, which vary from frame to frame, it is generally required to change the internal sampling frequency of the encoder. Because the same operation is also required in the decoder, decoding is performed at the same internal sampling frequency as in the encoder. FIG. 1 shows an example where the internal sampling frequency changes. In this example, the internal sampling frequency is 16,000 Hz in frame i and 12,800 Hz in the previous frame i-1. The linear prediction synthesis filter that represents the characteristics of the input signal in the previous frame i-1 must either be estimated again after re-sampling the input signal at the changed internal sampling frequency of 16,000 Hz, or be converted to a filter corresponding to the changed internal sampling frequency of 16,000 Hz. The linear prediction synthesis filter needs to be obtained at the changed internal sampling frequency in order to obtain the correct internal state of the linear prediction synthesis filter for the current input signal, and in order to perform interpolation that yields a temporally smoother model.
One method for obtaining another linear prediction synthesis filter from the characteristics of a given linear prediction synthesis filter is to compute the converted filter from a desired frequency response obtained by conversion in the frequency domain, as shown in FIG. 2. In this example, LSF coefficients are input as the parameters representing the linear prediction synthesis filter. The input may instead be LSP coefficients, ISF coefficients, ISP coefficients, or reflection coefficients, which are generally known as parameters equivalent to linear prediction coefficients. First, linear prediction coefficients are calculated in order to obtain the power spectrum Y(ω) of the linear prediction synthesis filter at the first internal sampling frequency (001). This step can be omitted when the linear prediction coefficients are already known. Next, the power spectrum Y(ω) of the linear prediction synthesis filter, which is determined by the obtained linear prediction coefficients, is calculated (002). Then, the obtained power spectrum is modified to a desired power spectrum Y′(ω) (003). Autocorrelation coefficients are calculated from the modified power spectrum (004). Finally, linear prediction coefficients are calculated from the autocorrelation coefficients (005). The relationship between the autocorrelation coefficients and the linear prediction coefficients is known as the Yule-Walker equations, and the Levinson-Durbin algorithm is well known as a method for solving them.
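Steps (002) to (005) can be sketched in Python with NumPy as follows. This is an illustrative sketch under stated assumptions, not the procedure of FIG. 2 itself: the function names `levinson_durbin` and `convert_lpc`, the grid size `nfft=512`, and the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p are hypothetical. In particular, the text does not specify how Y′(ω) is formed; the sketch assumes that the spectrum is re-read on the new frequency grid and held constant above the old Nyquist frequency when the sampling frequency increases.

```python
import numpy as np


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for a (with a[0] == 1) from autocorrelation r."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient of stage i
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k  # updated prediction error energy
    return a


def convert_lpc(a, fs_in, fs_out, nfft=512):
    """Convert LPC coefficients from fs_in to fs_out via the power spectrum."""
    a = np.asarray(a, dtype=float)
    order = len(a) - 1
    # Frequencies (Hz) of the uniform grid at the new sampling frequency.
    f = np.arange(nfft // 2 + 1) * fs_out / nfft
    # (003) Assumption: hold the spectrum at its old-Nyquist value above fs_in / 2.
    f = np.minimum(f, fs_in / 2.0)
    omega = 2.0 * np.pi * f / fs_in
    # (002) Power spectrum 1 / |A(e^{j omega})|^2, evaluated at the mapped frequencies.
    A = np.exp(-1j * np.outer(omega, np.arange(order + 1))) @ a
    power = 1.0 / np.abs(A) ** 2
    # (004) Autocorrelation as the inverse transform of the power spectrum.
    r = np.fft.irfft(power, n=nfft)[: order + 1]
    # (005) Linear prediction coefficients from the autocorrelation.
    return levinson_durbin(r, order)


# Example: convert a second-order filter from 12,800 Hz to 16,000 Hz.
a_16k = convert_lpc([1.0, -1.2, 0.5], 12800, 16000)
```

Evaluating the spectrum on a dense grid and transforming it back dominates the cost of this sketch, since the number of grid points is much larger than the filter order.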
This algorithm is effective for converting the sampling frequency of the linear prediction synthesis filter described above. This is because linear prediction analysis generally uses a look-ahead signal, that is, a signal temporally ahead of the signal in the frame to be encoded, but the look-ahead signal is not available when linear prediction analysis is performed again in a decoder.
As described above, in speech and audio coding with two different internal sampling frequencies, it is preferable to use a power spectrum in order to convert the internal sampling frequency of a known linear prediction synthesis filter. However, because calculating a power spectrum is computationally complex, there is a problem that the amount of computation is large.