The most frequently used paradigm in speech coding is Algebraic Code Excited Linear Prediction (ACELP), which is used in standards such as the AMR family, G.718 and MPEG USAC [1-3]. It is based on modelling speech using a source model, consisting of a linear predictor (LP) to model the spectral envelope, a long-term predictor (LTP) to model the fundamental frequency and an algebraic codebook for the residual.
The coefficients of the linear predictive model are very sensitive to quantization, so they are usually first transformed to Line Spectral Frequencies (LSFs) or Immittance Spectral Frequencies (ISFs) before quantization. The LSF/ISF domains are robust to quantization errors, and in these domains the stability of the predictor can be readily preserved, which makes them suitable domains for quantization [4].
The LSFs/ISFs, in the following referred to as frequency values, can be obtained from a linear predictive polynomial A(z) of order m as follows. The Line Spectrum Pair polynomials are defined as
$$P(z) = A(z) + z^{-m-l} A(z^{-1}), \qquad Q(z) = A(z) - z^{-m-l} A(z^{-1}) \qquad (1)$$
where l=1 for the Line Spectrum Pair and l=0 for the Immittance Spectrum Pair representation, but any l≥0 is in principle valid. In the following, it will thus be assumed only that l≥0.
Note that the original predictor can be reconstructed using A(z)=½ [P(z)+Q(z)]. The polynomials P(z) and Q(z) thus contain all the information of A(z).
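As an illustration of Equation (1) and of the reconstruction A(z)=½ [P(z)+Q(z)], the following minimal sketch builds the coefficient lists of P(z) and Q(z) and then recovers the predictor. The function names `lsp_polynomials` and `reconstruct_predictor` and the second-order example predictor are hypothetical and chosen only for this example. Coefficients are assumed to be stored in ascending powers of z^-1.

```python
def lsp_polynomials(a, l=1):
    """Return coefficient lists of P(z) and Q(z) in ascending powers of z^-1.

    a : coefficients a_0..a_m of A(z) = sum_k a_k z^-k (illustrative layout)
    l : the non-negative integer from Equation (1); l=1 gives LSP, l=0 gives ISP
    """
    forward = list(a) + [0.0] * l          # A(z), padded to degree m + l
    mirrored = [0.0] * l + list(a)[::-1]   # z^-(m+l) * A(1/z)
    p = [f + r for f, r in zip(forward, mirrored)]
    q = [f - r for f, r in zip(forward, mirrored)]
    return p, q


def reconstruct_predictor(p, q, l=1):
    """Recover A(z) as (P(z) + Q(z)) / 2 and drop the l padding zeros."""
    half_sum = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return half_sum[:len(half_sum) - l]


# Hypothetical stable predictor A(z) = 1 - 0.9 z^-1 + 0.2 z^-2 (roots 0.5 and 0.4).
a = [1.0, -0.9, 0.2]
p, q = lsp_polynomials(a, l=1)
print(p)                            # symmetric coefficient list
print(q)                            # antisymmetric coefficient list
print(reconstruct_predictor(p, q))  # equals a up to rounding
```

In this layout P(z) reads the same forwards and backwards, while Q(z) changes sign when reversed, which is the symmetry that the root-finding methods discussed later exploit.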
The central property of LSP/ISP polynomials is that if and only if A(z) has all its roots inside the unit circle, then the roots of P(z) and Q(z) are interlaced on the unit circle. Since the roots of P(z) and Q(z) are on the unit circle, they can be represented by their angles only. These angles correspond to frequencies and since the spectra of P(z) and Q(z) have vertical lines in their logarithmic magnitude spectra at frequencies corresponding to the roots, the roots are referred to as frequency values.
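The interlacing property can be checked numerically with a rough sketch such as the one below. It assumes l=1 and coefficient lists in ascending powers of z^-1, evaluates P(z) and Q(z) on the unit circle, removes the linear phase so that the symmetric polynomial becomes purely real and the antisymmetric one purely imaginary, and locates roots in the open interval (0, π) by sign changes. The function names, the grid size and the example polynomials (derived from the hypothetical predictor A(z) = 1 - 0.9 z^-1 + 0.2 z^-2) are illustrative assumptions, and the result is accurate only to the grid resolution.

```python
import cmath
import math


def zero_phase(coeffs, w):
    """Evaluate the polynomial at z = exp(jw) and remove its linear phase."""
    n = len(coeffs) - 1
    value = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(coeffs))
    return value * cmath.exp(1j * w * n / 2.0)


def unit_circle_roots(coeffs, antisymmetric, grid=2000):
    """Approximate root angles in (0, pi) from sign changes on a midpoint grid."""
    def amplitude(w):
        v = zero_phase(coeffs, w)
        return v.imag if antisymmetric else v.real

    angles = []
    w_prev = math.pi * 0.5 / grid
    f_prev = amplitude(w_prev)
    for i in range(1, grid):
        w = math.pi * (i + 0.5) / grid
        f = amplitude(w)
        if f_prev * f < 0.0:
            angles.append(0.5 * (w_prev + w))  # centre of the bracketing cell
        w_prev, f_prev = w, f
    return angles


def interlacing_pattern(p, q, grid=2000):
    """Return the sorted list of (angle, label) pairs, label being 'P' or 'Q'."""
    tagged = [(w, "P") for w in unit_circle_roots(p, False, grid)]
    tagged += [(w, "Q") for w in unit_circle_roots(q, True, grid)]
    return sorted(tagged)


# P(z) and Q(z) of the hypothetical predictor A(z) = 1 - 0.9 z^-1 + 0.2 z^-2, l = 1.
p = [1.0, -0.7, -0.7, 1.0]
q = [1.0, -1.1, 1.1, -1.0]
pattern = interlacing_pattern(p, q)
labels = [name for _, name in pattern]
print(pattern)
print(all(x != y for x, y in zip(labels, labels[1:])))  # labels alternate
```

For a stable predictor of order m with l=1, one would expect m roots in (0, π) with alternating labels. If fewer roots are found, some roots have left the unit circle, which indicates an unstable predictor.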
It follows that the frequency values encode all information of the predictor A(z). Moreover, it has been found that frequency values are robust to quantization errors, such that a small error in one of the frequency values produces a small error in the spectrum of the reconstructed predictor, and this error is localized in the spectrum near the corresponding frequency. Due to these favorable properties, quantization in the LSF or ISF domains is used in all mainstream speech codecs [1-3].
One of the challenges in using frequency values is, however, finding their locations efficiently from the coefficients of the polynomials P(z) and Q(z). After all, finding the roots of polynomials is a classic and difficult problem. The previously proposed methods for this task include the following approaches:
One of the early approaches uses the fact that zeros reside on the unit circle, whereby they appear as zeros in the magnitude spectrum [5]. By taking the discrete Fourier transform of the coefficients of P(z) and Q(z), one can thus search for valleys in the magnitude spectrum. Each valley indicates the location of a root and if the spectrum is upsampled sufficiently, one can find all roots. This method however yields only an approximate position, since it is difficult to determine the exact position from the valley location.
The most frequently used approach is based on Chebyshev polynomials and was presented in [6]. It relies on the realization that the polynomials P(z) and Q(z) are symmetric and antisymmetric, respectively, whereby they contain plenty of redundant information. By removing trivial zeros at z=±1 and with the substitution $x = z + z^{-1}$ (which is known as the Chebyshev transform), the polynomials can be transformed to an alternative representation $F_P(x)$ and $F_Q(x)$. These polynomials are half the order of P(z) and Q(z) and they have only real roots in the range −2 to +2. Note that the polynomials $F_P(x)$ and $F_Q(x)$ are real-valued when x is real. Moreover, since the roots are simple, $F_P(x)$ and $F_Q(x)$ will have a zero-crossing at each of their roots.
In speech codecs such as the AMR-WB, this approach is applied such that the polynomials $F_P(x)$ and $F_Q(x)$ are evaluated on a fixed grid on the real axis to find all zero-crossings. The root locations are further refined by linear interpolation around the zero-crossing. The advantage of this approach is the reduced complexity due to omission of redundant coefficients.
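The Chebyshev-domain grid search described above can be sketched as follows. This is a simplified illustration and not the routine of any particular codec. It assumes l=1 and an even predictor order m, so that P(z) has its trivial zero at z=−1 and Q(z) at z=+1. It also assumes a uniform grid of 100 points on x ∈ [−2, 2], and the helper names `deflate`, `eval_chebyshev` and `grid_search_frequencies` are hypothetical. After deflation, a symmetric polynomial of degree 2n with coefficients c_0..c_2n is evaluated as F(x) = c_n + Σ_k c_(n−k) C_k(x), where C_k(x) = z^k + z^−k follows the recurrence C_(k+1) = x·C_k − C_(k−1) with C_0 = 2 and C_1 = x.

```python
import math


def deflate(coeffs, sign):
    """Divide by (1 + sign * z^-1); sign=+1 removes z=-1, sign=-1 removes z=+1."""
    out = []
    carry = 0.0
    for c in coeffs[:-1]:
        carry = c - sign * carry   # synthetic division step
        out.append(carry)
    return out


def eval_chebyshev(sym, x):
    """Evaluate F(x), x = z + 1/z, for symmetric coefficients of even degree."""
    n = len(sym) // 2
    total = sym[n]
    c_prev, c_cur = 2.0, x         # C_0(x) and C_1(x)
    for k in range(1, n + 1):
        total += sym[n - k] * c_cur
        c_prev, c_cur = c_cur, x * c_cur - c_prev
    return total


def grid_search_frequencies(sym, grid_points=100):
    """Find zero-crossings of F(x) on [-2, 2] and refine by linear interpolation."""
    freqs = []
    step = 4.0 / grid_points
    x_prev = 2.0
    f_prev = eval_chebyshev(sym, x_prev)
    for i in range(1, grid_points + 1):
        x = 2.0 - i * step
        f = eval_chebyshev(sym, x)
        x_root = None
        if f == 0.0:
            x_root = x                                   # root exactly on the grid
        elif f_prev * f < 0.0:
            x_root = x_prev - f_prev * (x - x_prev) / (f - f_prev)
        if x_root is not None:
            cos_w = max(-1.0, min(1.0, x_root / 2.0))    # guard against rounding
            freqs.append(math.acos(cos_w))               # x = 2 cos(w)
        x_prev, f_prev = x, f
    return freqs


# P(z) and Q(z) of the hypothetical predictor A(z) = 1 - 0.9 z^-1 + 0.2 z^-2, l = 1.
fp = deflate([1.0, -0.7, -0.7, 1.0], +1.0)
fq = deflate([1.0, -1.1, 1.1, -1.0], -1.0)
print(grid_search_frequencies(fp), grid_search_frequencies(fq))
```

Scanning x from +2 down to −2 returns the frequencies in ascending order, since x = 2cos(ω) decreases as ω goes from 0 to π. A coarse grid can miss two roots that fall within the same grid cell, which is one reason the grid density matters in practice.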
While the methods described above work sufficiently well in existing codecs, they do have a number of problems.