Many techniques exist for compressing (with loss) an audio frequency signal such as speech or music. The coding can be performed directly at the sampling frequency of the input signal, as for example in ITU-T recommendations G.711 or G.729 where the input signal is sampled at 8 kHz and the coder and decoder operate at this same frequency.
However, some coding methods use a change of sampling frequency, for example to reduce the complexity of the coding, adapt the coding according to the different frequency subbands to be coded, or convert the input signal so that it corresponds to a predefined internal sampling frequency of the coder.
In the subband coding defined in ITU-T recommendation G.722, the 16 kHz input signal is divided into two subbands (sampled at 8 kHz) which are coded separately by a coder of ADPCM (adaptive differential pulse code modulation) type. This division into two subbands is performed by a bank of quadrature mirror filters with finite impulse response (FIR) of order 23, which theoretically brings about an analysis-synthesis delay (coder+decoder) of 23 samples at 16 kHz; this filter bank is employed with a polyphase implementation. The division into two subbands in G.722 makes it possible to allocate, in a predetermined manner, different bit rates to the two subbands according to their a priori perceptual importance and also to reduce the overall coding complexity by executing two coders of ADPCM type at a lower frequency. However, it induces an algorithmic delay compared to a direct ADPCM coding.
Various methods for changing the sampling frequency of a digital signal, also called resampling, are known. They use, for example and in a nonexhaustive manner, an FIR (finite impulse response) filter, an IIR (infinite impulse response) filter or a polynomial interpolation (including splines). A review of the conventional resampling methods can be found for example in the article by R. W. Schafer, L. R. Rabiner, A Digital Signal Processing Approach to Interpolation, Proceedings of the IEEE, vol. 61, No. 6, June 1973, pp. 692-702.
The advantage of the (symmetrical) FIR filter lies in its simplified implementation and, subject to certain conditions, in the possibility of ensuring a linear phase. A linear phase filtering makes it possible to preserve the waveform of the input signal, but it can also be accompanied by a temporal spreading (ringing) that can create artifacts of pre-echo type on transients. This method brings about a delay (which is a function of the length of the impulse response), generally of the order of 1 to a few ms to ensure suitable filtering characteristics (in-band ripple, rejection level sufficient to eliminate the aliasing or spectral images, etc.).
Another alternative for resampling is to use a polynomial interpolation technique. Polynomial interpolation is effective above all for up-sampling or for down-sampling between frequencies that are close (for example from 16 kHz to 12.8 kHz).
For down-sampling with a high ratio (for example from 32 kHz to 12.8 kHz), polynomial interpolation is not the most suitable method because it does not eliminate the aliasing due to the high frequencies (in the example of down-sampling from 32 kHz to 12.8 kHz, this concerns the frequencies from 6.4 kHz to 16 kHz). The advantage of polynomial interpolation over the filtering techniques is the low delay, even a zero delay, and also the generally lower complexity. The use of interpolation is advantageous above all for the resampling of vectors of short length (of the order of 10 or so samples) such as, for example, a filter memory, as described later in an embodiment of the invention.
The best known and most widely used polynomial interpolation techniques are linear interpolation, parabolic interpolation and cubic interpolation, in several variants depending on the local or non-local nature of the interpolation and on the possible constraints of continuity of the kth derivatives.
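The aliasing range quoted above can be checked with a short sketch. It is illustrative only: the helper name alias_frequency is hypothetical, and it assumes that no low-pass filtering at all is applied before the rate reduction, so a tone is simply re-sampled at the output frequency.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz appears once sampled at fs_hz.

    Assumption: ideal sampling with no anti-aliasing filter in front.
    """
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))


FS_OUT = 12800  # output rate of the 32 kHz -> 12.8 kHz example

# Tones between the new Nyquist frequency (6.4 kHz) and the old one (16 kHz)
for f in (7000, 8000, 12000, 16000):
    print(f, "Hz folds to", alias_frequency(f, FS_OUT), "Hz")
```

Every tone in the 6.4 kHz to 16 kHz range lands somewhere below 6.4 kHz, which is why an interpolation without sufficient rejection is poorly suited to this ratio.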
Here, the simple case of so-called Lagrange interpolation, where the parameters of a polynomial curve are identified from predefined points, is considered in more detail. It is assumed that this interpolation is repeated locally if the number of points to be interpolated is greater than the number of predefined points strictly necessary for the interpolation. In the prior art, more sophisticated techniques such as interpolation “splines” or B-splines corresponding to piecewise polynomials with constraints of continuity of the kth successive derivatives are well known; they are not reviewed here because the invention is differentiated therefrom.
FIG. 1 shows a comparison between the 1st order linear interpolation (o1, dotted line), the 2nd order parabolic interpolation (o2, discontinuous line), the 3rd order cubic interpolation (o3, solid line) and the 4th order interpolation (o4, chain-dotted line).
For the linear interpolation, two points determine a straight line for which the equation is vl(x)=a1*x+b1. In FIG. 1, the points at the instants x=0 and x=1, which delimit the interval [0, 1], were used. If the value of these points is v(0) and v(1) respectively, the coefficients a1 and b1 are obtained as follows:
a1=v(1)−v(0)
b1=v(0)
The coefficients a1 and b1 of a straight line are obtained using a single addition operation, and the computation of an interpolated sample vl(x) costs an addition operation and a multiplication operation, or a single multiplication-addition operation (MAC).
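As a minimal sketch of the linear case, the two formulas above translate directly into code. The function names linear_coeffs and eval_linear and the sample values are hypothetical and chosen only for illustration.

```python
def linear_coeffs(v0, v1):
    """Coefficients (a1, b1) of the line through (0, v0) and (1, v1)."""
    a1 = v1 - v0  # the single addition (here a subtraction)
    b1 = v0       # no arithmetic needed
    return a1, b1


def eval_linear(a1, b1, x):
    """vl(x) = a1*x + b1, i.e. one multiply-accumulate (MAC)."""
    return a1 * x + b1


# Illustrative sample values v(0) = 2 and v(1) = 4
a1, b1 = linear_coeffs(2.0, 4.0)
midpoint = eval_linear(a1, b1, 0.5)  # value halfway between the two samples
```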
For the parabolic interpolation, three points determine a parabola for which the equation is vp(x)=a2*x^2+b2*x+c2. In FIG. 1, the points at the instants x=−1, x=0 and x=1, which delimit 2 intervals, [−1, 0] and [0, 1], were used. If the value of these points is v(−1), v(0) and v(1) respectively, the coefficients a2, b2 and c2 are obtained as follows:
a2=(v(−1)+v(1))/2−v(0)
b2=v(1)−v(0)−a2
c2=v(0)
Obtaining the coefficients a2, b2 and c2 of a parabola requires 4 addition operations and a multiplication operation or 3 addition operations and an MAC operation. The computation of an interpolated sample vp(x) costs 2 addition operations and 3 multiplication operations or one multiplication operation and 2 MAC operations.
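The parabola coefficients can be checked with a short sketch: samples are taken from a known 2nd order polynomial (an arbitrary choice made here, 3x^2 − 2x + 1) and the formulas above should recover it. The function names are hypothetical.

```python
def parabola_coeffs(vm1, v0, v1):
    """Coefficients (a2, b2, c2) of the parabola through x = -1, 0, 1."""
    a2 = (vm1 + v1) / 2 - v0
    b2 = v1 - v0 - a2
    c2 = v0
    return a2, b2, c2


def eval_parabola(a2, b2, c2, x):
    """vp(x): one multiplication for x*x, then two MACs."""
    x2 = x * x
    return c2 + b2 * x + a2 * x2


# Samples of 3x^2 - 2x + 1 at x = -1, 0, 1 (test polynomial chosen arbitrarily)
samples = {-1: 6.0, 0: 1.0, 1: 2.0}
coeffs = parabola_coeffs(samples[-1], samples[0], samples[1])
half = eval_parabola(*coeffs, 0.5)
```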
For the cubic interpolation, four points determine a cubic curve for which the equation is vc(x)=a3*x^3+b3*x^2+c3*x+d3. In FIG. 1, the points at the instants x=−1, x=0, x=1 and x=2, which delimit 3 intervals, [−1, 0], [0, 1] and [1, 2], were used. If the value of these points is v(−1), v(0), v(1) and v(2) respectively, the coefficients a3, b3, c3 and d3 are obtained as follows:
b3=(v(−1)+v(1))/2−v(0)
a3=(v(−1)+v(2)−v(0)−v(1)−4*b3)/6
c3=v(1)−v(0)−b3−a3
d3=v(0)
Obtaining the coefficients a3, b3, c3 and d3 of a cubic requires 9 addition operations and 3 multiplication operations or 7 addition operations, 2 MAC operations and one multiplication operation. The computation of an interpolated sample vc(x) costs 3 addition operations and 6 multiplication operations or, by optimizing, 2 multiplication operations and 3 MAC operations.
For the 4th order interpolation, 5 points determine a 4th order curve for which the equation is v4(x)=a4*x^4+b4*x^3+c4*x^2+d4*x+e4. In FIG. 1, the points at the instants x=−2, x=−1, x=0, x=1 and x=2, which delimit 4 intervals, [−2, −1], [−1, 0], [0, 1] and [1, 2], were used. If the value of these points is v(−2), v(−1), v(0), v(1) and v(2) respectively, the coefficients a4, b4, c4, d4 and e4 are obtained as follows:
vt1=v(−2)+v(2)−2*v(0)
vt2=v(−1)+v(1)−2*v(0)
vt3=v(2)−v(−2)
vt4=v(1)−v(−1)
a4=(vt1−4*vt2)/24
b4=(vt3−2*vt4)/12
c4=(16*vt2−vt1)/24
d4=(8*vt4−vt3)/12
e4=v(0)
Obtaining the coefficients a4, b4, c4, d4 and e4 for a 4th order curve requires 10 addition operations and 10 multiplication operations or 6 addition operations, 8 MAC operations and 2 multiplication operations. Computing an interpolated sample v4(x) costs 4 addition operations and 10 multiplication operations or, by optimizing, 3 multiplication operations and 4 MAC operations.
To compute the coefficients of a curve, for example the coefficients a3, b3, c3 and d3 of a cubic curve, without loss of generality, it is recommended to consider the 4 consecutive input samples as if they were samples of index x=−1, x=0, x=1 and x=2 to simplify the computations.
When a signal is resampled, the value of the signal must be estimated between 2 known points of the signal to be resampled, within the interval delimited by these 2 points. For example, for up-sampling by a factor of 2, it is necessary to estimate the value of the signal for x=0.5. To make this estimate, one of the values vl(0.5), vp(0.5) or vc(0.5) is simply computed.
With linear interpolation, the straight line linking the 2 known neighboring points is used (x=0 and x=1 to compute x=0.5, x=1 and x=2 to compute x=1.5).
In case of 2nd order interpolation, there is a choice between 2 possible parabolas because the 3 points determining the parabola delimit 2 intervals. For example, for x=0.5, it is possible to take the curve linking the points x=−1, x=0 and x=1 or the points x=0, x=1 and x=2. Experimentally, it is possible to check that the 2 solutions will be of the same quality. Advantageously, to reduce the complexity, it is possible to use a single parabola for 2 intervals; this simplification is used hereinbelow when the parabolic interpolation is discussed.
In case of 3rd order interpolation, the cubic passes through 4 input samples which delimit 3 intervals, 2 intervals at the ends and one central interval. Generally and as in the results presented in FIG. 6, the central interval [0, 1] is used to perform the interpolation from the points at the instants x=−1, 0, 1 and 2.
In case of 4th order interpolation, the curve passes through 5 input samples which delimit 4 intervals, 2 at the ends and two central ones. Experimentally, it can be shown that the use of one of the two central intervals gives the better result, and that the two central intervals give the same quality. As for the parabolic case, it is possible to proceed here also by groups of 2 input samples.
To compare the performance levels of these interpolations of the prior art, a series of sinusoids with frequencies from 200 to 6400 Hz in steps of 200 Hz was generated both at a sampling frequency of 12 800 Hz and of 32 000 Hz. Then, the sinusoids at 12 800 Hz were up-sampled to 32 kHz and the signal-to-noise ratio (SNR) was measured for each sinusoid frequency and for each interpolation method (with delay compensation for the resampling by FIR). It is important to note here that the interpolation was implemented by shifting the instant x=0 to make it coincide with the current sampling instant at the input frequency; the interpolation is therefore done without delay. The samples at the edge of the input signal to be resampled, that is to say the first samples and the last samples, were disregarded. FIG. 2 summarizes the results obtained with the linear interpolation (“lin”), the parabolic or 2nd order interpolation (“o2”, by using 1 parabola for 2 intervals), the cubic or 3rd order interpolation (“o3”, by using the central interval), the 4th order interpolation (“o4”, by using the 2 central intervals of a 4th order curve for 2 intervals), the cubic “spline” interpolation (“spline”, by using the Matlab “spline” command) and resampling by FIR filtering (“FIR”, by using the Matlab command “s32=resample(s12, 5, 2, 30)”). The results show that the FIR filtering gives the best, quasi-constant SNR for all the frequencies up to 5500 Hz, at the cost of higher complexity and a substantial algorithmic delay (compensated here by using the impulse response of the FIR filter as if it were a zero-phase filter). The different interpolations have good performance levels for the low frequencies but the SNR drops rapidly as the frequency increases.
The higher the interpolation order, the better the result, but this improvement is limited in the second half of the spectrum, where the difference between the 3rd order and 4th order interpolations is insignificant, and it is nonexistent in the last quarter of the spectrum. With the cubic interpolation, the SNR is less than 30 dB for the frequencies higher than 2500 Hz; this limit is 2800 Hz for the 4th order interpolation. At the cost of higher complexity, it is the cubic “spline” interpolation which offers the best interpolation performance levels, with 30 dB at 3500 Hz. Hereinafter, the FIR interpolation will be considered as reference. The SNR was also measured for a speech signal (relative to the reference signal obtained by FIR). The signal-to-noise ratios obtained are 34.7 dB with the linear interpolation, 35.5 dB with the parabolic interpolation, 38.2 dB with the cubic interpolation, 37.9 dB with the 4th order interpolation and 41.4 dB with the cubic “spline” interpolation. It can therefore be concluded that interpolation of order higher than 3 is of little interest: the benefit of this increase in order cannot be measured on real signals. Hereinbelow, the 4th order interpolation case will not be considered.
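The measurement procedure described above can be sketched in reduced form as follows. This is an illustration under stated assumptions, not a reproduction of the experiment behind FIG. 2: only the linear and cubic (central interval) interpolations are covered, the reference is the sinusoid generated directly at 32 kHz, and the function names, the signal length n_in and the edge margins are choices made here. The cubic is evaluated in Horner form for brevity, which does not match the operation counts discussed in the text.

```python
import math

FS_IN, FS_OUT = 12800, 32000  # rates from the text; ratio 5/2


def linear(v, k, x):
    """Line through v[k] and v[k+1], evaluated at x in [0, 1]."""
    return v[k] + x * (v[k + 1] - v[k])


def cubic_central(v, k, x):
    """Cubic through v[k-1..k+2], evaluated on the central interval."""
    vm1, v0, v1, v2 = v[k - 1], v[k], v[k + 1], v[k + 2]
    b3 = (vm1 + v1) / 2 - v0
    a3 = (vm1 + v2 - v0 - v1 - 4 * b3) / 6
    c3 = v1 - v0 - b3 - a3
    return ((a3 * x + b3) * x + c3) * x + v0  # Horner form


def snr_db(freq_hz, interp, n_in=1280):
    """SNR of an up-sampled sinusoid against one generated at FS_OUT."""
    s_in = [math.sin(2 * math.pi * freq_hz * m / FS_IN) for m in range(n_in)]
    sig = noise = 0.0
    # Output sample n lies at input position 2n/5; edges are skipped.
    for n in range(5, (n_in - 3) * 5 // 2):
        k, r = divmod(2 * n, 5)
        ref = math.sin(2 * math.pi * freq_hz * n / FS_OUT)
        err = ref - interp(s_in, k, r / 5)
        sig += ref * ref
        noise += err * err
    return 10 * math.log10(sig / noise)


for f in (200, 1000, 2500, 5000):
    print(f, "Hz:", round(snr_db(f, linear), 1), "dB (lin),",
          round(snr_db(f, cubic_central), 1), "dB (o3)")
```

Running the loop should show the trend described above: both methods lose SNR as the frequency rises, with the cubic interpolation ahead of the linear one at low frequencies.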
FIG. 3 illustrates an interpolation from 12 800 Hz to 32 000 Hz on a real case. The squares represent the samples of the signal at 12 800 Hz; the triangles represent the signal samples up-sampled to 32 000 Hz by an FIR method, which gives the reference signal used as a basis hereinbelow. The dotted vertical lines give the sampling instants at 32 kHz. It will be observed that, in this example, for 2 input samples at 12.8 kHz, 5 output samples at 32 kHz are obtained, of which one is identical to one of the input samples (this still requires a copy operation). Two samples are interpolated per interval between consecutive input samples at 12.8 kHz. It is thus possible to estimate, for 2 input samples, the computation complexity of the different interpolations, by assuming that the addition, multiplication or MAC operations all have the same weight (which is the case for most digital signal processors, or DSPs):
linear interpolation: 2 straight lines, 4 interpolated samples and one copy: 7 operations, i.e. 44 800 operations per second.
parabolic interpolation: 1 parabola, 4 interpolated samples and one copy: 17 operations, i.e. 108 800 operations per second.
cubic interpolation: 2 cubics, 4 interpolated samples and one copy: 41 operations, i.e. 262 400 operations per second.
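The three totals above follow from the per-curve and per-sample costs given earlier, taking the MAC-based variant in each case. A short sketch of the arithmetic, with a hypothetical table layout and function names:

```python
FS_IN = 12800
GROUPS_PER_SECOND = FS_IN // 2  # one group = 2 input samples -> 5 outputs

# method: (curves per group, ops per set of coefficients, ops per sample)
# MAC-based counts: parabola 3 add + 1 MAC, cubic 7 add + 2 MAC + 1 mult;
# evaluation: 1 MAC (line), 1 mult + 2 MAC (parabola), 2 mult + 3 MAC (cubic).
METHODS = {
    "linear": (2, 1, 1),
    "parabolic": (1, 4, 3),
    "cubic": (2, 10, 5),
}


def ops_per_group(curves, coeff_ops, eval_ops, interpolated=4, copies=1):
    """Operations for 2 input samples: coefficients + evaluations + copy."""
    return curves * coeff_ops + interpolated * eval_ops + copies


def ops_per_second(method):
    return ops_per_group(*METHODS[method]) * GROUPS_PER_SECOND


for name in METHODS:
    print(name, ops_per_group(*METHODS[name]), ops_per_second(name))
```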
These complexities can be further reduced by tabulating the values x^2 and x^3, that is to say by pre-computing them and by storing them in a table. This is possible because the same temporal indices are always used, for example when the interpolation is done within the interval [0, 1]. For example, in the cubic interpolation and in the example of up-sampling from 12 800 Hz to 32 000 Hz, these values must be tabulated only for x=0.2, 0.4, 0.6 and 0.8. This can save one or two multiplications per interpolated sample. Thus, for the parabolic interpolation, the complexity is reduced to 13 operations, i.e. 83 200 operations per second, and for the cubic interpolation it is reduced to 33 operations, i.e. 211 200 operations per second.
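A minimal sketch of the tabulation for the cubic case is given below. The table names X, X2 and X3 and the function names are hypothetical, as are the coefficient values used for the check; the four fractional positions are those quoted above.

```python
# Fractional positions needed for 12.8 kHz -> 32 kHz up-sampling
X = (0.2, 0.4, 0.6, 0.8)
X2 = tuple(x * x for x in X)      # pre-computed once
X3 = tuple(x * x * x for x in X)  # pre-computed once


def eval_cubic_tab(a3, b3, c3, d3, i):
    """vc(X[i]) using the tables: three MACs, no extra multiplication."""
    return d3 + c3 * X[i] + b3 * X2[i] + a3 * X3[i]


def eval_cubic_direct(a3, b3, c3, d3, x):
    """Same value without tables: two multiplications plus three MACs."""
    x2 = x * x
    x3 = x2 * x
    return d3 + c3 * x + b3 * x2 + a3 * x3


# Operation counts per 2 input samples once the powers are tabulated
parabolic_ops = 1 * 4 + 4 * 2 + 1   # 1 parabola, 4 samples at 2 MACs, 1 copy
cubic_ops = 2 * 10 + 4 * 3 + 1      # 2 cubics, 4 samples at 3 MACs, 1 copy

coeffs = (1.0, -2.0, 3.0, 4.0)  # arbitrary coefficients for the check
values = [eval_cubic_tab(*coeffs, i) for i in range(len(X))]
```

The trade is a small amount of table memory for one multiplication per parabolic sample and two per cubic sample.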
FIG. 4 extends FIG. 3 to illustrate the linear interpolation. The samples of the up-sampled signal (round markers) are given by the intersections of the straight lines between 2 input samples (square markers) with the output sampling instants (dotted vertical lines). Compared to the reference signal (triangular markers), several significant deviations can be observed. It will be noted that the different straight lines used are represented alternately by a solid line or by a dotted line. In a way similar to FIG. 4, FIG. 5 illustrates the parabolic interpolation with a parabola computed for 2 intervals. The greatest error is at the instant 281.5 μs. It will be noted that the different parabolas used are represented alternately by a solid line or by a dotted line.
FIG. 6 illustrates the cubic interpolation. The interpolated samples illustrated by the round markers were obtained with the central interval. Once again, several deviations relative to the reference signal are observed. It is assumed here that the input signal is known outside of the time domain represented in the figure, so that the samples at the edges (here, the first two and the last two input samples) can be used for the interpolation. It will be noted that the different cubics used are represented alternately by a solid line or by a dotted line; it will be recalled that only the central interval is used.
It can be seen that these interpolations can be improved. It has been shown that increasing the order of interpolation beyond 3 is not an advantageous solution. It is known from the prior art that the interpolation “splines” can generally achieve better performance levels but at the cost of much higher complexity.
There is therefore a need to develop a more efficient interpolation solution with reduced increase in complexity.