A parametric coding scheme, in particular a sinusoidal coder, is described in PCT patent application No. WO 00/79519-A1 (Attorney Ref. N 017502) and European Patent Application No. 01201404.9, filed Apr. 18, 2001 (Attorney Ref. PHNL010252). In this coder, an audio segment or frame is modelled using a number of sinusoids, each represented by amplitude, frequency and phase parameters. Once the sinusoids for a segment are estimated, a tracking algorithm is initiated. This algorithm tries to link sinusoids with each other on a segment-to-segment basis. Sinusoidal parameters from appropriate sinusoids in consecutive segments are thus linked to obtain so-called tracks. The linking criterion is based on the frequencies of two subsequent segments, but amplitude and/or phase information can also be used. This information is combined in a cost function that determines the sinusoids to be linked. The tracking algorithm thus results in sinusoidal tracks that start at a specific time instant, evolve for a certain amount of time over a plurality of time segments and then stop.
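The segment-to-segment linking described above can be sketched as follows. This is a minimal illustration only, not the tracking algorithm of the cited applications: the function name `link_sinusoids`, the greedy pairing strategy, the frequency-only cost and the `max_cost` threshold are all assumptions made for the example.

```python
def link_sinusoids(prev_freqs, next_freqs, max_cost=50.0):
    """Greedily link frequencies (Hz) of segment k to those of segment k+1.

    Returns sorted (i, j) index pairs. An unlinked entry in prev_freqs ends
    a track; an unlinked entry in next_freqs starts a new one.
    """
    # Hypothetical cost: absolute frequency difference only. Amplitude
    # and/or phase terms could be added to the cost in the same way.
    candidates = sorted(
        (abs(f_next - f_prev), i, j)
        for i, f_prev in enumerate(prev_freqs)
        for j, f_next in enumerate(next_freqs)
    )
    used_prev, used_next, links = set(), set(), []
    for cost, i, j in candidates:
        if cost > max_cost:
            break  # all remaining pairs are at least as expensive
        if i in used_prev or j in used_next:
            continue  # each sinusoid belongs to at most one link
        used_prev.add(i)
        used_next.add(j)
        links.append((i, j))
    return sorted(links)


links = link_sinusoids([440.0, 880.0, 1320.0], [445.0, 890.0, 3000.0])
```

In this usage example the first two sinusoids are linked to their nearest neighbours, while the 1320 Hz component finds no partner within the threshold, so its track stops and the 3000 Hz component starts a new one.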
The construction of these tracks allows for efficient coding. For example, for a sinusoidal track, only the initial phase has to be transmitted. The phases of the other sinusoids in the track are retrieved from this initial phase and the frequencies of the other sinusoids. The amplitude and frequency of a sinusoid can also be encoded differentially with respect to the previous sinusoids. Furthermore, tracks that are very short can be removed. As such, due to the tracking, the bit rate of a sinusoidal coder can be lowered considerably.
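As a rough sketch of how the phases along a track might be retrieved from the initial phase and the frequencies, the example below advances the phase from one segment to the next. The update rule (phase advance equal to 2π times the mean of the two frequencies times the update interval) is an assumption chosen for illustration; the text does not specify the rule, and the names `continue_phases` and `hop_s` are hypothetical.

```python
import math


def continue_phases(initial_phase, freqs_hz, hop_s):
    """Rebuild the per-segment phases of one track from its initial phase.

    freqs_hz holds the track's frequency in each segment; hop_s is the
    update interval between segments in seconds.
    """
    phases = [initial_phase]
    for f_prev, f_next in zip(freqs_hz, freqs_hz[1:]):
        # Assumed rule: the frequency varies linearly between segments, so
        # the phase advances by 2*pi * mean frequency * update interval.
        step = 2.0 * math.pi * 0.5 * (f_prev + f_next) * hop_s
        phases.append((phases[-1] + step) % (2.0 * math.pi))
    return phases


phases = continue_phases(0.0, [100.0, 100.0, 120.0], 0.0025)
```

Only `initial_phase` would need to be transmitted under this scheme; the decoder recomputes the remaining phases from the decoded frequencies.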
Tracking is therefore important for coding efficiency. However, it is important that correct tracks are made. If sinusoids are incorrectly linked, this can increase the bit rate unnecessarily or degrade the reconstruction quality.
It is known, however, that sinusoid frequencies within segments of lengths in the order of 10–20 ms can be non-stationary, making the sinusoidal model less adequate. Take, for example, a harmonic signal which is continually increasing in pitch. If a single sinusoid is used to estimate, say, the average of the fundamental frequency within a segment, then subtracting this sinusoid from the sampled signal leaves a residual harmonic component, which the sinusoidal coder will attempt to fit with a high-frequency harmonic. These “ghost” harmonics may then be matched by the tracking algorithm and included in the final encoded signal, which then requires a higher bit rate than necessary and, when decoded, includes some distortion.
In PCT Application No. WO00/74039 and R. J. Sluijter, A. J. E. Janssen, “A time warper for speech signals” IEEE Workshop on Speech Coding, Porvoo, Finland, Jun. 20–23, 1999, pp. 150–152 there is disclosed a time warper to enhance the stationarity of an audio segment.
Sluijter et al disclose a method to obtain a warp parameter a for a segment. By warping the segment with a warp function of the form:
$$\tau(t) = \frac{a}{T}\,t^{2} + (1 - a)\,t, \qquad 0 \le t \le T \qquad \text{(Equation 1)}$$

in which T represents the duration of the segment in seconds, t represents real time and τ stands for the warped time, the time warper removes the part of the frequency variation which progresses linearly with time, without changing the time duration of that segment.
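The warp function of Equation 1 can be checked numerically with a short sketch. The function names `warp_time` and `warp_rate` and the example values of a and T are assumptions made for illustration; only the formula itself comes from the text. The derivative dτ/dt = 2at/T + (1 − a) follows directly from Equation 1 and runs linearly from 1 − a at t = 0 to 1 + a at t = T, which is how the warp can compensate a frequency variation that is linear in time.

```python
def warp_time(t, a, T):
    """Warped time tau(t) = (a / T) * t**2 + (1 - a) * t for 0 <= t <= T."""
    if not 0.0 <= t <= T:
        raise ValueError("t must lie within the segment [0, T]")
    return (a / T) * t * t + (1.0 - a) * t


def warp_rate(t, a, T):
    """Derivative d(tau)/dt = 2 * a * t / T + (1 - a)."""
    return 2.0 * a * t / T + (1.0 - a)


T = 0.016   # hypothetical segment duration in seconds
a = 0.1     # hypothetical warp parameter

start, end = warp_time(0.0, a, T), warp_time(T, a, T)
```

Here `start` is 0 and `end` equals T up to rounding, confirming that the warp leaves the segment duration unchanged for any value of a.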
By applying the time warper proposed by Sluijter et al, the problem of non-stationarity of frequencies can be alleviated, and so a sinusoidal coder can more reliably estimate the frequencies within a warped segment. Sluijter et al also disclose the transmission of the warp factor in a bit-stream so that the warp factor may be used in synthesizing warped sinusoids within a decoder.
As an example of the improvements provided by Sluijter et al, a harmonic signal is used where the fundamental frequency is changing rapidly. FIG. 4 shows the result of tracking when no warping is used at all. The lines indicate the continuation of a track, the circles represent the start or end of a track and the stars indicate single points. As can be seen from the figure, the higher frequencies (2000–6000 Hz) are for a large part missing or incorrect. As a result, incorrect tracks are made. The analysis interval has a length of 32.7 ms, with an update interval of 8 ms. (Usually a segment overlap is employed during synthesis of the encoded signal, and so where an overlap of 50% is used, there is a segment length of 16 ms.) Since the frequencies are not stationary in such a long analysis interval, the sinusoidal coder cannot estimate the higher frequencies well.
By doing the estimation on segments time-warped according to Sluijter, all frequencies are estimated correctly, as can be seen in FIG. 5. However, the figure also shows that at some instances, incorrect tracks are made.
This is because once a group of frequencies has been estimated for one segment, the tracking algorithm attempts to link these with the group of frequencies of the next segment without taking into account the frequency variation of sinusoidal components within sequential segments. So, as shown in FIG. 6(a), a frequency fk is estimated for a segment k where a warping factor a1 has been determined. (In FIGS. 6(a) and 6(b) the warping factors a1, a2 are shown as the angle of the slope of the frequency; in practice, however, the frequency derivative (slope) equals a/T.) At the same time, frequencies fk+1(1) and fk+1(2) are estimated for a segment k+1 where a warping factor a2 has been determined. If the frequency variation is not taken into account in linking sinusoids from one segment to the next, then in the example it is more likely that fk will be linked to fk+1(1) rather than fk+1(2), as the difference in frequencies δ1 is less than δ2.
The present invention attempts to mitigate this problem.