Referring now to FIG. 1, a parametric coding scheme, in particular a sinusoidal coder, is described in PCT Patent Application No. WO01/69593. In this coder, an input audio signal x(t) is split into several (overlapping) segments or frames, typically of length 20 ms. Each segment is decomposed into transient, sinusoidal and noise components. (It is also possible to derive other components of the input audio signal, such as harmonic complexes, although these are not relevant for the purposes of the present invention.)
In the sinusoidal analyser 130, the signal x2 for each segment is modelled using a number of sinusoids represented by amplitude, frequency and phase parameters. This information is usually extracted for an analysis interval by performing a Fourier Transform (FT), which provides a spectral representation of the interval including: frequencies; amplitudes for each frequency; and phases for each frequency, where each phase lies in the range −π to π. Once the sinusoidal information for a segment is estimated, a tracking algorithm is initiated. This algorithm uses a cost function to link sinusoids with each other on a segment-to-segment basis to obtain so-called tracks. The tracking algorithm thus results in sinusoidal codes CS comprising sinusoidal tracks that start at a specific time instant, evolve for a certain amount of time over a plurality of time segments and then stop.
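By way of illustration only, the extraction of amplitude, frequency and phase from a Fourier Transform of one segment may be sketched as follows. The sketch assumes a plain discrete Fourier transform without windowing, a single sinusoid lying exactly on a transform bin, and hypothetical names and values (`dft_bin`, `strongest_sinusoid`, an 8 kHz sample rate); none of these is taken from the referenced coder.

```python
import cmath
import math

def dft_bin(segment, k):
    """Return the k-th DFT coefficient of a real-valued segment."""
    n_samples = len(segment)
    return sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n, x in enumerate(segment))

def strongest_sinusoid(segment, sample_rate):
    """Estimate (frequency in Hz, amplitude, phase) of the largest spectral peak."""
    n_samples = len(segment)
    # Search bins 1 .. N/2 - 1 only (DC and Nyquist are skipped).
    bins = {k: dft_bin(segment, k) for k in range(1, n_samples // 2)}
    k_peak = max(bins, key=lambda k: abs(bins[k]))
    frequency = k_peak * sample_rate / n_samples
    amplitude = 2.0 * abs(bins[k_peak]) / n_samples
    phase = cmath.phase(bins[k_peak])  # wrapped: always between -pi and pi
    return frequency, amplitude, phase

# Hypothetical 20 ms test segment: 0.5 * cos(2*pi*500*t + 1.0) sampled at 8 kHz.
SAMPLE_RATE = 8000
N = 160
segment = [0.5 * math.cos(2 * math.pi * 500 * n / SAMPLE_RATE + 1.0)
           for n in range(N)]
print(strongest_sinusoid(segment, SAMPLE_RATE))
```

The phase returned by the transform is already wrapped, whatever the underlying unwrapped phase of the sinusoid may be.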
In such sinusoidal coding, frequency information is usually transmitted for the tracks formed in the encoder. This can be done cheaply, since tracks are defined as having a slowly varying frequency and, therefore, frequency can be transmitted efficiently by time-differential encoding. (In general, amplitude can also be encoded differentially over time.)
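A minimal sketch of time-differential encoding of the frequencies of one track is given below. The uniform quantiser, its step size of 0.5 Hz and the function names are assumptions made for illustration; the referenced coder may quantise on a different scale.

```python
def encode_track(frequencies, step):
    """Time-differentially encode a track's frequencies (uniform quantiser)."""
    codes = []
    previous = 0  # quantiser index of the previous segment (0 before the track starts)
    for f in frequencies:
        index = round(f / step)
        codes.append(index - previous)  # only the change in index is stored
        previous = index
    return codes

def decode_track(codes, step):
    """Rebuild the quantised frequencies by accumulating the differences."""
    frequencies = []
    index = 0
    for delta in codes:
        index += delta
        frequencies.append(index * step)
    return frequencies

# Hypothetical slowly varying track, in Hz, one value per segment.
track = [440.0, 440.6, 441.1, 441.9, 442.4]
codes = encode_track(track, step=0.5)
print(codes)                          # [880, 1, 1, 2, 1]
print(decode_track(codes, step=0.5))  # [440.0, 440.5, 441.0, 442.0, 442.5]
```

Only the first code is large; because the frequency varies slowly, the subsequent differences are small integers that can be represented with few bits.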
In contrast to frequency, phase transmission is viewed as expensive. In principle, if the frequency is (nearly) constant, phase as a function of the track segment index should adhere to a (nearly) linear behaviour. However, when it is transmitted, phase is limited to the range −π to π as provided by the Fourier Transform. Because of this modulo 2π representation, the structural inter-frame relation of the phase is lost and the phase, at first sight, appears to be a white stochastic variable.
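The loss of the linear relation can be illustrated with a short sketch. The phase increment of 2.5 rad per segment and the half-open interval [−π, π) used by `wrap_phase` are assumptions chosen for illustration.

```python
import math

def wrap_phase(phase):
    """Map an unwrapped phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi

# Hypothetical track with constant frequency: the phase advances by
# 2.5 rad per segment, so the unwrapped phase is exactly linear.
unwrapped = [0.3 + 2.5 * k for k in range(8)]
wrapped = [wrap_phase(p) for p in unwrapped]

# Segment-to-segment differences before and after wrapping.
unwrapped_steps = [b - a for a, b in zip(unwrapped, unwrapped[1:])]
wrapped_steps = [b - a for a, b in zip(wrapped, wrapped[1:])]
print([round(s, 3) for s in unwrapped_steps])
print([round(s, 3) for s in wrapped_steps])
```

The unwrapped differences are all equal, whereas the wrapped differences jump by 2π whenever the phase crosses the interval boundary, so the linear trend is no longer visible in the transmitted values.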
However, since the phase is the integral of the frequency, the phase need not, in principle, be transmitted. This is called phase continuation and reduces the bit rate significantly.
In phase continuation, only the frequency is transmitted and the phase is recovered at the decoder from the frequency data by exploiting the integral relation between phase and frequency. It is known, however, that the phase can only be approximately recovered using phase continuation. If frequency errors occur, due to measurement errors in the frequency or due to quantisation noise, the phase, being reconstructed using the integral relation, will typically show an error having the character of a drift. This is because frequency errors have an approximately white noise character. Integration amplifies low-frequency errors and, consequently, the recovered phase will tend to drift away from the actually measured phase. This leads to audible artifacts.
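The drift can be reproduced with a small simulation, offered as a sketch under stated assumptions: the integral is approximated by a running sum over segments, the starting phase is taken to be known at the decoder, and the frequency error is modelled as Gaussian white noise with a hypothetical standard deviation of 1 rad/s. The names and values are illustrative only.

```python
import math
import random

def continue_phase(start_phase, frequencies, hop):
    """Recover phase by accumulating frequency (rad/s) over segments of `hop` seconds."""
    phases = [start_phase]
    for omega in frequencies:
        phases.append(phases[-1] + omega * hop)
    return phases

HOP = 0.01            # hypothetical segment update interval in seconds
NUM_SEGMENTS = 200
rng = random.Random(0)  # fixed seed so the run is repeatable

# Constant true frequency, plus white frequency errors at the decoder.
true_freqs = [2 * math.pi * 440.0] * NUM_SEGMENTS
freq_errors = [rng.gauss(0.0, 1.0) for _ in range(NUM_SEGMENTS)]
noisy_freqs = [f + e for f, e in zip(true_freqs, freq_errors)]

true_phase = continue_phase(0.0, true_freqs, HOP)
recovered_phase = continue_phase(0.0, noisy_freqs, HOP)
phase_error = [r - t for r, t in zip(recovered_phase, true_phase)]
print(phase_error[1], phase_error[NUM_SEGMENTS])
```

In this model the phase error at segment k equals the segment interval multiplied by the sum of the first k frequency errors. It is therefore a random walk whose variance grows with the length of the track, which is the drift referred to above.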
This is illustrated in FIG. 2(a), where Ω and ψ are the real frequency and real phase for a track. In both the encoder and decoder, frequency and phase have an integral relationship represented by I. The quantisation process in the encoder is modelled as an additive white noise n. In the decoder, the recovered phase ψ̂ thus includes two components: the real phase ψ and a noise component ε2, where both the spectrum of the recovered phase and the power spectral density function of the noise ε2 have a pronounced low-frequency character.
Thus, it can be seen that in phase continuation, since the recovered phase is the integral of a low-frequency signal, the recovered phase is a low-frequency signal itself. However, the noise introduced in the reconstruction process is also dominant in this low-frequency range. It is therefore difficult to separate these sources with a view to filtering the noise n introduced during encoding.
The present invention attempts to mitigate this problem.