Sinusoidal Coding (SSC) is a well-known parametric coding scheme that is capable of full-bandwidth, high-quality audio coding, see e.g. [ISO/IEC 14496-3:2001/AMD2, “Information Technology—Generic Coding of Audiovisual Objects. Part 3: Audio. Amendment 2: High Quality Parametric Audio Coding”] and [Werner Oomen, Erik Schuijers, Bert den Brinker, Jeroen Breebaart, “Advances in Parametric Coding for High-Quality Audio”, 114th AES Convention, Amsterdam, The Netherlands, Mar. 22-25 2003, preprint 5852]. Such an SSC coding scheme dissects a monaural or stereo audio signal into a number of objects, each of which can be parameterized and efficiently encoded at a low bit-rate. The three objects are: transients (representing dynamic changes in the temporal domain), sinusoids (representing deterministic components), and noise (representing components that do not have a clear temporal or spectral localization). In the case of stereo audio signals, a fourth set of parameters is relevant, namely a set of spatial image parameters that describe the relation between the two stereo channels.
Normally, at the decoder side, such a parametric stereo representation of an audio signal is decoded in the spectral domain, see e.g. [Jeroen Breebaart, Steven van de Par, Armin Kohlrausch, Erik Schuijers, “High-Quality Parametric Spatial Audio Coding at Low Bitrates”, 116th AES Convention, Berlin, Germany, May 8-11 2004, preprint 6072]. Most often the spectral domain stereo representation involves computing processes such as a Fast Fourier Transform (FFT) or a transformation to the Quadrature Mirror Filter (QMF) domain, see e.g. [Erik Schuijers, Jeroen Breebaart, Heiko Purnhagen, Jonas Engdegård, “Low Complexity Parametric Stereo Coding”, 116th AES Convention, Berlin, Germany, May 8-11 2004, preprint 6073]. In order to reduce SSC decoder complexity, the sinusoidal components can be synthesized directly in the spectral domain. However, only sinusoidal components can be efficiently synthesized in the spectral domain. Transforming the other components, i.e. transients and noise, to the spectral domain requires a substantial computational effort.
It is also known to transform only the time signal that is the sum of the sinusoidal components to the spectral domain, and then to perform the stereo decorrelation process in the spectral domain on the sinusoidal part only. The stereo spectral domain representations resulting from this process are then applied to separate synthesis filter banks, one for each channel, to arrive at time domain stereo sinusoidal parts. Finally, the noise and transient components are added to the stereo sinusoidal parts in the time domain. However, such a solution has the perceptual disadvantage that the noise and transient sounds appear to “stand out” in the sound image. Moreover, the stereo decorrelation process in the spectral domain remains a complex process that requires a substantial amount of computation.
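The known decoding flow described above can be illustrated with a minimal sketch. It is not taken from the cited references; every name in it (`sinusoid_sum`, `dft`, `idft`, `decode_stereo`, `alpha`, `delay`) is hypothetical. A naive DFT stands in for the FFT or QMF filter bank, a circular delay stands in for the decorrelator, and a single mixing angle `alpha` stands in for the spatial image parameters.

```python
import cmath
import math


def sinusoid_sum(components, n):
    """Time-domain sum of sinusoids; each item is (amplitude, cycles_per_frame, phase)."""
    return [
        sum(a * math.cos(2 * math.pi * f * t / n + p) for a, f, p in components)
        for t in range(n)
    ]


def dft(x):
    """Naive O(N^2) DFT, standing in for the FFT/QMF analysis step."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Naive inverse DFT, standing in for one synthesis filter bank."""
    n = len(spec)
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def decode_stereo(components, noise, transients, alpha, delay=3):
    """Sketch of the known flow: only the sinusoidal part gets stereo processing."""
    n = len(noise)
    # 1. Transform only the summed sinusoids to the spectral domain.
    mono_spec = dft(sinusoid_sum(components, n))
    # 2. Assumed decorrelator: a circular delay written as a per-bin phase shift.
    decorr = [
        mono_spec[k] * cmath.exp(-2j * math.pi * k * delay / n) for k in range(n)
    ]
    # 3. Assumed spatial parameter: one mixing angle for all bins.
    c, s = math.cos(alpha), math.sin(alpha)
    left_spec = [c * m + s * d for m, d in zip(mono_spec, decorr)]
    right_spec = [c * m - s * d for m, d in zip(mono_spec, decorr)]
    # 4. Separate synthesis per channel.
    left_sin, right_sin = idft(left_spec), idft(right_spec)
    # 5. Noise and transients are added identically to both channels in the time domain.
    left = [v + nz + tr for v, nz, tr in zip(left_sin, noise, transients)]
    right = [v + nz + tr for v, nz, tr in zip(right_sin, noise, transients)]
    return left, right
```

In this sketch the noise and transient samples reach both channels unchanged, while only the sinusoidal part receives a spatial image. This is the mismatch that the text above associates with those components appearing to stand out.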
In conclusion, known stereo decoding methods are not suited for devices where a limited signal processing capacity is available, e.g. mobile and miniature devices.