Specialized transform coding produces important bit rate savings in representing digital signals such as audio. Transforms such as the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT) provide a compact representation of the audio signal by condensing most of the signal energy in relatively few spectral coefficients, compared to the time-domain samples where the energy is distributed over all the samples. This energy compaction property of transforms may lead to efficient quantization, for example through adaptive bit allocation, and perceived distortion minimization, for example through the use of noise masking models. Further data reduction can be achieved through the use of overlapped transforms and Time-Domain Aliasing Cancellation (TDAC). The Modified DCT (MDCT) is an example of such overlapped transforms, in which adjacent blocks of samples of the audio signal to be processed overlap each other to avoid discontinuity artifacts while maintaining critical sampling (N samples of the input audio signal yield N transform coefficients). The TDAC property of the MDCT provides this additional advantage in energy compaction.
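As an illustration of the overlap and critical-sampling properties described above, the following is a minimal sketch of a direct (non-fast) MDCT and inverse MDCT with overlap-add. The sine window, the 2/N normalization in the inverse transform, and all function names are assumptions chosen for this example and are not taken from any particular codec.

```python
import math

def sine_window(N):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, window):
    """Windowed block of 2N samples -> N coefficients (critical sampling)."""
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [sum(window[n] * block[n] *
                math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """N coefficients -> 2N windowed samples that still contain aliasing."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [window[n] * (2.0 / N) *
            sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]

def analyze_synthesize(signal, N):
    """Run MDCT/IMDCT on 50%-overlapped blocks and overlap-add the result.

    len(signal) is assumed to be a multiple of N. N zeros are padded on
    each side so that every input sample is covered by two blocks.
    """
    w = sine_window(N)
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        y = imdct(mdct(padded[start:start + 2 * N], w), w)
        for n in range(2 * N):
            out[start + n] += y[n]  # aliasing cancels in the overlap
    return out[N:N + len(signal)]
```

Each block advances by N samples and produces N coefficients, so the transform stays critically sampled although consecutive blocks overlap by 50%. The time-domain aliasing left in each individual inverse-transformed block cancels only when neighbouring blocks are added together.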
Recent audio coding models use a multi-mode approach. In this approach, several coding tools can be used to more efficiently encode any type of audio signal (speech, music, mixed, etc.). These tools comprise transforms such as the MDCT and predictors such as pitch predictors and Linear Predictive Coding (LPC) filters used in speech coding. When operating a multi-mode codec, transitions between the different coding modes are processed carefully to avoid audible artifacts due to the transition. In particular, shaping of the quantization noise in the different coding modes is typically performed using different procedures. In the frames using transform coding, the quantization noise is shaped in the transform domain (i.e., when quantizing the transform coefficients), applying various quantization steps which are controlled by scale factors derived, for example, from the energy of the audio signal in different spectral bands. On the other hand, in the frames using a predictive model in the time-domain (which typically involves long-term predictors and short-term predictors), the quantization noise is shaped using a so-called weighting filter whose transfer function in the z-transform domain is often denoted W(z). Noise shaping is then applied by first filtering the time-domain samples of the input audio signal through the weighting filter W(z) to obtain a weighted signal, and then encoding the weighted signal in this so-called weighted domain. The spectral shape, or frequency response, of the weighting filter W(z) is controlled such that the coding (or quantization) noise is masked by the input audio signal. Typically, the weighting filter W(z) is derived from the LPC filter, which models the spectral envelope of the input audio signal.
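The derivation of a weighting filter from the LPC filter can be sketched as follows. This sketch assumes one common form, W(z) = A(z/γ1)/A(z/γ2), where A(z) = 1 + a1 z^-1 + ... + ap z^-p is the LPC inverse filter obtained by the autocorrelation method. The form of W(z), the default values of γ1 and γ2, and the function names are illustrative assumptions; an actual codec may define W(z) differently.

```python
def autocorrelation(x, order):
    """Autocorrelation r[0..order] of one frame (no windowing, for brevity)."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error energy
    return a

def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a[i] is scaled by gamma**i."""
    return [c * gamma ** i for i, c in enumerate(a)]

def weighting_filter(x, a, gamma1=0.92, gamma2=0.68):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2).

    gamma1 and gamma2 are illustrative values, not taken from any standard.
    """
    num = bandwidth_expand(a, gamma1)
    den = bandwidth_expand(a, gamma2)
    y = []
    for n in range(len(x)):
        acc = sum(num[i] * x[n - i] for i in range(len(num)) if n - i >= 0)
        acc -= sum(den[j] * y[n - j] for j in range(1, len(den)) if n - j >= 0)
        y.append(acc)
    return y
```

In this form, the weighted signal is what the encoder would quantize. Because the decoder applies the inverse of W(z), quantization noise that is roughly white in the weighted domain takes on a spectral shape that follows the smoothed LPC envelope, which is what allows the input signal to mask it.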
An example of a multi-mode audio codec is the Moving Picture Experts Group (MPEG) Unified Speech and Audio Coding (USAC) codec. This codec integrates tools including transform coding and linear predictive coding, and can switch between different coding modes depending on the characteristics of the input audio signal. There are three (3) basic coding modes in the USAC:
1) An Advanced Audio Coding (AAC)-based coding mode, which encodes the input audio signal using the MDCT and perceptually-derived quantization of the MDCT coefficients;
2) An Algebraic Code Excited Linear Prediction (ACELP)-based coding mode, which encodes the input audio signal as an excitation signal (a time-domain signal) processed through a synthesis filter; and
3) A Transform Coded eXcitation (TCX)-based coding mode, which is a hybrid of the two previous modes, wherein the excitation of the synthesis filter of the second mode is encoded in the frequency domain; more precisely, it is a target signal, or the weighted signal, that is encoded in the transform domain.
In the USAC, the TCX-based coding mode and the AAC-based coding mode use a similar transform, for example the MDCT. However, in their standard form, AAC and TCX do not apply the same mechanism for controlling the spectral shape of the quantization noise. AAC explicitly controls the quantization noise in the frequency domain in the quantization steps of the transform coefficients. TCX however controls the spectral shape of the quantization noise through the use of time-domain filtering, and more specifically through the use of a weighting filter W(z) as described above. To facilitate quantization noise shaping in a multi-mode audio codec, there is a need for a device and method for simultaneous time-domain and frequency-domain noise shaping for TDAC transforms.
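For contrast with time-domain weighting, the frequency-domain control of quantization noise described above can be sketched as a per-band uniform quantizer whose step size follows the band energy. The rule used here (step proportional to the band RMS), the `rel_step` parameter, and the function names are hypothetical simplifications for illustration; AAC derives its scale factors from a perceptual model, which is not reproduced here.

```python
import math

def band_steps(coeffs, band_edges, rel_step=0.25, floor=1e-9):
    """One quantization step per band, proportional to the band RMS.

    band_edges lists band boundaries, e.g. [0, 4, 8] for two bands.
    rel_step is a hypothetical tuning constant.
    """
    steps = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        energy = sum(c * c for c in coeffs[lo:hi]) / (hi - lo)
        steps.append(max(rel_step * math.sqrt(energy), floor))
    return steps

def quantize(coeffs, band_edges, steps):
    """Uniformly quantize each coefficient with its band's step size."""
    indices = []
    for lo, hi, step in zip(band_edges[:-1], band_edges[1:], steps):
        indices.extend(round(c / step) for c in coeffs[lo:hi])
    return indices

def dequantize(indices, band_edges, steps):
    """Rebuild coefficients from the indices and the per-band steps."""
    coeffs = []
    for lo, hi, step in zip(band_edges[:-1], band_edges[1:], steps):
        coeffs.extend(i * step for i in indices[lo:hi])
    return coeffs
```

With this rule, a high-energy band receives a coarse step and a low-energy band a fine one, so the quantization error in each band stays below half that band's step and roughly tracks the signal spectrum. The noise shape is thus set directly in the transform domain rather than by filtering the time-domain signal.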