The trade-off between time and frequency resolution in spectral domain audio coding systems has led to several techniques to ensure high audio coding performance while minimizing audible coding errors. Such techniques include block switching and Temporal Noise Shaping (“TNS”) (see reference 1, below), both of which are employed in MPEG-2/4 AAC (“AAC”) (see reference 2, below). TNS provides a way to use relatively long transform block lengths while ensuring that the temporal envelope of the noise is controlled to minimize audible artifacts.
A simplified block diagram of a prior art spectral-domain coding system (encoder and decoder) using TNS is shown in FIG. 1. In the encoder portion, a “Time to Frequency Transform” device or function 2 converts an incoming time-domain audio signal, represented by a discrete time sequence x[n] that has been sampled from an audio source at some sampling frequency fs, into the spectral domain (or “frequency” domain). In the case of AAC, a 2048-sample Modified Discrete Cosine Transform (MDCT) (see reference 3, below) is used. Prior to quantization in a quantizer or quantizing function (“Q”) 4, the encoder applies a filter or filtering function 6 (“A(z)”), whose transfer function may be represented in the Z-domain as A(z), to the spectral domain signal. The encoder sends filter parameters to the decoder as side information. The decoder portion of the coder decodes the bitstream and applies to the spectrum the inverse filter or filtering function 8 (“1/A(z)”), whose transfer function in the Z-domain may be represented as 1/A(z). A “Frequency to Time Transform” device or function 10 (a transform inverse to the Time to Frequency Transform 2) converts the spectral domain signal to a discrete time-domain signal y(n). For simplicity, FIG. 1 ignores the perceptual allocation of the quantization noise and other well-known AAC and TNS details.
The overall spectral domain output of the quantizer 4 using TNS may be expressed in the Z-transform domain in the manner of equation 1. This analysis, and other analyses below, is based on a simple additive model of quantization.
$$Y(z) = \frac{A(z)\,X(z)}{A(z)} + \frac{E(z)}{A(z)} \tag{1}$$
where E(z) is the quantization error and A(z) is the transfer function of the TNS filter.
Equation 1 can be simplified to equation 2.
$$Y(z) = X(z) + \frac{E(z)}{A(z)} \tag{2}$$
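Equation 2 can be checked numerically under the simple additive quantization model. The following is a minimal sketch, not the AAC implementation: the first-order filter coefficients, the uniform quantizer, the step size, and the random stand-in for spectral coefficients are all assumptions chosen for illustration. It filters a sequence with A(z), quantizes it, applies 1/A(z), and confirms that the result equals the input plus the quantization error filtered by 1/A(z).

```python
import random

def fir_filter(a, s):
    """Apply the FIR filter A(z) = a[0] + a[1] z^-1 + ... along the sequence s."""
    return [sum(a[k] * s[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(s))]

def inverse_filter(a, s):
    """Apply the all-pole filter 1/A(z) along s (assumes a[0] != 0)."""
    out = []
    for n in range(len(s)):
        acc = s[n] - sum(a[k] * out[n - k] for k in range(1, len(a)) if n - k >= 0)
        out.append(acc / a[0])
    return out

def quantize(s, step):
    """Uniform quantizer (an assumption; AAC uses a non-uniform quantizer)."""
    return [step * round(v / step) for v in s]

random.seed(0)
a = [1.0, -0.9]                                  # hypothetical first-order filter A(z)
x = [random.uniform(-1, 1) for _ in range(64)]   # stand-in for spectral coefficients X
step = 0.05                                      # hypothetical quantizer step size

filtered = fir_filter(a, x)                      # encoder: A(z) X(z)
quantized = quantize(filtered, step)             # encoder: A(z) X(z) + E(z)
e = [q - f for q, f in zip(quantized, filtered)] # additive quantization error E(z)
y = inverse_filter(a, quantized)                 # decoder: Y(z)

shaped_noise = inverse_filter(a, e)              # E(z) / A(z)
max_dev = max(abs(yi - (xi + ni)) for yi, xi, ni in zip(y, x, shaped_noise))
print("max |Y - (X + E/A)| =", max_dev)
```

The deviation is at the level of floating-point rounding, which reflects that the filter and its inverse cancel on the signal term and leave only the noise term shaped by 1/A(z).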
Equation 2 shows that the noise added during quantization of the audio spectrum is filtered by 1/A(z); that is, the noise is convolved, along the frequency axis, with the response of the inverse TNS filter. Because convolution in the spectral domain is equivalent to multiplication in the time domain, convolving the noise with the response of 1/A(z) multiplies the temporal shape of the noise by the temporal response of the inverse TNS filter. Hence, by selecting the filter A(z) appropriately, the temporal envelope of the quantization noise can be controlled to minimize audible artifacts caused by the low temporal resolution of long transform blocks. TNS has been shown to significantly improve the performance of AAC, and thus is a very important tool in AAC.
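The duality invoked above, that convolution in the spectral domain corresponds to multiplication in the time domain, can be illustrated with a small numerical sketch. This example uses a discrete Fourier transform, for which the relationship is exact with circular convolution and a 1/N scale factor; the MDCT used in AAC is a different transform, so this shows the underlying principle only. The test sequence and the decaying envelope are hypothetical values chosen for illustration.

```python
import cmath
import math

def dft(s):
    """Naive discrete Fourier transform of the sequence s."""
    n = len(s)
    return [sum(s[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_convolve(u, v):
    """Circular convolution of two equal-length sequences."""
    n = len(u)
    return [sum(u[m] * v[(k - m) % n] for m in range(n)) for k in range(n)]

N = 16
# Arbitrary test sequence standing in for time-domain noise.
noise = [math.sin(1.7 * t) + 0.3 * math.cos(0.4 * t * t) for t in range(N)]
# Hypothetical decaying temporal envelope.
envelope = [math.exp(-0.3 * t) for t in range(N)]

# Multiply in the time domain, then transform.
lhs = dft([p * w for p, w in zip(noise, envelope)])

# Transform each sequence, then circularly convolve the spectra (scaled by 1/N).
rhs = [c / N for c in circular_convolve(dft(noise), dft(envelope))]

max_err = max(abs(l - r) for l, r in zip(lhs, rhs))
print("max difference between the two paths =", max_err)
```

Both paths give the same spectrum up to floating-point rounding, which is why filtering the quantization noise across frequency imposes a multiplicative envelope on it in time.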
However, TNS has some limitations. Namely, the encoder must transmit filter parameters to the decoder, and the decoder must convolve the decoded spectrum with the inverse filter. These requirements lead to the following limitations:
1. Increased bit rate consumption in order to transmit the filter coefficients.
2. The need to apply the inverse filter to the spectrum means that TNS cannot provide backward compatibility to existing systems such as AC-3 (see reference 4, below).
3. Increased decoder complexity due to the need to apply the inverse filter to the spectrum.
In accordance with aspects of the present invention, a new technique, based on noise feedback quantization (NFQ), allows the temporal envelope of the quantization noise in a spectral domain coding system to be modified while overcoming the limitations imposed by the TNS coding tool used in MPEG-2/4 AAC. According to aspects of the present invention, NFQ is employed instead of TNS in an AAC system. According to aspects of the present invention, NFQ may also be employed in other spectral domain coding systems such as an AC-3 system.