Introduction
There is considerable interest among those in the field of signal processing in developing efficient means of transmitting or storing information. Improving coding efficiency includes (1) reducing informational requirements, that is, reducing the amount of information required to adequately represent a signal during transmission or storage, and (2) reducing processing requirements, that is, reducing the amount of processing required to implement the encoding and decoding processes.
In high-quality audio coding applications, informational requirements can sometimes be reduced without loss of perceptible audio quality by exploiting various psychoacoustic effects. Signal recording, transmitting, or reproducing techniques which divide the useful signal bandwidth into narrow bands with bandwidths approximating the human ear's critical bands can exploit psychoacoustic masking effects. Such techniques divide the signal bandwidth with an analysis filter bank, process the signal passed by each filter band, and reconstruct a replica of the original signal with a synthesis filter bank.
Two common coding techniques are subband coding and transform coding. Subband coders and transform coders can reduce the informational requirements in particular frequency bands where the noise caused by the resulting coding inaccuracy is psychoacoustically masked. Subband coders may be implemented by a bank of digital bandpass filters defining subbands of varying bandwidth. Transform coders may be implemented by any of several time-domain to frequency-domain transforms. One or more adjacent transform coefficients are grouped together to define "subbands" having effective bandwidths which are sums of individual transform coefficient bandwidths.
The mathematical basis for digital subband filter banks and digital block transforms is essentially the same. See Tribolet and Crochiere, "Frequency Domain Coding of Speech," IEEE Trans. Acoust., Speech, and Signal Proc., ASSP-27, October, 1979, pp. 512-30. Therefore, throughout the following discussion, the term "subband coder" shall refer to both a true subband coder and a transform coder. The term "subband" shall refer to portions of the useful signal bandwidth whether implemented by a true subband coder or a transform coder. The terms "transform" and "transforming" shall include digital filters and digital filtering, respectively.
In most digital coding applications, processing requirements can be reduced by increasing the efficiency of subband filtering. Improved processing efficiency permits implementation of encoders and decoders which are less expensive to build, or which impose lower signal propagation delays through an encoder/decoder system.
In many subband coder systems, the analysis and synthesis filter banks are implemented by discrete time-domain to frequency-domain transforms such as the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT), and the Discrete Sine Transform (DST). The number of time-domain signal samples, referred to herein as the time-domain signal sample block length, processed by such transforms is sometimes called the transform length, and the amount of processing required to perform these transforms is generally proportional to the square of the time-domain signal sample block length.
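The quadratic cost noted above can be seen by counting multiply-add operations in a direct evaluation. The following is a minimal sketch only: it assumes an unnormalized DCT of the common "type II" form, which the text does not specify, and the function name `direct_dct` is hypothetical.

```python
import math


def direct_dct(x):
    """Directly evaluate an unnormalized N-point DCT (type II form, assumed).

    Returns the coefficients and the number of multiply-add operations used.
    """
    n = len(x)
    macs = 0
    coeffs = []
    for k in range(n):          # one output coefficient per pass
        acc = 0.0
        for i in range(n):      # every input sample contributes to every output
            acc += x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            macs += 1
        coeffs.append(acc)
    return coeffs, macs


# Doubling the block length quadruples the operation count.
for length in (8, 16, 32):
    _, macs = direct_dct([1.0] * length)
    print(length, macs)
```

Each of the N output coefficients visits all N input samples, so the count is N times N regardless of the signal content.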
The number of frequency-domain transform coefficients generated by a transform is also sometimes called the transform length. It is common for the number of frequency-domain transform coefficients generated by the transform to be equal to the time-domain signal sample block length, but this equality is not necessary. For example, one transform referred to herein as the E-TDAC transform is sometimes described in the art as a transform of length 1/2N that transforms signal sample blocks with a length of N samples. It is possible, however, to also describe the transform as one of length N which generates only 1/2N unique frequency-domain transform coefficients. Thus, in this discussion the time-domain signal sample block length and the discrete transform length are generally assumed to be synonyms.
Various techniques have been utilized to reduce the amount of time required to perform a transform, or to reduce the processing power required to perform a transform in a given amount of time, or both. One technique is taught in Narasimha and Peterson, "On the Computation of the Discrete Cosine Transform," IEEE Trans. on Communications, COM-26, June, 1978, pp. 934-36. Briefly, this technique evaluates an N-point DCT by rearranging or "shuffling" the samples representing the input signal, performing an N-point DFT on the shuffled samples, and multiplying the result with a complex function. It is approximately twice as efficient as other techniques using a 2N-point FFT; however, Narasimha and Peterson only teach how to improve the efficiency of filter banks implemented by one particular DCT.
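The shuffle, N-point DFT, and complex multiplication steps can be sketched as follows. This is an illustration under stated assumptions rather than the cited authors' exact formulation: it assumes an unnormalized DCT of the type II form, the function names are hypothetical, and a naive DFT stands in for the FFT that a practical implementation would use.

```python
import cmath
import math


def dft(v):
    """Naive N-point DFT; a stand-in for an FFT routine (assumption)."""
    n = len(v)
    return [sum(v[m] * cmath.exp(-2j * math.pi * m * k / n) for m in range(n))
            for k in range(n)]


def dct_via_shuffle(x):
    """Unnormalized N-point DCT (type II form) from one N-point DFT."""
    n = len(x)
    v = [0.0] * n
    # Shuffle: even-indexed samples in order, then odd-indexed samples reversed.
    for i in range((n + 1) // 2):
        v[i] = x[2 * i]
    for i in range(n // 2):
        v[n - 1 - i] = x[2 * i + 1]
    spectrum = dft(v)
    # Multiply by a complex exponential and keep the real part.
    return [(cmath.exp(-1j * math.pi * k / (2 * n)) * spectrum[k]).real
            for k in range(n)]


def dct_direct(x):
    """Reference: direct evaluation of the same DCT."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
            for k in range(n)]
```

Comparing `dct_via_shuffle(x)` with `dct_direct(x)` on the same real-valued block gives matching coefficients to within rounding error; the saving comes from needing only an N-point rather than a 2N-point FFT.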
Another technique which yields approximately a two-fold increase in processing efficiency concurrently performs two real-valued discrete transforms of length N with a single complex-valued FFT of length N. A subband coder utilizing this technique to concurrently perform a modified DCT with a modified DST is described in International Patent Application PCT/US 90/00501, Publication No. WO 90/09022 (published Aug. 9, 1990). The significance of the modified DCT and modified DST is discussed in Princen and Bradley, "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation," IEEE Trans. on Acoust., Speech, Signal Proc., ASSP-34, 1986, pp. 1153-1161. The authors describe a specific application of these transforms as the time-domain equivalent of an evenly-stacked critically-sampled single-sideband analysis-synthesis system. They are referred to collectively herein as the Evenly-stacked Time-Domain Aliasing Cancellation (E-TDAC) transform.
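The general idea of obtaining two real-valued transforms from one complex-valued transform of the same length can be sketched as follows. The sketch shows the well-known packing of two real sequences into the real and imaginary parts of one complex sequence and recovers their two DFTs; it does not reproduce the specific modified DCT and modified DST arrangement of the cited application. The function names are hypothetical, and a naive DFT stands in for the FFT.

```python
import cmath
import math


def dft(v):
    """Naive N-point DFT; a stand-in for an FFT routine (assumption)."""
    n = len(v)
    return [sum(v[m] * cmath.exp(-2j * math.pi * m * k / n) for m in range(n))
            for k in range(n)]


def two_real_dfts(a, b):
    """Return the DFTs of real sequences a and b using one complex DFT."""
    n = len(a)
    # Pack a into the real part and b into the imaginary part.
    z = dft([complex(p, q) for p, q in zip(a, b)])
    fa, fb = [], []
    for k in range(n):
        zk = z[k]
        zc = z[(n - k) % n].conjugate()
        fa.append((zk + zc) / 2)    # conjugate-symmetric part: DFT of a
        fb.append((zk - zc) / 2j)   # conjugate-antisymmetric part: DFT of b
    return fa, fb
```

The separation relies on the conjugate symmetry of the spectrum of a real sequence, so it costs only a few additions per coefficient beyond the single complex transform.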
Another technique to reduce processing requirements is taught by Malvar, "Lapped Transforms for Efficient Transform/Subband Coding," IEEE Trans. Acoust., Speech, Signal Proc., ASSP-38, June, 1990, pp. 969-78. This technique implements an N-point modified DCT by performing a 1/2N-point DST after combining pairs of the samples representing the input signal, or "folding" the N input signal samples into a smaller set of 1/2N points. It is approximately twice as efficient as performing the modified DCT in a straightforward manner; however, Malvar only teaches how to fold input samples for a filter bank implemented by one specific modified DCT whose input samples have been weighted by a specific sine-tapered analysis window.
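The folding idea can be sketched as follows, with several assumptions stated plainly. The sketch uses a common textbook definition of the modified DCT, omits the analysis window, and feeds the folded samples to a half-length type IV DCT rather than the DST of the cited paper; it is a swapped-in equivalent chosen for brevity, not Malvar's exact formulation. All function names are hypothetical, and N is assumed to be a multiple of four.

```python
import math


def mdct_direct(x):
    """Direct modified DCT: N inputs to N/2 coefficients (assumed definition)."""
    n = len(x)
    m = n // 2
    return [sum(x[i] * math.cos(math.pi / m * (i + 0.5 + m / 2) * (k + 0.5))
                for i in range(n))
            for k in range(m)]


def fold(x):
    """Combine pairs of the N input samples into N/2 folded points."""
    n = len(x)
    m = n // 2
    q = m // 2                  # quarter of the block length
    u = [0.0] * m
    for i in range(q):          # third and fourth quarters of the block
        u[i] = -x[3 * q - 1 - i] - x[3 * q + i]
    for i in range(q, m):       # first and second quarters of the block
        u[i] = x[i - q] - x[3 * q - 1 - i]
    return u


def dct_iv(u):
    """Direct half-length type IV DCT of the folded points."""
    m = len(u)
    return [sum(u[i] * math.cos(math.pi / m * (i + 0.5) * (k + 0.5))
                for i in range(m))
            for k in range(m)]


def mdct_folded(x):
    """Modified DCT computed as fold followed by a half-length transform."""
    return dct_iv(fold(x))
```

For any block whose length is a multiple of four, `mdct_folded(x)` agrees with `mdct_direct(x)` to within rounding error, while the inner transform operates on half as many points.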
The specific modified DCT implemented by Malvar is discussed in greater detail by Princen, Johnson, and Bradley, "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation," ICASSP 1987 Conf Proc., May 1987, pp. 2161-64. The authors describe this transform as the time-domain equivalent of an oddly-stacked critically sampled single-sideband analysis-synthesis system. It is referred to herein as the Oddly-stacked Time-Domain Aliasing Cancellation (O-TDAC) transform.
It is desirable to implement encoders and decoders with the ability to use different time-domain signal sample block lengths in order to optimize coder performance. It is well known in the art that longer time-domain signal sample block lengths improve the selectivity or frequency-resolving power of subband coders, and better filter selectivity generally improves the ability of a subband coder to exploit psychoacoustic masking effects. See International Patent Application PCT/US 90/00507, Publication No. WO 90/09064 (published Aug. 9, 1990) which discusses the importance of time-domain signal sample block length and subband filter selectivity.
But longer time-domain signal sample block lengths degrade the time-resolution of a subband filter bank. Inadequate time-resolution can produce audible distortion artifacts when quantizing errors of signal events, such as transients, produce pre-transient and post-transient ringing which exceeds the ear's temporal psychoacoustic masking interval. See, for example, Edler, "Coding of Audio Signals with Overlapping Block Transform and Adaptive Window Functions," Frequenz, vol. 43, no. 9, 1989, pp. 252-56. Hence, it is desirable that techniques which improve subband filter bank processing efficiency should also permit adaptive selection of the time-domain signal sample block length.