1. Field of the Invention
This invention relates generally to improvements in digital audio processing, and relates specifically to a system and method for implementing an efficient time-domain aliasing cancellation in digital audio encoding.
2. Description of the Background Art
Digital audio is now in widespread use in digital video disk (DVD) players, digital satellite systems (DSS), and digital television (DTV). A problem in all of these systems is the limitation of either storage capacity or bandwidth, which may be viewed as two aspects of a common problem. In order to fit more digital audio in a storage device of limited storage capacity, or to transmit digital audio over a channel of limited bandwidth, some form of digital audio compression is required. One commonly used form of compression is perceptual encoding, where models based upon human hearing allow for removing information corresponding to sounds that will not be perceived by a human.
The Advanced Television Systems Committee (ATSC) selected the Dolby(copyright) Labs design for perceptual encoding for use in the Digital Television (DTV) system (formerly known as HDTV). This design is set forth in the Audio Compression version 3 (AC-3) specification ATSC A/52 (hereinafter xe2x80x9cthe AC-3 specificationxe2x80x9d), which is hereby incorporated by reference. The AC-3 specification has been subsequently selected for Region 1 (North American market) DVD and DSS broadcast.
The AC-3 specification gives a standard decoder design for digital audio, which allows all AC-3 encoded digital audio recordings to be reproduced by differing vendors"" equipment. In contrast, the specifics of the AC-3 audio encoding process are not normative requirements of the AC-3 standard. Nevertheless, the encoder must produce a bitstream matching the syntax in the standard, which, when decoded, produces audio of sufficient quality for the intended application. Therefore, many of the encoder design details may be left to the individual designer without affecting the ability of the resulting encoded digital audio to be reproduced with the standard decoder design. It is usually more efficient to compress the audio data in the frequency domain rather than in the time domain. One way to perform the conversion from time domain to frequency domain is the modified discrete cosine transform (MDCT), which is one form of a discrete Fourier transform acting upon a function of a discrete variable. The MDCT is often used to convert input data sequences of discrete variables called time-domain data samples into output data sequences of discrete variables called frequency-domain coefficients. The time-domain data samples represent the measured values of the incoming audio data at discrete time values, and the frequency-domain coefficients represent the corresponding signal strengths at discrete frequency values.
In order to achieve high-fidelity audio when the encoded signals are later decoded during playback, the AC-3 specification adopted a method called time-domain aliasing cancellation (TDAC). The TDAC method may allow the near-perfect reconstruction of the original audio when encoded audio data is subsequently decoded for playback. The TDAC method includes two processes: a properly-chosen windowing operation using multiplication by windowing coefficients, followed by a MDCT.
An important design decision in a perceptual encoding standard is the number of digital samples transformed at a time in an MDCT, called the block-length of the MDCT. When transients (rapid fluctuations in values in a sequence of time-domain samples) are not observed, block switch flag blksw is set equal to 0, and an AC-3 encoder designed for TDAC switches to long-block MDCT calculations of 512 samples. When transients are observed, block switch flag blksw is set equal to 1, and the encoder switches to pairs of short-block MDCT calculations of 256 samples. A longer block-length increases frequency resolution but lowers time resolution. A longer block transform is usually adopted when the signal is relatively stable. A shorter block transform is adopted when the signal is relatively unstable to prevent pre-echoing effects. Therefore, rather than select a single MDCT block-length, an encoder designed for TDAC switches between MDCT block-lengths of 512 samples and 256 samples in order to maximize fidelity as audio circumstances require.
The AC-3 specification gives a basic equation for the calculation of the encoder MDCT. However, directly calculating the MDCT using the basic equation requires inordinate amounts of processor power, which prevents the implementation of an encoder with practical, cost-effective processing components. Optimizing the calculations for the MDCT for the different block-lengths is therefore an issue in the efficient design of AC-3 encoders.
The present invention includes a system and method for an efficient time-domain aliasing cancellation (TDAC) in digital audio encoding. In one embodiment, the present invention comprises an improved modified discrete cosine transform (MDCT) method for efficient perceptive encoding compression of digital audio in Dolby(copyright) Digital AC-3 format. In alternate embodiments, the improved MDCT method may be used in other perceptive encoding formats.
One embodiment of the present invention utilizes complex-valued premultiplication and complex-valued postmultiplication steps which prepare and arrange the data samples so that both the long-block and short-block transforms may be efficiently performed. The premultiplication and postmultiplication steps are carefully structured to work with discrete Fourier transforms (DFT) in a manner which will give the same numeric results as would be achieved with a direct calculation of the MDCT. However, the complex-valued premultiplication, DFT, and complex-valued postmultiplication steps together require many fewer calculation steps than the direct calculation of the MDCT. In this manner, the present invention facilitates the use of consumer-oriented digital signal processors (DSP) of reduced computational power, which in turn reduces the cost for practical implementations.