Audio coding systems are often used to reduce the amount of information required to adequately represent a source signal. By reducing information capacity requirements, a signal representation can be transmitted over channels having lower bandwidth or stored on media using less space. Perceptual audio coding can reduce the information capacity requirements of a source audio signal by eliminating either redundant components or irrelevant components in the signal. This type of coding often uses filter banks to reduce redundancy by decorrelating a source signal using a basis set of spectral components, and reduces irrelevancy by adaptive quantization of the spectral components according to psycho-perceptual criteria.
The filter banks may be implemented in many ways including a variety of transforms such as the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT), for example. A set of transform coefficients or spectral components representing the spectral content of a source audio signal can be obtained by applying a transform to blocks of time-domain samples representing time intervals of the source audio signal. A particular Modified Discrete Cosine Transform (MDCT) described in Princen et al., “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation,” Proc. of the 1987 International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 1987, pp. 2161-64, is widely used because it has several very attractive properties for audio coding including the ability to provide critical sampling while allowing adjacent source signal blocks to overlap one another. Proper operation of the MDCT filter bank requires the use of overlapped source-signal blocks and window functions that satisfy certain criteria. Two examples of coding systems that use the MDCT filter bank are those systems that conform to the Advanced Audio Coding (AAC) standard, which is described in Bosi et al., “ISO/IEC MPEG-2 Advanced Audio Coding,” J. Audio Eng. Soc., vol. 45, no. 10, October 1997, pp. 789-814, and those systems that conform to the Dolby Digital encoded bit stream standard. This coding standard, sometimes referred to as AC-3, is described in the Advanced Television Systems Committee (ATSC) A/52A document entitled “Revision A to Digital Audio Compression (AC-3) Standard” published Aug. 20, 2001. Both references are incorporated herein by reference.
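The critical-sampling and overlap properties of an MDCT filter bank can be illustrated with a short sketch. It is not the implementation used by AAC or AC-3 encoders. It assumes a sine window, which is one window function that satisfies the required criteria, and it evaluates the transform directly rather than with a fast algorithm. All function names here are hypothetical.

```python
import math


def sine_window(block_len):
    # Assumed window: w[n]^2 + w[n + N]^2 = 1 for block_len = 2N.
    return [math.sin(math.pi * (n + 0.5) / block_len) for n in range(block_len)]


def mdct(block, window):
    # 2N windowed time-domain samples in, N spectral coefficients out.
    n_coef = len(block) // 2
    shift = 0.5 + n_coef / 2
    return [
        sum(
            window[n] * block[n]
            * math.cos(math.pi / n_coef * (n + shift) * (k + 0.5))
            for n in range(2 * n_coef)
        )
        for k in range(n_coef)
    ]


def imdct(coefs, window):
    # N coefficients in, 2N windowed samples out (still time-aliased).
    n_coef = len(coefs)
    shift = 0.5 + n_coef / 2
    return [
        window[n] * (2.0 / n_coef) * sum(
            coefs[k] * math.cos(math.pi / n_coef * (n + shift) * (k + 0.5))
            for k in range(n_coef)
        )
        for n in range(2 * n_coef)
    ]


def analysis_synthesis(signal, n_coef):
    # Blocks of 2N samples advance by N samples, so adjacent blocks
    # overlap by half while each hop yields only N coefficients.
    window = sine_window(2 * n_coef)
    tail = (-len(signal)) % n_coef
    padded = [0.0] * n_coef + list(signal) + [0.0] * (tail + n_coef)
    output = [0.0] * len(padded)
    total_coefs = 0
    for start in range(0, len(padded) - n_coef, n_coef):
        coefs = mdct(padded[start:start + 2 * n_coef], window)
        total_coefs += len(coefs)
        # Overlap-add cancels the time-domain aliasing of adjacent blocks.
        for n, value in enumerate(imdct(coefs, window)):
            output[start + n] += value
    return output[n_coef:n_coef + len(signal)], total_coefs
```

Passing a signal through `analysis_synthesis` returns the input to within floating-point rounding, even though each block of 2N samples produces only N coefficients. The aliasing introduced in each block is cancelled only when the overlapping halves of adjacent blocks are added together.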
A coding process that adapts the quantizing resolution can reduce signal irrelevancy but it may also introduce audible levels of quantization error or “quantization noise” into the signal. Perceptual coding systems attempt to control the quantizing resolution so that the quantization noise is “masked” or rendered imperceptible by the spectral content of the signal. These systems typically use perceptual models to predict the levels of quantization noise that can be masked by a source signal and they typically control the quantizing resolution by allocating a varying number of bits to represent each quantized spectral component so that the total bit allocation satisfies some allocation constraint.
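The relationship between masking, quantizing resolution and an allocation constraint can be sketched with a simple greedy allocator. This is an illustrative simplification and not the bit allocation procedure of AC-3 or AAC. It assumes that signal levels and masking thresholds are already available in dB from some perceptual model, that each added bit lowers quantization noise by about 6.02 dB, and that a component with zero bits has noise equal to its own level. The function name and parameters are hypothetical.

```python
DB_PER_BIT = 6.02  # assumed noise reduction per additional bit


def allocate_bits(signal_db, mask_db, bit_budget, max_bits=16):
    """Greedily give bits to the component whose noise is most audible."""
    bits = [0] * len(signal_db)

    def noise_to_mask(i):
        # Positive values mean the quantization noise exceeds the
        # masking threshold and is predicted to be audible.
        return signal_db[i] - DB_PER_BIT * bits[i] - mask_db[i]

    remaining = bit_budget
    while remaining > 0:
        audible = [
            i for i in range(len(bits))
            if bits[i] < max_bits and noise_to_mask(i) > 0
        ]
        if not audible:
            break  # all noise is predicted to be masked
        worst = max(audible, key=noise_to_mask)
        bits[worst] += 1
        remaining -= 1
    return bits
```

For example, `allocate_bits([60, 40, 20], [30, 35, 25], 6)` spends its budget on the first two components and gives no bits to the third, whose level is already below its masking threshold.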
Perceptual coding systems may be implemented in a variety of ways including special purpose hardware, digital signal processing (DSP) computers, and general purpose computers. The filter banks and the bit allocation processes used in many coding systems require significant computational resources. As a result, encoders implemented by conventional DSP and general purpose computers that are commonly available today usually cannot encode a source audio signal much faster than in “real time,” which means the time needed to encode a source audio signal is often about the same as or even greater than the time needed to present or “play” the source audio signal. Although the processing speed of DSP and general purpose computers is increasing, the demands imposed by growing complexity in the encoding processes counteract the gains made in hardware processor speed. As a result, it is unlikely that encoders implemented by either DSP or general purpose computers will be able to encode source audio signals much faster than in real time.
One application for AC-3 coding systems is the encoding of soundtracks for motion pictures on DVDs. The length of a soundtrack for a typical motion picture is on the order of two hours. If the coding process is implemented by DSP or general purpose computers, the coding will also take approximately two hours. One way to reduce the encoding time is to execute different parts of the encoding process on different processors or computers. This approach is not attractive, however, because it requires redesigning the encoding process for operation on multiple processors, it is difficult if not impossible to design the encoding process for efficient operation on varying numbers of processors, and such a redesigned encoding process requires multiple computers even for short lengths of source signals.
What is needed is a way to reduce encoding time by using an arbitrary number of conventional audio encoding processes.