In order to more efficiently broadcast or record audio signals, the amount of information required to represent the audio signals may be reduced. In the case of digital audio signals, the amount of digital information needed to accurately reproduce the original pulse code modulation (PCM) samples may be reduced by applying a digital compression algorithm, resulting in a digitally compressed representation of the original signal. The goal of the digital compression algorithm is to produce a digital representation of an audio signal which, when decoded and reproduced, sounds the same as the original signal, while using a minimum of digital information for the compressed or encoded representation.
Recent advances in audio coding technology have led to high compression ratios while keeping audible degradation in the compressed signal to a minimum. The resulting coders are intended for a variety of applications, including 5.1 channel film soundtracks, HDTV, laser discs and multimedia. A description of one applicable method can be found in the Advanced Television Systems Committee (ATSC) Standard document entitled “Digital Audio Compression (AC-3) Standard”, Document A/52, 20 Dec. 1995, and the disclosure of that document is hereby expressly incorporated herein by reference.
In the basic approach, the encoder first converts the time domain audio signal to the frequency domain using a bank of filters. The frequency domain coefficients thus generated are converted to floating point representation, in which each coefficient is represented as a mantissa and an exponent. The bulk of the compressed bitstream transmitted to the decoder comprises these exponents and mantissas.
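By way of illustration only, the split of a frequency coefficient into an exponent and a mantissa may be sketched as follows. The function names, the restriction to coefficients of magnitude less than one, and the exponent cap of 24 are assumptions made for this sketch and are not taken from the description above.

```python
import math

MAX_EXPONENT = 24  # assumed upper limit on the exponent, for illustration only


def to_exponent_mantissa(coeff):
    """Split coeff (|coeff| < 1) so that coeff == mantissa * 2**-exponent."""
    if not -1.0 < coeff < 1.0:
        raise ValueError("coefficient magnitude must be below 1")
    if coeff == 0.0:
        return MAX_EXPONENT, 0.0
    # frexp gives coeff == mantissa * 2**e with 0.5 <= |mantissa| < 1
    mantissa, e = math.frexp(coeff)
    exponent = -e
    if exponent > MAX_EXPONENT:
        # Very small value: pin the exponent and let the mantissa shrink instead
        exponent = MAX_EXPONENT
        mantissa = math.ldexp(coeff, MAX_EXPONENT)
    return exponent, mantissa


def from_exponent_mantissa(exponent, mantissa):
    """Rebuild the coefficient from its exponent and mantissa."""
    return math.ldexp(mantissa, -exponent)


exponent, mantissa = to_exponent_mantissa(0.1875)
print(exponent, mantissa, from_exponent_mantissa(exponent, mantissa))
```

In this sketch the exponent counts how many times the coefficient can be doubled before its magnitude reaches at least one half, so small coefficients receive large exponents.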
The exponents are usually transmitted in their original form. However, each mantissa must be truncated to a fixed or variable number of binary digits. The number of bits to be used for coding each mantissa is obtained from a bit allocation algorithm, which may be based on the masking property of the human auditory system. Using fewer bits results in higher compression ratios because less space is required to transmit the coefficients. However, this may cause high quantization errors, leading to audible distortion. A good distribution of the available bits among the mantissas forms the core of the advanced audio coders.
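The trade-off between the number of mantissa bits and the quantization error may be illustrated with a simple uniform quantizer. This is a sketch only: the quantizer below is a generic signed rounding scheme chosen for clarity, not the quantizer of any particular standard, and the function names are hypothetical.

```python
def quantize_mantissa(mantissa, bits):
    """Round a mantissa in [-1, 1) to a signed integer code of `bits` bits."""
    if bits == 0:
        return 0  # no bits allocated: nothing is transmitted
    scale = 1 << (bits - 1)
    code = round(mantissa * scale)
    # Clamp to the range representable in `bits` bits
    return max(-scale, min(scale - 1, code))


def dequantize_mantissa(code, bits):
    """Map an integer code back to an approximate mantissa value."""
    if bits == 0:
        return 0.0
    return code / (1 << (bits - 1))


mantissa = 0.7
for bits in (3, 5, 8):
    code = quantize_mantissa(mantissa, bits)
    error = abs(mantissa - dequantize_mantissa(code, bits))
    print(bits, code, error)
```

Running the loop shows the reconstruction error shrinking as more bits are allocated, which is the quantity a bit allocation algorithm has to balance against the space available.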
Further compression is possible by employing differential coding for the exponents. In this case the exponents for a channel are differentially coded across the frequency range. The first exponent is sent as an absolute value. Subsequent exponent information is sent in differential form, subject to a maximum limit. That is, instead of sending actual exponent values, only the difference between exponents is sent. In the extreme case, when the exponent sets of several consecutive blocks in a frame are almost identical, only the exponent set for the first block is sent. The subsequent blocks in the frame reuse the previously sent exponent values.
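The differential coding of exponents described above may be sketched as follows. The limit of plus or minus 2 on each difference and the function names are assumptions made for illustration. Where a difference exceeds the limit, the sketch simply clamps it and tracks the value the decoder will reconstruct, so the decoded exponents can deviate from the originals until they catch up.

```python
MAX_DELTA = 2  # assumed maximum magnitude of a transmitted difference


def encode_exponents(exponents):
    """Return the first exponent as an absolute value plus clamped differences."""
    first = exponents[0]
    deltas = []
    prev = first
    for exp in exponents[1:]:
        delta = max(-MAX_DELTA, min(MAX_DELTA, exp - prev))
        deltas.append(delta)
        prev += delta  # follow the decoder's view, not the original value
    return first, deltas


def decode_exponents(first, deltas):
    """Rebuild the exponent sequence by accumulating the differences."""
    exponents = [first]
    for delta in deltas:
        exponents.append(exponents[-1] + delta)
    return exponents


first, deltas = encode_exponents([5, 6, 6, 4, 3])
print(first, deltas, decode_exponents(first, deltas))
```

Because each difference takes one of only a few values, it can be represented in fewer bits than an absolute exponent, which is where the additional compression comes from.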
In the above-mentioned AC-3 standard, the audio blocks and the fields within the blocks have variable lengths. Certain fields, such as exponents, may not be present in a particular audio block, and even when present a field may require a different number of bits at different times, depending on the current strategy used and the signal characteristics. The mantissas appear in each block; however, the bit allocation for the mantissas is performed globally.
One approach could be to pack all information, excluding the mantissas, for all the audio blocks into the AC-3 frame. The remaining space in the frame is then used to allocate bits to all the mantissas globally. The mantissas for each block, quantized to the appropriate number of bits using the bit allocation output, are then placed in the proper field in the frame. This type of approach is cumbersome and has high memory and computation requirements, and hence is not practical for a real-time encoder meant for consumer applications.
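The first step of this approach, determining how much of the frame is left for mantissas once everything else has been packed, may be sketched as below. The frame size and the per-block bit counts are hypothetical figures chosen for illustration, and the function name is likewise an assumption.

```python
def mantissa_bit_budget(frame_bits, side_info_bits):
    """Bits left for mantissas after packing the non-mantissa fields of every block."""
    budget = frame_bits - sum(side_info_bits)
    if budget < 0:
        raise ValueError("non-mantissa fields alone exceed the frame size")
    return budget


# Hypothetical non-mantissa bit counts for six audio blocks; blocks that
# reuse earlier exponents need far fewer bits than blocks that send new ones.
side_info_bits = [420, 60, 60, 380, 60, 60]
print(mantissa_bit_budget(6144, side_info_bits))
```

The budget is known only after the non-mantissa fields of every block have been sized, so in this sketch no mantissa can be quantized until the whole frame has been examined. This is consistent with the memory and computation burden noted above.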