Bitrate scalability is a useful feature for data compression coders and decoders. A scalable coder encodes a signal at a high bitrate in such a way that subsets of the resulting bitstream can be decoded at lower bitrates. One application of this feature is the remote browsing of data without the burden of downloading the full, high bitrate data file. Another application is user-selectable audio quality for audio broadcasts. For efficient use of code bits, the low bitrate streams should be used to help reconstruct the higher bitrate streams. One approach is to first encode the data at the lowest supported bitrate, then encode the error between the original signal and the decoded lowest bitrate signal to form the second lowest bitrate bitstream, and so on. For this scheme, known as difference coding, to work, the error signal must be easier to compress than the original. For this to be the case, the signal-to-noise ratio of the decoded lowest bitrate signal should be maximized.
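The difference coding scheme described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each layer is a uniform scalar quantizer with a progressively finer step size; the function names and step sizes are hypothetical, and a practical coder would use a real compression algorithm at each layer.

```python
import math

def quantize(samples, step):
    """Uniform scalar quantizer (illustrative assumption): sample -> integer index."""
    return [round(s / step) for s in samples]

def dequantize(indices, step):
    """Inverse of quantize: integer index -> reconstructed sample value."""
    return [i * step for i in indices]

def encode_layers(signal, steps):
    """Encode a base layer, then encode the error left by each previous layer.

    steps lists one quantizer step size per layer, coarsest (lowest bitrate) first.
    """
    layers = []
    residual = list(signal)
    for step in steps:
        indices = quantize(residual, step)
        layers.append(indices)
        decoded = dequantize(indices, step)
        # The next layer codes the error between its input and this layer's decoded output.
        residual = [r - d for r, d in zip(residual, decoded)]
    return layers

def decode_layers(layers, steps):
    """Reconstruct the signal from however many leading layers were received."""
    out = [0.0] * len(layers[0])
    for indices, step in zip(layers, steps):
        out = [o + i * step for o, i in zip(out, indices)]
    return out

def snr_db(original, decoded):
    """Signal-to-noise ratio of a decoded signal, in decibels."""
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, decoded))
    return 10 * math.log10(signal_energy / noise_energy)

signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
steps = [0.5, 0.05]                      # hypothetical coarse and fine step sizes
layers = encode_layers(signal, steps)
low = decode_layers(layers[:1], steps)   # decode the lowest bitrate subset only
high = decode_layers(layers, steps)      # decode the full bitstream
```

Decoding only the first layer yields the low bitrate reconstruction, while adding the second layer refines it. In this sketch the error signal has a much smaller range than the original, which is the property that makes the second layer cheap to code.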
In cases where there is a large difference between the low and high bitrates of a scalable bitrate coder, more than one compression algorithm may be used, so that a hybrid of compression algorithms covers the full range of scalable bitrates. For the specific application of scalable bitrate audio compression, a coder optimized for low bitrate coding may be used to code the audio at the low bitrate, while a high-quality, generic audio compression algorithm is used to code the audio at the higher bitrates. Often the low bitrate coder is a speech coder. In this case, difference coding for scalable bitrates is difficult because low bitrate speech coders do not generally maximize the signal-to-noise ratio of the decoded output. Instead, many speech coders use spectral noise shaping to mask noise beneath the spectral peaks of the signal. This method is used because, although the overall signal-to-noise ratio may be lower, the coding noise is less audible because of auditory masking.
Modern, high-quality, generic audio compression algorithms take advantage of the noise masking characteristics of the human auditory system to compress audio data without causing perceptible distortions in the reconstructed audio signal. This form of compression is also known as perceptual coding. Most algorithms code a predetermined, fixed number of time-domain audio samples, a "frame" of data, at a time. Since the noise masking properties depend on frequency, the first step of a perceptual coder is to map a frame of audio data to the frequency domain. The output of this time-to-frequency mapping process is a frequency domain signal in which the signal components are grouped into frequency subbands. A psychoacoustic model analyzes the signal to determine both the signal-dependent and signal-independent noise masking characteristics as a function of frequency. These masking characteristics are expressed as a signal-to-mask ratio for each frequency subband. A quantizer control unit may then use these ratios to determine how to quantize the signal components within each subband such that the quantization noise will be inaudible. Quantizing the signal in this manner reduces the number of bits needed to represent the audio signal without necessarily degrading the perceived audio quality of the resulting signal. Representations of the quantizer output, as well as the quantizer step sizes for each subband, are coded into a compressed audio data stream.
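As a rough illustration of how a quantizer control unit might turn signal-to-mask ratios into step sizes, the sketch below assumes a uniform quantizer whose noise power is approximately the step size squared divided by 12 (the usual high-resolution approximation). It then picks, for each subband, the largest step whose noise power does not exceed the masking threshold. The function names and the example values are hypothetical and are not taken from any particular coder.

```python
import math

def step_sizes_from_smr(subband_power, smr_db):
    """Choose one quantizer step size per subband from its signal-to-mask ratio.

    Assumption: uniform quantization noise power is about step**2 / 12, so the
    largest step that keeps the noise at or below the mask is sqrt(12 * mask_power).
    """
    steps = []
    for power, smr in zip(subband_power, smr_db):
        # The mask power lies smr decibels below the signal power in this subband.
        mask_power = power / (10 ** (smr / 10))
        steps.append(math.sqrt(12 * mask_power))
    return steps

def quantize_subbands(subbands, steps):
    """Quantize the components of each subband with that subband's step size."""
    quantized = []
    for band, step in zip(subbands, steps):
        if step == 0:
            # A silent subband has zero mask power; emit zeros rather than divide by zero.
            quantized.append([0] * len(band))
        else:
            quantized.append([round(c / step) for c in band])
    return quantized

subband_power = [1.0, 0.25]      # hypothetical mean power per subband
smr_db = [20.0, 6.0]             # hypothetical signal-to-mask ratios in dB
steps = step_sizes_from_smr(subband_power, smr_db)
indices = quantize_subbands([[0.9, -1.1], [0.4, -0.6]], steps)
```

A subband with a high signal-to-mask ratio receives a step that is small relative to its signal level, and therefore more bits, while a subband whose content is largely masked is quantized coarsely. The quantizer indices and the step sizes are what would then be written to the compressed stream.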
There is a need for a coder, a coding system and a method that efficiently compress audio signals when a hybrid arrangement of multiple audio coding algorithms is used to compress the audio data to achieve a scalable bitrate.