In modern audio/speech digital signal communication systems, a digital signal is compressed at an encoder, and the compressed information (bitstream) is then packetized and sent to a decoder through a communication channel frame by frame. The system of encoder and decoder together is called CODEC. Speech and audio compression may be used to reduce the number of bits that represent the speech and audio signal, thereby reducing the bandwidth and/or bit rate needed for transmission. However, speech and audio compression may result in quality degradation of the decompressed signal. In general, a higher bit rate results in a higher quality decoded signal, while a lower bit rate results in lower quality decoded signal.
Audio coding based on filter bank technology is widely used. In this type of signal processing, the filter bank is an array of band-pass filters that separates the input signal into multiple components, where each band-pass filter carries a single frequency subband of the original signal. The process of decomposition performed by the filter bank is called analysis, and the output of filter bank analysis is referred to as a subband signal with as many subbands as there are filters in the filter bank. The reconstruction process is called filter bank synthesis. In digital signal processing, the term filter bank is also commonly applied to a bank of receivers. In some systems, receivers also down-convert the subbands to a low center frequency that can be re-sampled at a reduced rate. The same result can sometimes be achieved by undersampling the bandpass subbands. The output of filter bank analysis could be in a form of complex coefficients, where each complex coefficient contains a real element and an imaginary element respectively representing cosine term and sine term for each subband of filter bank.
In the application of filter banks for signal compression, some frequencies are perceptually more important than others from a psychoacoustic perspective. After decomposition, the important frequencies can be coded with a fine resolution. In some cases, coding schemes that preserve this fine resolution are used to maintain signal quality. On the other hand, less important frequencies can be coded with a coarser coding scheme, even though some of the finer details will be lost in the coding. A typical coarser coding scheme is based on a concept of BandWidth Extension (BWE). This technology is also referred to as High Band Extension (HBE), SubBand Replica (SBR) or Spectral Band Replication (SBR). These coding schemes encode and decode some frequency sub-bands (usually high bands) with a small bit rate budget (even a zero bit rate budget) or significantly lower bit rate than a normal encoding/decoding approach. With SBR technology, the spectral fine structure in the high frequency band is copied from low frequency band and some random noise is added. The spectral envelope in high frequency band is then shaped by using side information transmitted from encoder to decoder.
In some applications, post-processing at the decoder side is used to improve the perceptual quality of signals coded by low bit rate and SBR coding.