In modern audio/speech digital signal communication system, digital signal is compressed at encoder. The compressed information (bitstream) can be packetized and sent to decoder through a communication channel frame by frame. The system of encoder and decoder together is called codec. Speech/audio compression may be used to reduce the number of bits that represent the speech/audio signal thereby reducing the bit rate needed for transmission. However, speech/audio compression may result in quality degradation of decompressed signal. In general, a higher bit rate results in higher quality, while a lower bit rate causes lower quality.
In application for signal compression, some frequencies are more important than others. The important frequencies can be coded with a fine resolution. Small differences at these frequencies are significant and a coding scheme that preserves these differences must be used. On the other hand, less important frequencies do not have to be exact. A coarser coding scheme can be used, even though some of the finer details will be lost in the coding. Low frequency band is often more important than high frequency band so that low frequency band can be coded with a fine resolution which could be time domain coding approach or frequency domain coding approach. High frequency band is often less important than low frequency band so that high frequency band can be coded with a much coarser resolution which could also be time domain coding approach or frequency domain coding approach. Typical coarser coding scheme is based on a concept of BandWidth Extension (BWE) which is widely used. This technology concept sometimes is also called High Band Extension (HBE), SubBand Replica (SBR) or Spectral Band Replication (SBR). Although the name could be different, they all have the similar meaning of encoding/decoding some frequency sub-bands (usually high bands) with little budget of bit rate (even zero budget of bit rate) or significantly lower bit rate than normal encoding/decoding approach. With SBR technology, the spectral fine structure in high frequency band is copied from low frequency band and some random noise could be added; then, the spectral envelope in high frequency band is shaped by using side information transmitted from encoder to decoder; if the extended bandwidth is wide, the spectral envelope or spectral energy in high frequency band can be simply shaped by applying gains estimated from available information at decoder side.
Audio coding based on filter bank technology is widely used especially for music signals. In signal processing, a filter bank is an array of band-pass filters that separates the input signal into multiple components, each one carrying a single frequency subband of the original signal. The process of decomposition performed by the filter bank is called analysis, and the output of filter bank analysis is referred to as a subband signal with as many subbands as there are filters in the filter bank. The reconstruction process is called filter bank synthesis. In digital signal processing, the term filter bank is also commonly applied to a bank of receivers. The difference is that receivers also down-convert the subbands to a low center frequency that can be re-sampled at a reduced rate. The same result can sometimes be achieved by undersampling the bandpass subbands. The output of filter bank analysis could be in a form of complex coefficients; each complex coefficient contains real element and imaginary element respectively representing cosine term and sine term for each subband of filter bank.