1. Field of the Invention
The present invention relates to a non-uniform filter bank implementation. In particular, the invention relates to a non-uniform filter bank implementation for audio compression having improved coding efficiency.
2. Description of the Related Art
Decomposition of a signal into frequency bands, for analysis or processing, has found useful application in a large number of areas such as audio, speech and video processing. Consider audio compression, where the audio signal x(n) is decomposed into frequency components by a bank of K filters (sometimes termed frequency transformation or sub-band coding) to yield K signal components $X_k[n]$, with k ranging over 0, ..., K−1. The data rate at each filter output is equal to the input rate of x(n). This implies a K-fold increase in the overall data rate. Decimation is therefore performed to bring down the data rate. Decimation by a factor of $n_i$ means that only one out of every $n_i$ samples at the filter output is retained. If the input rate of the system is R and the decimation factor at the i-th filter is $n_i$, then the output rate from the system is
$$R \cdot \sum_{i=0}^{K-1} 1/n_i .$$
Ideally, this should be equal to the input rate R, that is, the sum should equal one (a sum of less than one implies a loss of information). When
$$\sum_{i=0}^{K-1} 1/n_i = 1,$$
the system is said to be maximally decimated. When the decimation factor is $n_i$, the bandwidth of the corresponding filter $H_i(z)$ must be approximately $\pi/n_i$ to prevent aliasing corruption of the signal. If the decimation factor is the same for all filters, the design is simpler, and the filter bank is called a uniform filter bank. In wavelets, the decimation factors, though not the same, vary only as powers of 2. A truly non-uniform filter bank is one in which each $n_i$ can assume any arbitrary value.
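As an illustration of the rate relation above, the following minimal Python sketch evaluates the sum of 1/n_i exactly and tests whether a given set of decimation factors is maximally decimated. The function names `output_rate_ratio` and `is_maximally_decimated` are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def output_rate_ratio(factors):
    """Return sum(1/n_i) exactly, i.e. the output rate divided by the input rate R."""
    return sum(Fraction(1, n) for n in factors)


def is_maximally_decimated(factors):
    """True when the decimated outputs together carry exactly the input rate."""
    return output_rate_ratio(factors) == 1


if __name__ == "__main__":
    for factors in [(4, 4, 4, 4), (2, 4, 8, 8), (2, 3, 6), (2, 4, 8)]:
        print(factors, output_rate_ratio(factors), is_maximally_decimated(factors))
```

Exact rational arithmetic is used so that a sum such as 1/2 + 1/3 + 1/6 compares equal to one without floating-point rounding. In this sketch the set (2, 4, 8) gives a ratio of 7/8, which is below one and therefore not maximally decimated.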
Following the decomposition and decimation, the signal is analyzed to find means of reducing the data-rate of the system.
FIG. 1 shows encoder and decoder blocks for an MPEG-I Audio Codec. An input pulse-code modulated audio signal (PCM Audio) in the encoder passes through an analysis filter bank, which decomposes and decimates the signal. The PCM Audio input signal is also passed to a psycho-acoustic model, which computes the masking curve. The masking information is used by the bit-allocation module to determine the number of bits to be used by the quantizer for quantizing each frequency component. The larger the number of bits used for quantization, the greater the accuracy of the representation. The trade-off is that the compression rate decreases with an increase in the number of quantization bits. Based on masking effects, the bit-allocation module computes the number of bits to be allocated to each frequency component such that quantization noise is masked, i.e. rendered inaudible. Ancillary data are added to the quantized input signal and the composite bit stream is formatted before transmission. The ancillary data are used to transmit information about the audio signal, such as the sampling rate, the number of audio channels, encoder settings and other metadata necessary for correct decoding and reproduction of the audio signal at the decoder. At the receiver side, the decoder performs an inverse quantization followed by upsampling (with the same decimation factors used by the encoder) and filtering to obtain a closely matching version of the input signal x(n).
In the absence of quantization, it is desirable that the system reproduces the original signal x(n) with almost no distortion. A filter bank which achieves such a low level of distortion is called a Perfect Reconstruction Filter Bank. Compared to the non-uniform case, the theory of perfect reconstruction of a uniform filter bank is well understood and documented, and this explains why most existing codecs (MPEG-I Layer III, MPEG-2 AAC, AC-3) use uniform filter banks. The theory concerning non-uniform perfect reconstruction (PR) filter banks is still in its infancy.
It is noted that analysis by the psychoacoustics model is best when the signal decomposition is performed through a non-uniform filter bank that matches closely the critical frequency bands of the ear. Absence of a good understanding of non-uniform filter banks has in the past forced designers of codecs to settle for the less desirable uniform filter bank. However, recent developments in digital signal processing have enabled further study of this problem.
Consider a non-uniform filter bank with decimation factors $(n_0, n_1, \ldots, n_{K-1})$. It is well known that an arbitrary set will not necessarily result in a feasible PR system. For maximal decimation, the condition
$$\sum_{i=0}^{K-1} 1/n_i = 1 \qquad \text{(Condition I)}$$
must be satisfied.
Let L be the least common multiple (lcm) of the set $\{n_i\}_{i=0}^{K-1}$ and let $k_i = L/n_i$, $i \in \{0, 1, \ldots, K-1\}$. Perfect reconstruction is possible only if
$$\left( \sum_{i=0}^{l-1} k_i \right) \equiv 0 \bmod k_l, \quad \text{for each } l \in \{1, \ldots, K-1\} \qquad \text{(Condition II)}$$
The symbol ≡ here stands for congruence (e.g., '11 ≡ 1 mod 5', since 11 − 1 equals 10, which is divisible by 5). We shall call Condition I the maximal-decimation condition and Condition II the feasibility condition. A set of numbers satisfying both conditions is called a compatible set.
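The two conditions can be checked mechanically. The sketch below is one possible Python rendering, under the assumption that Condition II is applied to the decimation factors in the order in which they are supplied, exactly as the condition is written above. The function names are hypothetical, and `math.lcm` with multiple arguments requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+


def satisfies_condition_i(factors):
    """Condition I (maximal decimation): sum of 1/n_i equals one."""
    return sum(Fraction(1, n) for n in factors) == 1


def satisfies_condition_ii(factors):
    """Condition II (feasibility), tested in the given order of the factors.

    With L = lcm(n_0..n_{K-1}) and k_i = L / n_i, require that
    k_0 + ... + k_{l-1} is divisible by k_l for each l in 1..K-1.
    """
    L = lcm(*factors)
    k = [L // n for n in factors]
    partial_sum = 0
    for l in range(1, len(k)):
        partial_sum += k[l - 1]        # now equals k_0 + ... + k_{l-1}
        if partial_sum % k[l] != 0:
            return False
    return True


def is_compatible_set(factors):
    """A compatible set satisfies both Condition I and Condition II."""
    return satisfies_condition_i(factors) and satisfies_condition_ii(factors)


if __name__ == "__main__":
    for factors in [(2, 4, 4), (2, 3, 6), (4, 4, 4, 4)]:
        print(factors, satisfies_condition_i(factors), satisfies_condition_ii(factors))
```

For example, (2, 4, 4) gives L = 4 and k = (2, 1, 1), so both partial sums are divisible by the following k value and the set passes. The set (2, 3, 6) satisfies Condition I but gives k = (3, 2, 1), and 3 is not divisible by 2, so it fails Condition II in this ordering.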
Usually, the decimation vector $V_{\text{spec}} = (n_0, n_1, \ldots, n_{K-1}) \in \mathbb{N}^K$ is defined by the application. $V_{\text{spec}}$ may not satisfy either condition. The problem is to find a vector $V_{\text{best\_match}} \in \mathbb{N}^K$ that satisfies both conditions and is the closest match to $V_{\text{spec}}$ in terms of a pre-defined measure d( ). That is, for all $V \in \mathbb{N}^K$ satisfying the maximal-decimation and feasibility conditions, $d(V, V_{\text{spec}}) \geq d(V_{\text{best\_match}}, V_{\text{spec}})$.
An exhaustive brute-force search (i.e., evaluating every possible combination of decimation factors) for $V_{\text{best\_match}}$ over a set $S \subset \mathbb{N}^K$ can be extremely computationally expensive. An efficient search algorithm is therefore required.
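To make the cost of the exhaustive approach concrete, the following Python sketch enumerates every candidate vector in a small search set and keeps the closest compatible one. Several choices here are assumptions made only for illustration and are not taken from the text: the search set S is a box of half-width `radius` around the specified vector, the measure d( ) is the sum of absolute differences, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import lcm  # multi-argument lcm needs Python 3.9+


def is_compatible_set(factors):
    """Check the maximal-decimation and feasibility conditions in the given order."""
    if sum(Fraction(1, n) for n in factors) != 1:
        return False
    L = lcm(*factors)
    k = [L // n for n in factors]
    partial_sum = 0
    for l in range(1, len(k)):
        partial_sum += k[l - 1]
        if partial_sum % k[l] != 0:
            return False
    return True


def l1_distance(u, v):
    """Assumed measure d(): sum of absolute differences between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def brute_force_best_match(v_spec, radius=2, distance=l1_distance):
    """Exhaustively search the box [n_i - radius, n_i + radius] for each entry.

    Returns (best_vector, best_distance), or (None, None) if the box
    contains no compatible set.
    """
    ranges = [range(max(1, n - radius), n + radius + 1) for n in v_spec]
    best, best_d = None, None
    for candidate in product(*ranges):
        if not is_compatible_set(candidate):
            continue
        d = distance(candidate, v_spec)
        if best is None or d < best_d:
            best, best_d = candidate, d
    return best, best_d


if __name__ == "__main__":
    print(brute_force_best_match((2, 3, 6), radius=2))  # ((2, 4, 4), 3)
```

Even in this restricted form, the number of candidates is up to (2·radius + 1)^K, so it grows exponentially with the number of bands K, which is why an exhaustive search quickly becomes impractical.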