Audio signals, such as speech or music, are encoded, for example, to enable efficient transmission or storage of the audio signals.
Audio encoders and decoders are used to represent audio-based signals, such as music and background noise. These types of coders typically do not utilise a speech model for the coding process; instead, they use processes for representing all types of audio signals, including speech.
Speech encoders and decoders (codecs) are usually optimised for speech signals, and can operate at either a fixed or variable bit rate.
An audio codec can also be configured to operate with varying bit rates. At lower bit rates, such an audio codec may work with speech signals at a coding rate equivalent to a pure speech codec. At higher bit rates, the audio codec may code any signal including music, background noise and speech, with higher quality and performance.
In some audio codecs the input signal is divided into a limited number of bands, and each of the band signals may be quantized. From the theory of psychoacoustics it is known that the highest frequencies in the spectrum are perceptually less important than the low frequencies. Some audio codecs reflect this in their bit allocation, where fewer bits are allocated to high-frequency signals than to low-frequency signals.
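As an illustration of such a frequency-dependent bit allocation, the following Python sketch splits a fixed bit budget across bands so that a lower band never receives fewer bits than a higher one. The function name `allocate_bits`, the geometric weighting and the `tilt` parameter are assumptions made for this example only; they are not taken from any particular codec.

```python
def allocate_bits(total_bits, num_bands, tilt=0.8):
    """Split total_bits across num_bands, favouring low-frequency bands.

    Band 0 is the lowest band. `tilt` (hypothetical, 0 < tilt <= 1) is the
    weight ratio between adjacent bands; smaller values starve the high
    bands more aggressively.
    """
    weights = [tilt ** band for band in range(num_bands)]
    scale = total_bits / sum(weights)
    # Round each share down so the budget is never exceeded.
    bits = [int(weight * scale) for weight in weights]
    # Hand the bits lost to rounding back to the lowest bands first.
    leftover = total_bits - sum(bits)
    for band in range(leftover):
        bits[band % num_bands] += 1
    return bits


# Example: 64 bits over 4 bands, each band weighted half as much as the one below.
example_allocation = allocate_bits(64, 4, tilt=0.5)
```

A real codec would normally derive the allocation from a perceptual model and the signal itself; the fixed geometric weighting here only shows the general shape of the idea.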
Quantization of these signals approximates the large number of discrete values generated by the audio codec with a smaller set of values, in order to reduce the signal bandwidth required to store or transmit the coded signal.
A typical quantization approach used in both audio and video coding is vector quantization (VQ), in which several samples or coefficients are grouped together into vectors and each vector is then quantized, or approximated, by one entry of a codebook. The entry selected to quantize the input vector is typically the nearest neighbour in the codebook according to a distance criterion. As would be understood, adding more entries to the codebook would increase the bit rate and the complexity but reduce the average distortion. The codebook entries are typically referred to as codevectors.
The codebook can be constructed in several ways; for example, a training algorithm may be used to optimize the entries according to the source distribution.
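The nearest-neighbour selection described above can be sketched in a few lines of Python. This is a minimal illustration that assumes squared Euclidean distance as the distance criterion and an exhaustive search; the names `squared_distance` and `quantize` and the four-entry codebook are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return (index, codevector) of the codebook entry nearest to vector."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(vector, codebook[i]))
    return index, codebook[index]


# A toy 2-dimensional codebook with four codevectors (2 bits per vector).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
index, codevector = quantize((0.9, 0.2), codebook)
# Only `index` needs to be stored or transmitted; the decoder looks the
# codevector up in its own copy of the codebook.
```

Doubling the number of codebook entries costs one more bit per index and doubles the work of this exhaustive search, which is the rate and complexity trade-off noted above.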
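The text does not name a specific training algorithm. One widely used family is the generalized Lloyd (k-means style) iteration, sketched below in Python as an assumption for illustration: training vectors are assigned to their nearest codevector, and each codevector is then moved to the mean of the vectors assigned to it. The function names, the training data and the initial codebook are all hypothetical.

```python
def nearest(vector, codebook):
    """Index of the codevector closest to vector (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((x - c) ** 2 for x, c in zip(vector, codebook[i])))


def train_codebook(training_vectors, initial_codebook, iterations=10):
    """Refine a codebook with a fixed number of Lloyd iterations."""
    codebook = [tuple(c) for c in initial_codebook]
    for _ in range(iterations):
        # Partition step: collect the training vectors nearest to each entry.
        cells = [[] for _ in codebook]
        for vector in training_vectors:
            cells[nearest(vector, codebook)].append(vector)
        # Centroid step: move each entry to the mean of its cell.
        # An entry with an empty cell is left where it is.
        codebook = [
            tuple(sum(column) / len(cell) for column in zip(*cell)) if cell else old
            for cell, old in zip(cells, codebook)
        ]
    return codebook


# Toy training set with two clusters, one near (0.1, 0) and one near (4, 4).
training = [(0.0, 0.1), (0.2, 0.0), (0.1, -0.1),
            (4.0, 4.1), (3.9, 4.0), (4.1, 3.9)]
trained = train_codebook(training, initial_codebook=[(0.0, 0.0), (1.0, 1.0)])
```

The result depends on the initial codebook, and the iteration only reaches a local optimum. A trained codebook of this kind also has no structure to exploit during search.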
In other known examples a structured codebook can be generated. One such structured codebook approach is lattice vector quantization (lattice or algebraic VQ), in which the codebook is formed by selecting a subset of the lattice points in a given lattice.
A lattice is a linear structure in N dimensions where all points or vectors can be obtained by integer combinations of N basis vectors. In other words all points can be obtained by a weighted sum of basis vectors with signed integer weights. A mathematical expression of any lattice point in a 2-dimensional lattice structure may for example be defined by:
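$$y = k_1 v_1 + k_2 v_2,$$

or

$$y = \begin{bmatrix} y_1 & y_2 \end{bmatrix} = \begin{bmatrix} k_1 & k_2 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} k_1 & k_2 \end{bmatrix} \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix}$$

where the lattice point $y$ is defined by the basis vectors $v$ and the integers $k$. The basis vectors may themselves be formed from the generators $v_{ij}$.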
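The expression above can be checked numerically with a short Python sketch that forms lattice points as integer combinations of the basis vectors. The basis chosen here, rows (2, 0) and (1, 1), is an assumption for illustration; it generates the two-dimensional checkerboard lattice, whose points all have an even coordinate sum. The function name `lattice_point` is hypothetical.

```python
def lattice_point(k, basis):
    """Return y = k[0]*basis[0] + k[1]*basis[1] + ... for integer weights k.

    `basis` is a list of basis vectors (the rows of the generator matrix),
    so this is the row vector k multiplied by that matrix.
    """
    dimension = len(basis[0])
    return tuple(sum(k_i * v_i[j] for k_i, v_i in zip(k, basis))
                 for j in range(dimension))


# Illustrative generator matrix: v1 = (2, 0), v2 = (1, 1).
basis = [(2, 0), (1, 1)]

# y = 1*v1 + (-2)*v2 = (2, 0) + (-2, -2) = (0, -2)
y = lattice_point((1, -2), basis)

# A finite subset of the lattice, obtained by restricting the integer weights.
subset = [lattice_point((k1, k2), basis)
          for k1 in range(-1, 2) for k2 in range(-1, 2)]
```

Restricting the weights as in `subset` is only one way of picking a finite set of lattice points; practical lattice codebooks usually truncate by shape or norm instead.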
Typically the selected subset of lattice points relies on fixed-rate or semi-variable-rate coding (where the vector to be quantized is divided into sub-blocks for which the rate is variable but the overall bit rate for the global vector is fixed). An example of semi-variable-rate coding can be found in the IEEE paper “Low-complexity multi-rate lattice vector quantization with application to wideband TCX speech coding at 32 kbit/s” by Ragot et al. in Acoustics, Speech and Signal Processing, ICASSP '04 proceedings, Vol. 1 Pgs 501-504.
Furthermore, variable rate encoding of the lattice codevectors has been attempted by grouping the codevectors into classes, such as leader classes or shells, as discussed for example in “Indexing and entropy coding of lattice codevectors” by Vasilache et al. in Acoustics, Speech and Signal Processing, ICASSP '01 proceedings, Vol 4 Pgs 2605-2608.
In some approaches variable rate encoding has been achieved by directly applying entropy encoding techniques to the lattice codevector components as discussed in “GMM-Based Entropy-Constrained Vector Quantization” by Zhao et al. in Acoustics, Speech and Signal Processing, ICASSP '07, Vol 4 Pgs 1097-1100.
Furthermore, the lattice is typically not optimally organised with respect to the data used: the choice of the lattice is made independently of any data correlation. There have been examples where the lattice used in lattice quantization was rotated, as disclosed in “Multidimensional Rotations for Robust Quantization of Image Data”, Hung et al, IEEE Transactions on Image Processing, Volume 7, Issue 1, January 1998, Page(s): 1-12. However, that document describes an approach to be used on non-correlated sources, and it would produce non-optimal performance for Gaussian distributed sources and correlated sources.