For modulated transforms, prototype filters are usually defined and then modulated to different frequency values. This provides a set of channels representing the signal at different “frequency” positions.
The modulations can be done by an operation of the type h_k(n) = h(n)·W(k,n), where: n is a time index corresponding to a multiple of the sampling period; k is an index representing a frequency channel; and L is the length of the filter (and of the modulation).
In addition, in the above expression: h(n) (where 0 ≤ n < L) defines the prototype filter, which may have complex values; W(k,n) (where 0 ≤ n < L) defines the modulating function for the channel k, which may also have complex values; and h_k(n) (where 0 ≤ n < L) defines the modulated filter for the channel k.
To perform a signal analysis, for encoding for example, the signal x(n) to be analyzed is projected on the modulated filter by a scalar product operation:
$$ y_k = \langle x, h_k \rangle = \sum_{n=0}^{L-1} x(n) \cdot h_k(n). $$
The analyzed signals can be the result of several projections, for example in the form y_k = ⟨x, h_k⟩ + λ⟨x, h′_j⟩, where λ, h′_j and j are respectively a gain, a modulation, and a frequency index, with these latter values possibly different from h_k and k.
These analysis operations can be successive over time, resulting in a series of signals y_k evolving over time.
Thus one can write:
$$ y_{k,m} = \langle x_m, h_k \rangle = \sum_{n=0}^{L-1} x(n + mT) \cdot h_k(n), $$
with m indicating an index of blocks of successive samples (a “frame”) and T defining the duration of a frame (as the number of samples).
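The frame-by-frame projection above can be sketched as follows. This is a minimal illustration for real-valued signals only; the function names, the flat prototype filter, and the cosine modulation used in the toy setup are assumptions chosen for the example, not values taken from this description.

```python
import math

def modulated_filter(h, W, k):
    """Return h_k(n) = h(n) * W(k, n) for 0 <= n < L."""
    return [h[n] * W(k, n) for n in range(len(h))]

def analyze(x, h, W, M, T):
    """Return y[m][k] = sum_n x(n + m*T) * h_k(n) for every complete frame m."""
    L = len(h)
    filters = [modulated_filter(h, W, k) for k in range(M)]
    n_frames = (len(x) - L) // T + 1 if len(x) >= L else 0
    y = []
    for m in range(n_frames):
        frame = x[m * T : m * T + L]
        # Scalar product of the frame with each modulated filter.
        y.append([sum(xn * hn for xn, hn in zip(frame, hk)) for hk in filters])
    return y

# Hypothetical toy setup: M = 4 channels, L = 8 taps, frame advance T = 4.
M, L, T = 4, 8, 4
h = [1.0] * L                      # flat prototype, for illustration only
W = lambda k, n: math.cos(math.pi / M * (n + 0.5) * (k + 0.5))  # assumed modulation
x = [float(n) for n in range(16)]
y = analyze(x, h, W, M, T)         # 3 frames of 4 components each
```

Each row of `y` holds the M components of one frame; successive rows correspond to successive values of m.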
Modulated transforms also have applications in signal synthesis. For this type of application, content will be generated in a certain number of frequency channels and these channels will be put together to reconstruct a digital signal.
A signal x̂(n) is thus synthesized by projection of the transformed signals y_k onto the M synthesis vectors. First an expression x̃(n) is defined such that:
$$ \tilde{x}(n) = \langle y, h_k \rangle = \sum_{k=0}^{M-1} y_k \cdot h_k(n) \quad \text{for } 0 \le n < L. $$
The signals y_k can evolve over time, so that the synthesis will allow generating a signal of an arbitrary length:
$$ \tilde{x}(n + mT) = \langle y_m, h_k \rangle = \sum_{k=0}^{M-1} y_{k,m} \cdot h_k(n) \quad \text{for } 0 \le n < L. $$
The vectors defined by the expressions x̃, for 0 ≤ n < L, are shifted by M samples then added together, to yield a synthesized signal x̂. This is called an overlap add.
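The synthesis of one frame and the overlap add can be sketched as below. The function names and the `hop` parameter (the shift between successive frames, M samples in the text above) are assumptions introduced for illustration, and the toy frames are arbitrary constants rather than real transform outputs.

```python
def synthesize_frame(y_m, filters):
    """x~(n) = sum_k y_k * h_k(n) for 0 <= n < L; filters[k][n] holds h_k(n)."""
    L = len(filters[0])
    return [sum(y_m[k] * filters[k][n] for k in range(len(filters)))
            for n in range(L)]

def overlap_add(frames, hop):
    """Shift frame m by m*hop samples and add the overlapping parts."""
    L = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + L)
    for m, frame in enumerate(frames):
        for n, value in enumerate(frame):
            out[m * hop + n] += value
    return out

# Toy check with L = 4 and a shift of 2 samples between frames.
frames = [[1.0] * 4, [2.0] * 4, [3.0] * 4]
x_hat = overlap_add(frames, hop=2)
print(x_hat)  # [1.0, 1.0, 3.0, 3.0, 5.0, 5.0, 3.0, 3.0]
```

Only the first and last `hop` samples receive a single contribution; every other output sample is the sum of two overlapping frames.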
Modulated transforms advantageously have applications in signal encoding.
In frequency encoding systems, an analysis transform is performed by means of modulated analysis filters h_k where:
$$ y_{k,m} = \langle x_m, h_k \rangle = \sum_{n=0}^{L-1} x(n + mT) \cdot h_k(n). $$
The signals y_{k,m} carrying useful information (the usefulness can be judged for example using a perceptual distortion criterion) are then approximated and sent in encoded form.
At the decoder, the approximated components y_{k,m} received are synthesized by reverse transformation to restore an approximation of the original samples.
The synthesis is done by way of a set of modulated synthesis filters f_k:
$$ \tilde{x}(n + mT) = \langle y_m, f_k \rangle = \sum_{k=0}^{M-1} y_{k,m} \cdot f_k(n). $$
Then the overlap add operation is performed to obtain a reconstructed and decoded signal x̂.
An interesting class of modulated transforms is defined by perfect reconstruction transforms.
With these transforms, when the transformed components y_k are not modified, decoding yields a signal that substantially corresponds, or perfectly corresponds in the case of perfect reconstruction, to the initial signal, aside from a delay R, meaning x̂(n) = x(n−R).
The reconstruction can also be an “almost perfect” reconstruction when the difference between the original signal x and the reconstructed signal x̂ can be considered to be negligible. For example, in audio encoding, a difference having an error magnitude 50 dB lower than the magnitude of the processed signal x can be considered to be negligible.
The most commonly used transforms are ELT (Extended Lapped Transforms), which provide a perfect reconstruction and which use a filter of length L=2·K·M. The MDCT transforms (Modified Discrete Cosine Transforms), which are MLT (Modulated Lapped Transforms), are the special case where K=1.
Quadrature Mirror Filters (QMF), or Pseudo Quadrature Mirror Filters (PQMF), are an almost perfect reconstruction solution using different modulation terms.
These different transforms can have real or complex coefficients. They may or may not use symmetric prototype filters.
In order to satisfy the condition of perfect or almost perfect reconstruction, for any form of processed signal, the modulated analysis and synthesis filters must be linked to each other. Thus, relations link the modulation terms and the prototype filters used in the analysis and synthesis. In cosine-modulated systems (MDCT, ELT, PQMF, or other systems), the modulation terms in the analysis and synthesis are linked, for example in the form: W(k,n)=W′(k,n+φ), W and W′ indicating the modulations used in the analysis and synthesis respectively and φ indicating a phase shift term.
A commonly used special case is defined by φ=0. The modulations are then identical in analysis and synthesis.
The prototype filters for analysis and synthesis can also be linked to each other to ensure (almost) perfect reconstruction, with a constraint of the following frequently used type: h(L−1−n)=f(n), where h and f are the prototype filters used in the analysis and synthesis.
The modulations W are constrained to ensure perfect reconstruction. For example, one can generally choose for ELT transforms:
$$ W(k,n) = \cos\left[ \frac{\pi}{M} \left( n + \frac{1+M}{2} \right) \left( k + \frac{1}{2} \right) \right], $$
where 0 ≤ n < L and 0 ≤ k < M, with L=2·K·M.
Similarly, the prototype filters are constrained to ensure perfect reconstruction, with, for example, a constraint of the type:
$$ \sum_{i=0}^{2K-2s-1} f(n+iM)\, h(n+iM+2sM) = \delta(s), \quad \text{for } s = 0, 1, \ldots, K-1. $$
In particular, the prototype filters can be selected from among the following:
those defined analytically in the form of an equation; in this class, a filter commonly used for the MDCT transform (with K=1) is expressed by:
$$ h(n) = f(n) = \sin\left[ \frac{\pi}{2M} (n + 0.5) \right], $$
with 0 ≤ n < L where L=2·M;
those resulting from a numerical optimization according to a criterion which does not allow deducing an analytical function, such as, for example, a filter which can be obtained by minimizing a quantity under the perfect reconstruction constraint (this quantity can be a stop-band attenuation from a cutoff frequency, or a coding gain, or more generally any other quantity judged to be relevant to the encoding quality).
As was mentioned above, the prototype filters may or may not be symmetric. The symmetry relation is written as follows: h(L−1−n)=h(n).
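As a numerical illustration of the constraints above, the following sketch takes the MDCT case (K=1) with the sine prototype filter and the cosine modulation given earlier, checks the symmetry relation and the reconstruction constraint for s=0, and then runs a direct (non-fast) analysis, synthesis and overlap add. The gain 2/M applied at synthesis is an assumption of this sketch: the expressions in this description carry no normalization factor, and this is the value that makes the round trip exact for this modulation. The function names and the test signal are likewise illustrative.

```python
import math

def sine_window(M):
    """h(n) = f(n) = sin[pi/(2M) * (n + 0.5)] for 0 <= n < 2M."""
    return [math.sin(math.pi / (2 * M) * (n + 0.5)) for n in range(2 * M)]

def W(k, n, M):
    """Cosine modulation W(k, n) from the text, same in analysis and synthesis."""
    return math.cos(math.pi / M * (n + (1 + M) / 2) * (k + 0.5))

def mdct_round_trip(x, M):
    """Direct analysis, synthesis and overlap add with a shift of M samples."""
    h = f = sine_window(M)
    L = 2 * M
    out = [0.0] * len(x)
    for start in range(0, len(x) - L + 1, M):
        y = [sum(x[start + n] * h[n] * W(k, n, M) for n in range(L))
             for k in range(M)]
        for n in range(L):
            x_tilde = sum(y[k] * f[n] * W(k, n, M) for k in range(M))
            out[start + n] += (2.0 / M) * x_tilde   # assumed normalization gain
    return out

M = 8
h = sine_window(M)
# Constraint for K = 1, s = 0: f(n)h(n) + f(n+M)h(n+M) = 1 for 0 <= n < M.
pr_error = max(abs(h[n] * h[n] + h[n + M] * h[n + M] - 1.0) for n in range(M))
# Symmetry relation h(L-1-n) = h(n).
sym_error = max(abs(h[2 * M - 1 - n] - h[n]) for n in range(2 * M))
x = [math.sin(0.3 * n) + 0.1 * n for n in range(6 * M)]
x_hat = mdct_round_trip(x, M)
# Only samples covered by two overlapping frames are fully reconstructed.
rec_error = max(abs(x_hat[n] - x[n]) for n in range(M, 5 * M))
```

The first and last M samples are covered by a single frame, so they are excluded from the comparison; all three error figures are at the level of floating-point rounding.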
The modeling of the modulated transforms implementation described above is provided for illustrative purposes. In actual implementations of these transforms, the described calculations are not carried out in this form. For reasons of computational efficiency, time, and use of computational resources, “fast” implementations are used. These implementations do not literally apply the calculations presented above, but the calculations remain valid.
The modulated transforms as presented below are defined by fast algorithms allowing an implementation that makes efficient use of computational resources. These algorithms are based on fast Fourier transforms or are derived from them, such as fast cosine or sine transforms (for example type IV DCT transforms).
A transform order for the fast algorithm which is less than or equal to the number M of frequency components is sufficient for the implementation of these transforms. These transformations are efficient because their complexity grows as M·log2(M), that is, in proportion to log2(M) per component.
An operation reducing L samples to a number less than or equal to M components is applied prior to the fast transform.
A complete algorithm for the analysis transformation can combine:
multiplying a block of L samples by the prototype filter;
combining the results of this multiplication, by a linear combination based on additions and coefficient multiplications, so as to deduce from the L weighted values a number of components less than or equal to M; and
applying a fast transform of an order less than or equal to M.
These operations are done in reverse order to perform the synthesis transformation.
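The three analysis steps listed above can be sketched for the MDCT case (K=1, L=2M). The folding used here is the commonly known relation between the MDCT and the type IV DCT, assumed for illustration and not necessarily the linear combination used in any particular implementation; it requires M to be even. The type IV DCT is computed directly in O(M²) operations to keep the sketch short, where a real implementation would use a fast routine. All function names are hypothetical.

```python
import math

def dct_iv(v):
    """Direct O(M^2) type IV DCT, standing in for a fast routine."""
    M = len(v)
    return [sum(v[n] * math.cos(math.pi / M * (n + 0.5) * (k + 0.5))
                for n in range(M)) for k in range(M)]

def fold(u):
    """Linear combination reducing L = 2M weighted samples to M values (M even)."""
    M = len(u) // 2
    half = M // 2
    a, b, c, d = u[:half], u[half:M], u[M:M + half], u[M + half:]
    first = [-c[half - 1 - i] - d[i] for i in range(half)]   # -reversed(c) - d
    second = [a[i] - b[half - 1 - i] for i in range(half)]   # a - reversed(b)
    return first + second

def mdct_fast_structure(frame, h):
    weighted = [xn * hn for xn, hn in zip(frame, h)]  # step 1: prototype filter
    return dct_iv(fold(weighted))                     # steps 2 and 3

def mdct_direct(frame, h):
    """Reference: scalar products with the modulated filters h(n) * W(k, n)."""
    M = len(frame) // 2
    return [sum(frame[n] * h[n]
                * math.cos(math.pi / M * (n + (1 + M) / 2) * (k + 0.5))
                for n in range(2 * M)) for k in range(M)]

M = 8
h = [math.sin(math.pi / (2 * M) * (n + 0.5)) for n in range(2 * M)]
frame = [math.cos(0.7 * n) + 0.05 * n * n for n in range(2 * M)]
max_diff = max(abs(p - q) for p, q in
               zip(mdct_fast_structure(frame, h), mdct_direct(frame, h)))
```

The two routes agree to rounding error, which illustrates why the direct expressions remain valid as a description even when only the structured form is computed.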
FIG. 1 illustrates the analysis and synthesis as described above. The signal x is presented to an encoder COD comprising a prototype filter φ. A block of L samples of the signal is then multiplied by the prototype filter in the module MULT1. Then, in the module CL1, a linear combination of the samples multiplied by the prototype filter is performed in order to change from L samples to M components. Then, a fast transform TR1 is performed, before the samples are sent to a decoder DECOD.
Upon receipt of the samples, the decoder applies a fast transform TR2. Then, conversely to what was done in the encoder, a linear combination CL2 is performed to return to the initial number L of samples. These samples are then multiplied by the prototype filter of the decoder DECOD in order to reconstruct a signal x̃, from which the signal x̂ is obtained by an overlap add operation.
The coefficients of the prototype filter, of the encoder or decoder, must be stored in memory in order to perform the analysis or synthesis transform. Obviously, in a particular system using modulated transforms of different sizes, the prototype filter for each of the sizes used must be represented in memory.
In the favorable case where the filters are symmetric, only L/2 coefficients need to be stored, the other L/2 being determined from these stored coefficients without any arithmetic operation. Thus, for an MDCT (K=1), if transforms of sizes M and 2M are needed, then (M+2M)=3M coefficients must be stored if the prototypes are symmetric and (2M+4M)=6M otherwise. A typical example for audio encoding is M=320 or M=1024. Thus, for the asymmetrical case this requires storing 1920 and 6144 coefficients respectively.
Depending on the accuracy desired for the coefficient representation, 16 bits or even 24 bits are necessary for each coefficient. This implies significant memory requirements for low cost processors.
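The storage figures above can be reproduced with a few lines of arithmetic. The helper names below are hypothetical, and the byte counts simply assume that each coefficient occupies exactly its stated word length.

```python
def stored_coefficients(sizes, K=1, symmetric=True):
    """Count prototype coefficients to store for transforms of the given sizes M."""
    total = 0
    for M in sizes:
        L = 2 * K * M                       # filter length L = 2*K*M
        total += L // 2 if symmetric else L  # half suffices for a symmetric filter
    return total

def rom_bytes(n_coefficients, bits_per_coefficient):
    """Storage in bytes for a given word length (16 or 24 bits in the text)."""
    return n_coefficients * bits_per_coefficient // 8

asym_320 = stored_coefficients([320, 640], symmetric=False)
asym_1024 = stored_coefficients([1024, 2048], symmetric=False)
sym_320 = stored_coefficients([320, 640], symmetric=True)
print(asym_320, asym_1024, sym_320)                        # 1920 6144 960
print(rom_bytes(asym_320, 16), rom_bytes(asym_1024, 24))   # 3840 18432
```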
If we have a prototype filter for a transform of size UM, then it is possible to obtain coefficients for the transform of size M by decimation. Conventionally, this consists of keeping one filter coefficient out of every U.
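The decimation can be sketched as follows, using the sine prototype filter as an example. The `offset` parameter (which of the U coefficients is kept) is an assumption of this sketch, since the text does not specify it. With an odd U and the central coefficient of each group kept, the decimated sine filter coincides with the sine filter of size M; for an even U the sampling positions do not line up in this way, and that case is not shown.

```python
import math

def sine_prototype(M):
    """Sine prototype filter of length 2M for a transform of size M."""
    return [math.sin(math.pi / (2 * M) * (n + 0.5)) for n in range(2 * M)]

def decimate(h_big, U, offset=0):
    """Keep one coefficient out of every U, starting at index `offset`."""
    return h_big[offset::U]

M, U = 160, 3
h_big = sine_prototype(U * M)                     # filter for size UM = 480
h_small = decimate(h_big, U, offset=(U - 1) // 2) # central coefficient of each group
err = max(abs(p - q) for p, q in zip(h_small, sine_prototype(M)))
```

Here `h_small` has 2M = 320 coefficients and `err` is at the level of floating-point rounding, so only the larger filter needs to be stored in this particular case.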
However, if we only have a filter for the transform of size M, it is not as simple to extend this filter for use with a transform of size UM. A direct method of polynomial interpolation does not maintain the reconstruction accuracy at the same level as for the base transform of size M. This type of method is therefore not optimal.
When a core encoding system is implemented in an encoder, it can be useful to extend it, for example when updating a standardized version of the encoding system. For example, standardized ITU G.718 and ITU G.729.1 encoders rely on MDCT modulated transforms of respective sizes M=320 and M=160. In an extension of these standards to operate these encoders at a higher sampling frequency, referred to as a super-wideband extension, MDCTs of greater sizes are necessary. An MDCT of size M′=640 must be applied in this extension.
In an extension according to the related art, the amount of storage for expressing coefficients of a new prototype filter would have to be extended. In addition, intervention at the encoder would be necessary in order to store the coefficients.
Embodiments of the invention offer a way of saving memory, in ROM for storing the coefficients and/or in RAM for the transform calculation.