“Transform coding” consists of coding time-domain signals in the transform (frequency) domain. The transform makes it possible to exploit the frequency characteristics of audio signals (music, speech, or other) in order to improve coding efficiency. For example, a harmonic sound is represented in the frequency domain by a small, finite number of spectral lines, which can therefore be coded concisely. Frequency masking effects are also used to advantage, for example to shape the coding noise so that it is as inaudible as possible.
A standard transform coding technique is summarized as follows.
The digital audio stream to be coded (at a given sampling frequency Fs) is cut up into frames (or, more generally, “blocks”) of a finite number 2M of samples. Each frame conventionally overlaps the preceding frame by 50%. A weighting window ha (called the “analysis window”) is applied to each frame.
A transform is then applied to the signal. In the case of a transform called “MDCT” (“Modified Discrete Cosine Transform”), and in a particular embodiment, the weighted frame is “folded” by a transform that maps 2M samples to M samples. A type-IV DCT is then applied to the folded frame in order to obtain a frame of size M in the transformed domain.
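The folding step followed by a type-IV DCT can be sketched as below. This is a minimal illustration and not the embodiment itself: the function names, the sign convention of the fold, the unnormalised DCT-IV and the requirement that M be even are all assumptions made for the example. The sketch compares the fold-plus-DCT-IV route with an MDCT evaluated directly from its cosine sum.

```python
import numpy as np

def dct_iv(u):
    """Unnormalised DCT-IV evaluated directly from its definition (O(M^2))."""
    M = len(u)
    idx = np.arange(M) + 0.5
    return np.cos(np.pi / M * np.outer(idx, idx)) @ u

def fold(frame):
    """Fold a weighted frame of 2M samples into M samples (M assumed even)."""
    a, b, c, d = np.split(np.asarray(frame, dtype=float), 4)  # quarters of M/2 samples
    return np.concatenate([-c[::-1] - d, a - b[::-1]])

def mdct_direct(frame):
    """Reference MDCT: 2M input samples to M coefficients, by direct summation."""
    M = len(frame) // 2
    n = np.arange(2 * M)
    k = np.arange(M)
    basis = np.cos(np.pi / M * np.outer(k + 0.5, n + 0.5 + M / 2))
    return basis @ frame

# Hypothetical weighted frame with 2M = 16 samples
frame = np.random.default_rng(0).standard_normal(16)
print(np.allclose(dct_iv(fold(frame)), mdct_direct(frame)))
```

In practice a fast DCT-IV would replace the matrix product; the matrix form is used here only because it is easy to check against the definition.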
The frame in the transformed domain is then quantized using a suitable quantizer. The quantization makes it possible to reduce the size of the data, but introduces a noise (audible or not) into the original frame. The higher the bit rate of the coder, the more this noise is reduced and the closer the quantized frame comes to the original frame.
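As an illustration of the trade-off between bit rate and quantization noise, the sketch below uses a uniform scalar quantizer. This is an assumption: the text only calls for “a suitable quantizer”, and the step size stands in here for the bit rate (a smaller step needs more bits). The coefficient values are made up.

```python
import numpy as np

def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest multiple of `step`."""
    return np.round(np.asarray(coeffs) / step).astype(int)

def dequantize(indices, step):
    """Rebuild approximate coefficient values from the quantizer indices."""
    return np.asarray(indices) * step

coeffs = np.array([0.83, -0.41, 0.07, 0.002])  # made-up transform coefficients
for step in (0.5, 0.05):
    restored = dequantize(quantize(coeffs, step), step)
    # The reconstruction error of a uniform quantizer never exceeds step / 2
    print(step, np.max(np.abs(restored - coeffs)))
```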
Upon decoding, an inverse MDCT transform is applied to the quantized frame. The quantized frame of size M is converted into a frame of size M in the time domain by using an inverse type-IV DCT. A second transform, which “unfolds” M samples into 2M samples, is then applied to the time-domain frame of size M.
So-called “synthesis” weighting windows hs are then applied to the frames of size 2M.
The decoded audio stream is then synthesized by aggregating the overlapping parts.
For a given synthesis window and a given overlap, an analysis window can be determined that yields a perfect reconstruction of the signal to be coded (in the absence of quantization).
A window conventionally used in transform coding is a window of sinusoidal type that is identical both in analysis and in synthesis. In this configuration, the minimum algorithmic delay introduced by the coding system is 2M/Fs seconds.
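The complete chain (analysis window, MDCT, inverse MDCT, synthesis window, overlap-add) can be sketched as follows for the sinusoidal window, with quantization left out. The function names, the matrix form of the transform and the 2/M scaling of the inverse are choices made for this example; the scaling is the one that goes with a window whose squared values at n and n+M sum to 1.

```python
import numpy as np

def sine_window(M):
    """Sinusoidal window of length 2M, used for both analysis and synthesis."""
    return np.sin(np.pi / (2 * M) * (np.arange(2 * M) + 0.5))

def mdct_basis(M):
    """M x 2M matrix of MDCT cosines."""
    n = np.arange(2 * M)
    k = np.arange(M)
    return np.cos(np.pi / M * np.outer(k + 0.5, n + 0.5 + M / 2))

def code_decode(x, M, ha, hs):
    """Window, transform, inverse transform, window again and overlap-add."""
    C = mdct_basis(M)
    y = np.zeros(len(x))
    for t in range(len(x) // M - 1):               # frames advance by M samples (50% overlap)
        frame = x[t * M : t * M + 2 * M]
        coeffs = C @ (ha * frame)                  # analysis window + MDCT (M coefficients)
        decoded = (2.0 / M) * (C.T @ coeffs)       # inverse MDCT back to 2M samples
        y[t * M : t * M + 2 * M] += hs * decoded   # synthesis window + overlap-add
    return y

M = 16
x = np.random.default_rng(0).standard_normal(8 * M)  # hypothetical test signal
w = sine_window(M)
y = code_decode(x, M, w, w)
# The first and last M samples are covered by one frame only, so compare the interior
print(np.allclose(y[M:-M], x[M:-M]))
```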
To reduce this delay, it is possible to impose zeros at the start of the synthesis window and at the end of the analysis window. Since the result of a multiplication of the signal by “0” is known in advance, it is possible to offset the frame rate relative to the position of the windows. These symmetrical windows consist, for example, of:
- a certain number of zeros Mz which extend over an interval corresponding to half of the algorithmic delay that is to be saved,
- a sinusoidal rising section of length M−2Mz,
- a section of 2Mz values at 1,
- the second half of the window finally being the symmetrical reflection of the first, as illustrated in the appended FIG. 1.
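A window of this kind can be sketched as below. Two assumptions are made for the example: the exact formula of the rising section (the text only states that it is sinusoidal and of length M−2Mz), and the placement of the 2Mz ones as a central plateau with Mz ones on each side of the midpoint, so that each half holds exactly M samples. The function name is hypothetical and Mz is assumed smaller than M/2.

```python
import numpy as np

def low_delay_symmetric_window(M, Mz):
    """Symmetric window of length 2M with Mz zeros at each end (assumes Mz < M / 2)."""
    rise_len = M - 2 * Mz
    n = np.arange(rise_len)
    rise = np.sin(np.pi / 2 * (n + 0.5) / rise_len)  # assumed quarter-sine rise from 0 to 1
    half = np.concatenate([np.zeros(Mz), rise, np.ones(Mz)])
    return np.concatenate([half, half[::-1]])        # second half mirrors the first

M, Mz = 32, 4                                        # hypothetical sizes
w = low_delay_symmetric_window(M, Mz)
# Same window in analysis and synthesis: squared values at n and n + M sum to 1
print(np.allclose(w[:M] ** 2 + w[M:] ** 2, 1.0))
```

With this layout, 4Mz of the 2M samples are fixed in advance (2Mz zeros and 2Mz ones), leaving only the two sinusoidal sections free.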
These windows have an algorithmic delay of (2M−2Mz)/Fs seconds and thus make it possible to reduce the delay by 2Mz/Fs seconds.
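As a worked example of these delay figures, the short sketch below evaluates (2M−2Mz)/Fs in milliseconds. The values M = 320, Fs = 16000 Hz and Mz = 80 are hypothetical and chosen only to give round numbers; the function name is likewise an assumption.

```python
def algorithmic_delay_ms(M, Fs, Mz=0):
    """Algorithmic delay in milliseconds for symmetric windows with Mz zeros per end."""
    return 1000.0 * (2 * M - 2 * Mz) / Fs

print(algorithmic_delay_ms(320, 16000))         # 40.0 (no zeros: 2M/Fs)
print(algorithmic_delay_ms(320, 16000, Mz=80))  # 30.0 (saves 2Mz/Fs = 10 ms)
```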
However, while such a technique makes it possible to reduce the delay, the window tends to resemble a rectangular window as the delay reduction increases. Such a window shape is not very frequency selective and ultimately lowers the audio quality of the encoded signal drastically. In addition, the technique greatly constrains the window, because 4Mz samples are imposed in its construction. Few degrees of freedom remain for designing windows that are effective for coding, notably windows that offer significant frequency selectivity.
The document WO-2009/081003 has proposed using asymmetrical windows to mitigate this problem. On the analysis side, these windows contain zeros only at the end of the analysis window. In order to limit the required storage space, the synthesis window is chosen to be the temporal reversal of the analysis window. This technique notably makes it possible to reduce the encoding delay, as well as the decoding delay. With a total number of zeros Mz half that of the symmetrical windows previously described, the delay gain is the same. Given the reduced number of zeros, the frequency selectivity of such asymmetrical windows is greater than that of the symmetrical windows. The audio quality of the decoded signal is thereby enhanced.
More particularly, document WO-2009/081003 presents an analysis window ha(n) made up of two parts ha1 and ha2 from an initial window h(n) given by:
$$h(n) = \sin\left[\frac{\pi}{2M - M_z}\left(n + \frac{1}{2}\right)\right] \quad \text{for } 0 \le n < 2M - M_z$$
and h(n)=0 otherwise (that is to say, for 2M−Mz ≤ n < 2M), and a correction factor Δ(n) making it possible to satisfy the perfect reconstruction condition, given by:
$$\Delta(n) = \sqrt{h(n)\,h(2M-1-n) + h(n+M)\,h(M-1-n)}$$
The analysis window ha is given by:
ha1(n+M) = h(n+M)/Δ(n) and ha2(n) = h(n)/Δ(n), for 0 ≤ n < M
The synthesis window hs(n) is the temporal reversal of the analysis window:
hs(2M−1−n) = ha(n), for 0 ≤ n < 2M
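The construction of these asymmetrical windows can be sketched as follows, directly from the formulas for h(n), Δ(n), ha and hs. The function name is hypothetical, and the sketch assumes Mz ≤ M/2 so that Δ(n) never vanishes. It then checks the product condition ha(n)hs(n) + ha(n+M)hs(n+M) = 1 for 0 ≤ n < M, which is the condition the correction factor is built to enforce.

```python
import numpy as np

def asymmetric_windows(M, Mz):
    """Analysis and synthesis windows of length 2M (sketch, assumes Mz <= M / 2)."""
    L = 2 * M - Mz                                   # length of the non-zero part of h
    idx = np.arange(2 * M)
    h = np.where(idx < L, np.sin(np.pi / L * (idx + 0.5)), 0.0)
    n = np.arange(M)
    delta = np.sqrt(h[n] * h[2 * M - 1 - n] + h[n + M] * h[M - 1 - n])
    ha = np.concatenate([h[:M] / delta, h[M:] / delta])  # ha2 (first half), then ha1
    hs = ha[::-1]                                    # synthesis = time-reversed analysis
    return ha, hs

M, Mz = 64, 32                                       # hypothetical sizes, Mz > M/4
ha, hs = asymmetric_windows(M, Mz)
print(np.allclose(ha[:M] * hs[:M] + ha[M:] * hs[M:], 1.0))  # product condition
print(ha.max())                                      # peak value of the analysis window
```

For these example sizes the printed peak exceeds 1, because Δ(n) becomes small near n = Mz−1 and the division amplifies the window there.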
Such windows are, for one and the same delay gain, of better quality than symmetrical windows because of their better frequency selectivity.
However, even though this prior art is advantageous and improves quality compared to the preceding techniques, an audible degradation is observed when a more significant delay gain is sought, for example with a number of zeros Mz greater than M/4 (where M is a frame duration). This degradation can notably be explained by the fact that a portion of the window takes high values, much greater than 1, as illustrated in FIG. 2. Now, it is generally preferable, in digital signal processing, to use weightings whose absolute values are less than 1, because of fixed-point implementation.