One of the highest-performance encoding-decoding techniques known at present is frequency transform coding. In coders of that type, the digital signal is initially subjected to time-frequency transformation and it is the coefficients of the frequency transform which are encoded and transmitted.
For identical transmitted signal quality, the saving in transmission rate depends on the choice of transformation used for encoding, with the inverse transformation being performed on decoding. Thus, a modified discrete cosine transform (MDCT) based on time domain aliasing cancellation (TDAC, i.e. eliminating distortion in the time domain), is particularly effective compared with conventional transforms such as the discrete Fourier transform (DFT) or the discrete cosine transform (DCT). For a comparison of the relative performance of these types of processing, reference may be made to the article "Analysis/synthesis filter bank design based on time domain aliasing cancellation" published by J. P. Princen and A. B. Bradley in IEEE Transactions on ASSP, Vol. 34, pp. 1153-1161, October 1986.
When compared with encoding techniques using banks of filters, that technique of processing and encoding appears to be much more robust. Furthermore, although the above-mentioned encoding technique does not appear to be significantly better for relatively low compression ratios (e.g. for a high quality digital audio signal whose data rate is reduced to 128 kbits/s instead of a higher bit rate), use thereof would appear to be practically inevitable at much greater compression ratios. However, under such circumstances, such compression performance gives rise to increased complexity.
To enable coding algorithms that use MDCT to operate in real time, i.e. to enable corresponding dedicated coding processors to be implemented under satisfactory conditions of cost and reliability in operation, it is essential to establish a fast coding and calculation method for said transform, in order to significantly facilitate not only installation and implementation, but also real time operation of the above-mentioned processors.
In the above-mentioned method described by Princen and Bradley, the transform coefficients are defined by:

$$y(m,k) = \sum_{n=0}^{N-1} x_m(n)\, h(N-1-n)\, \cos\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] \qquad (1)$$

with $k=0, \ldots, N-1$. In that equation, $N$ is the size or number of samples $x(n)$ in the transform block, $h(n)$ is the block weighting window obtained by space or time weighting, $m$ is the number of the transformation block, and $x_m(n)$ indicates the n-th signal sample in the m-th block under consideration. The phase shift $n_0$ is required for perfect reconstruction of the original signal.
Because of the relationship $y(k-1)=-y(N-k)$, the number of independent coefficients $y(m,k)$ in the frequency domain is equal to $N/2$. The inverse transform $\mathrm{MDCT}^{-1}$ required for reconstructing or synthesizing the original audio signal cannot therefore deliver the original sequence. Sampling, being imperfect in the frequency domain, gives rise to distortion in the time domain known as "aliasing".
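The redundancy stated above can be checked numerically. The following is a minimal sketch only, assuming that the transform has the cosine-kernel form implied by equation (8), that the phase shift is $n_0=N/4+1/2$, and that $N$ is a multiple of 4; the function name `mdct_direct` and the test signal are hypothetical and serve purely as illustration.

```python
import math

def mdct_direct(x, h, n0):
    """Direct O(N^2) evaluation of
    y(k) = sum_n x(n) * h(N-1-n) * cos(2*pi*(k + 1/2)*(n + n0)/N).
    The kernel form is an assumption inferred from equation (8)."""
    N = len(x)
    return [
        sum(x[n] * h[N - 1 - n]
            * math.cos(2 * math.pi * (k + 0.5) * (n + n0) / N)
            for n in range(N))
        for k in range(N)
    ]

N = 16                      # small block size, multiple of 4 (assumption)
n0 = N / 4 + 0.5            # phase shift of equation (5)
# Sine window of equation (6)
h = [math.sqrt(2) * math.sin(math.pi * (n + 0.5) / N) for n in range(N)]
# Arbitrary illustrative input block
x = [math.sin(0.3 * n) + 0.1 * n for n in range(N)]

y = mdct_direct(x, h, n0)

# Largest deviation from y(k-1) = -y(N-k), for k = 1..N
max_dev = max(abs(y[k - 1] + y[N - k]) for k in range(1, N + 1))
print(max_dev)
```

The deviation stays at the level of floating-point rounding, which is consistent with only $N/2$ of the $N$ coefficients being independent.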
The synthesis can nevertheless be expressed with the aid of a block transform in which consecutive blocks overlap. For signal synthesis of block $m_0$, the n-th sample of the reconstituted audio signal $x^r_{m_0}(n)$ is written:

$$x^r_{m_0}(n) = f(n+N/2)\,X_{m_0-1}(n+N/2) + f(n)\,X_{m_0}(n)$$

for $n=0, \ldots, N/2-1$.
In the above equation, $f(n)$ is the synthesis window and $X_{m_0}(n)$ is the inverse transform of the transform coefficients $y(m_0,k)$, i.e. $X_{m_0}(n)=\mathrm{MDCT}^{-1}[y(m_0,k)]$:

$$X_{m_0}(n) = \frac{1}{N}\sum_{k=0}^{N-1} y(m_0,k)\, \cos\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] \qquad (2)$$

The original signal is accurately reconstructed, $x^r_{m_0}(n)=x_{m_0}(n)$, by imposing conditions on the analysis window $h(n)$ and on the synthesis window $f(n)$, and also on the phase shift term $n_0$.
Thus, the aliasing terms from one block to another have the same modulus and opposite sign, and therefore cancel, provided the analysis and synthesis windows $h(n)$ and $f(n)$ are identical, of equal length, symmetrical about $N/2$ and satisfy the following equations:

$$f(n+N/2)\,h(n) - f(n)\,h(N/2+n) = 0 \qquad (3)$$

$$f^2(n+N/2) + f^2(n) = 2 \qquad (4)$$

for $n=0, \ldots, N/2-1$, with the phase shift given by:

$$n_0 = N/4 + 1/2 \qquad (5)$$
For example, the following analysis and synthesis windows have a 50% inter-block overlap and satisfy the above conditions:

$$h(n) = f(n) = \sqrt{2}\,\sin\left(\pi\,(n+1/2)/N\right) \qquad (6)$$

for $n=0, \ldots, N-1$.
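The window conditions and the resulting cancellation of aliasing can be illustrated with a short numerical check. This is a sketch under stated assumptions rather than the method of the cited article: the forward kernel is the cosine form implied by equation (8), the inverse transform is taken with a $1/N$ scaling (chosen here so that the power condition on the windows gives unit gain), and the names `sine_window`, `mdct` and `imdct` are hypothetical.

```python
import math

def sine_window(N):
    # Window of equation (6): sqrt(2) * sin(pi * (n + 1/2) / N)
    return [math.sqrt(2) * math.sin(math.pi * (n + 0.5) / N) for n in range(N)]

def mdct(block, h, n0):
    # Forward transform; cosine kernel assumed from equation (8)
    N = len(block)
    return [sum(block[n] * h[N - 1 - n]
                * math.cos(2 * math.pi * (k + 0.5) * (n + n0) / N)
                for n in range(N))
            for k in range(N)]

def imdct(y, n0):
    # Inverse transform; the 1/N scaling is an assumption of this sketch
    N = len(y)
    return [sum(y[k] * math.cos(2 * math.pi * (k + 0.5) * (n + n0) / N)
                for k in range(N)) / N
            for n in range(N)]

N = 16
half = N // 2
n0 = N / 4 + 0.5
h = f = sine_window(N)          # identical analysis and synthesis windows

# Residuals of the two window conditions, for n = 0..N/2-1
cond3 = max(abs(f[n + half] * h[n] - f[n] * h[half + n]) for n in range(half))
cond4 = max(abs(f[n + half] ** 2 + f[n] ** 2 - 2) for n in range(half))

# Illustrative signal cut into three blocks of N samples with 50% overlap
signal = [math.sin(0.37 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * half)]
blocks = [signal[m * half: m * half + N] for m in range(3)]
X = [imdct(mdct(b, h, n0), n0) for b in blocks]

# Overlap-add synthesis of the first half of blocks m = 1 and m = 2
recon_err = 0.0
for m in (1, 2):
    for n in range(half):
        xr = f[n + half] * X[m - 1][n + half] + f[n] * X[m][n]
        recon_err = max(recon_err, abs(xr - signal[m * half + n]))

print(cond3, cond4, recon_err)
```

Each inverse-transformed block taken alone contains aliasing; it is only the windowed sum of two consecutive blocks that returns the original samples.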
The main advantage of MDCT lies in the fact that it is possible to use the above-mentioned high-performance windows in the frequency domain without reducing the bit-rate available per coefficient to be encoded. MDCT in association with the above-mentioned windows ensures proper separation of the signal components and thus provides better energy concentration. This improvement in the spectrum representation also facilitates taking account of perceptual phenomena because of the major reduction in spectrum spread, which is not the case when a rectangular window is used, for example.
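The MDCT defined above in equation (1) can be calculated by means of a fast Fourier transform (FFT) of the same size, with the above-mentioned MDCT being written:

$$y(m,k) = \mathrm{Re}\left[ e^{j2\pi(k+1/2)n_0/N} \sum_{n=0}^{N-1} x_m(n)\,h(N-1-n)\,e^{j\pi n/N}\,e^{j2\pi nk/N} \right] \qquad (7)$$

where Re[...] designates the real portion of the complex expression between brackets, i.e.:

$$y(m,k) = \mathrm{Re}\left[ e^{j2\pi(k+1/2)n_0/N} \cdot \mathrm{DFT}\left(x_m(n)\,h(N-1-n)\,e^{j\pi n/N}\right) \right] \qquad (8)$$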
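The equivalence between the cosine form and the DFT form of equation (8) can be sketched as follows. The example assumes a DFT kernel with a positive exponent, $e^{+j2\pi nk/N}$, which is what makes the real part reduce to the cosine sum; a naive $O(N^2)$ DFT stands in for the FFT so that the block needs only the standard library. The function names are hypothetical.

```python
import cmath
import math

def dft_pos(z):
    """Naive DFT with kernel e^{+j*2*pi*n*k/N}.
    The sign convention is an assumption of this sketch; an FFT
    would replace this routine in a real implementation."""
    N = len(z)
    return [sum(z[n] * cmath.exp(2j * math.pi * n * k / N) for n in range(N))
            for k in range(N)]

def mdct_direct(x, h, n0):
    # Cosine-sum form, assumed from equation (8)
    N = len(x)
    return [sum(x[n] * h[N - 1 - n]
                * math.cos(2 * math.pi * (k + 0.5) * (n + n0) / N)
                for n in range(N))
            for k in range(N)]

def mdct_via_dft(x, h, n0):
    N = len(x)
    # Step 1: window the block and pre-multiply by e^{j*pi*n/N}
    z = [x[n] * h[N - 1 - n] * cmath.exp(1j * math.pi * n / N) for n in range(N)]
    # Step 2: length-N DFT
    Z = dft_pos(z)
    # Step 3: real part of the term-by-term product with e^{j*2*pi*(k+1/2)*n0/N}
    return [(cmath.exp(2j * math.pi * (k + 0.5) * n0 / N) * Z[k]).real
            for k in range(N)]

N = 16
n0 = N / 4 + 0.5
h = [math.sqrt(2) * math.sin(math.pi * (n + 0.5) / N) for n in range(N)]
x = [math.cos(0.7 * n) - 0.05 * n for n in range(N)]   # illustrative block

max_diff = max(abs(a - b)
               for a, b in zip(mdct_direct(x, h, n0), mdct_via_dft(x, h, n0)))
print(max_diff)
```

With a library FFT that uses the negative-exponent convention, the same result would be obtained from the unnormalized inverse transform, or by conjugating the input and the output.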
It will nevertheless be observed that direct application of above equation (8) requires the following number of calculation operations:
calculation of $x(n)\,e^{j\pi n/N}$, i.e. N products for the real part and N products for the imaginary part;
an FFT of length N; and
calculation of the real portion of the term-by-term product of two complex vectors of length N.
For $N=2^p$ and for the most efficient calculation algorithm presently known, such as that described in the article "Implementation of split Radix FFT algorithms for complex, real and real symmetric data" by Pierre Duhamel, published in IEEE Transactions on ASSP, Vol. 34, pp. 285-295, April 1986, calculating equation (8) requires $2N+N(p-3)+4+2N$ multiplications and $3N(p-1)+4+N$ additions, i.e. a total of $N(4p-1)+8$ operations, ignoring the space-time weighting operations required for the windowing. These numbers, which are large when the size of the transform blocks is large, e.g. N=1024 for encoding-decoding a digital audio signal, can penalize the use of MDCT in an encoding-decoding method for use in real time.
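To give an order of magnitude, the operation counts quoted above can be evaluated for a given block size. The sketch below simply transcribes those expressions; the function name `direct_form_op_counts` is hypothetical.

```python
def direct_form_op_counts(p):
    """Operation counts for the direct use of equation (8) with N = 2**p,
    using the split-radix FFT figures quoted in the text."""
    N = 2 ** p
    # Pre-multiplication (2N) + FFT (N(p-3)+4) + real part of product (2N)
    mults = 2 * N + (N * (p - 3) + 4) + 2 * N
    # FFT (3N(p-1)+4) + real part of product (N)
    adds = (3 * N * (p - 1) + 4) + N
    return N, mults, adds, mults + adds

N, mults, adds, total = direct_form_op_counts(10)
print(N, mults, adds, total)   # 1024 11268 28676 39944
```

For N=1024 this gives 11268 multiplications and 28676 additions per block, and the total of 39944 agrees with $N(4p-1)+8$.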