1. Field of the Invention
The present invention relates to a system and method for producing modulated complex lapped transforms (MCLTs), and in particular, a system and method for incorporating complex coefficients into modulated lapped transforms (MLTs) to derive MCLTs.
2. Related Art
In many engineering and scientific applications, it is desirable to analyze a signal in the frequency domain or represent the signal as a linear superposition of various sinusoids. The analysis of the amplitudes and phases of such sinusoids (the signal spectrum) can be useful in multimedia applications for operations such as noise reduction, compression, and pattern recognition, among other things. The Fourier transform is a classical tool used for frequency decomposition of a signal. The Fourier transform breaks a signal down into its component frequencies. However, its usefulness is limited to stationary signals, i.e., signals whose spectral patterns do not change appreciably with time. Since most real-world signals, such as audio and video signals, are not stationary, localized frequency decompositions, such as time-frequency transforms, are used instead. These transforms provide spectral information that is localized in time.
One such transform is the discrete cosine transform (DCT). The DCT breaks a signal down to component frequencies. For instance, a block of M samples of the signal can be mapped to a block of M frequency components via a matrix of M×M coefficients. To ensure a good energy compaction performance, the DCT approximates the eigenvectors of the autocorrelation matrix of typical signal blocks. Basis functions for the DCT (for type II) can be defined as:

$$a_{nk} = c(k)\,\sqrt{\frac{2}{M}}\,\cos\!\left[\left(n+\frac{1}{2}\right)\frac{k\pi}{M}\right] \tag{A}$$
where $a_{nk}$ is the element of the transformation matrix A in the nth row and kth column, or equivalently, the nth sample of the kth basis function. For orthonormality, the scaling factors are chosen as:

$$c(k) \equiv \begin{cases} 1/\sqrt{2} & \text{if } k = 0 \\ 1 & \text{otherwise} \end{cases}$$
The transform coefficients X(k) are computed from the signal block samples x(n) by:

$$X(k) = \sum_{n=0}^{M-1} a_{nk}\,x(n)$$
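The DCT-II definitions above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the example block are illustrative assumptions, and the direct matrix evaluation is chosen for clarity rather than speed.

```python
import math

def dct2_basis(M):
    """Return the M x M matrix a[n][k] of DCT-II basis functions."""
    def c(k):
        # Scaling factor chosen for orthonormality.
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    return [[c(k) * math.sqrt(2.0 / M) * math.cos((n + 0.5) * k * math.pi / M)
             for k in range(M)] for n in range(M)]

def dct2(x):
    """Forward transform: X(k) = sum over n of a[n][k] * x(n)."""
    M = len(x)
    a = dct2_basis(M)
    return [sum(a[n][k] * x[n] for n in range(M)) for k in range(M)]

def idct2(X):
    """Inverse transform via the transpose, valid because the basis is orthonormal."""
    M = len(X)
    a = dct2_basis(M)
    return [sum(a[n][k] * X[k] for k in range(M)) for n in range(M)]

x = [1.0, 2.0, 3.0, 4.0]   # arbitrary example block, M = 4
X = dct2(x)
x_rec = idct2(X)           # should match x up to rounding error
```

Because the basis is orthonormal, the inverse needs no separate scaling: the same matrix is applied in transposed form.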
The DCT can be used for convolution and correlation, because it satisfies a modified shift property. Typical uses of the DCT are in transform coding, spectral analysis, and frequency-domain adaptive filtering.
An alternative transform for spectral analysis is the discrete cosine transform, type IV (DCT-IV). The DCT-IV is obtained by shifting the frequencies of the DCT basis functions in eqn. (A) by π/2M, in the form:

$$a_{nk} = \sqrt{\frac{2}{M}}\,\cos\!\left[\left(n+\frac{1}{2}\right)\left(k+\frac{1}{2}\right)\frac{\pi}{M}\right]$$
Unlike in the DCT, the scaling factor is identical for all basis functions. It should be noted that the DCT-IV basis functions have a frequency shift when compared to the DCT basis. Nevertheless, both transforms still lead to orthogonal bases.
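The orthogonality of the DCT-IV basis can be verified with a short numerical check. This is a sketch under the assumption that every basis function carries the scaling factor sqrt(2/M); the helper names and the block size M = 8 are illustrative.

```python
import math

def dct4_basis(M):
    """Return the M x M matrix a[n][k] of DCT-IV basis functions."""
    return [[math.sqrt(2.0 / M) * math.cos((n + 0.5) * (k + 0.5) * math.pi / M)
             for k in range(M)] for n in range(M)]

def gram(a):
    """Inner products between all pairs of basis functions (columns of a)."""
    M = len(a)
    return [[sum(a[n][j] * a[n][k] for n in range(M)) for k in range(M)]
            for j in range(M)]

M = 8
G = gram(dct4_basis(M))
# Largest deviation of the Gram matrix from the identity matrix.
max_dev = max(abs(G[j][k] - (1.0 if j == k else 0.0))
              for j in range(M) for k in range(M))
```

A value of `max_dev` at the level of rounding error indicates that the basis functions are mutually orthogonal and have unit norm.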
The DCT and DCT-IV are useful tools for frequency-domain signal decomposition. However, they suffer from blocking artifacts. In typical applications, the transform coefficients X(k) are processed in some desired way: quantization, filtering, noise reduction, etc. Reconstructed signal blocks are obtained by applying the inverse transform to such modified coefficients. When such reconstructed signal blocks are pasted together to form the reconstructed signal (e.g. a decoded audio or video signal), there will be discontinuities at the block boundaries.
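The blocking effect can be illustrated with a toy example. In the sketch below, coefficient processing is approximated by keeping only the first DCT coefficient of each block, which is a deliberately crude stand-in for quantization. The ramp signal, block size, and function names are assumptions made for illustration.

```python
import math

def dct2_basis(M):
    """DCT-II basis a[n][k] with orthonormal scaling."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    return [[c(k) * math.sqrt(2.0 / M) * math.cos((n + 0.5) * k * math.pi / M)
             for k in range(M)] for n in range(M)]

def code_block(x, keep):
    """Transform one block, zero all but the first `keep` coefficients, invert."""
    M = len(x)
    a = dct2_basis(M)
    X = [sum(a[n][k] * x[n] for n in range(M)) for k in range(M)]
    X = [X[k] if k < keep else 0.0 for k in range(M)]
    return [sum(a[n][k] * X[k] for k in range(M)) for n in range(M)]

M = 4
signal = [float(n) for n in range(2 * M)]   # smooth ramp 0, 1, ..., 7

# Process each block independently and paste the results together.
recon = []
for start in range(0, len(signal), M):
    recon.extend(code_block(signal[start:start + M], keep=1))

jump_original = signal[M] - signal[M - 1]   # step across the boundary before coding
jump_recon = recon[M] - recon[M - 1]        # step across the boundary after coding
```

Each reconstructed block collapses to its mean value, so the step across the block boundary grows from 1 in the original ramp to 4 in the reconstruction. That jump is the kind of boundary discontinuity described above.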
The modulated lapped transform (MLT) eliminates such discontinuities. The MLT is a particular form of a cosine-modulated filter bank that allows for perfect reconstruction. That is, a signal can be recovered exactly from its MLT coefficients. Also, the MLT does not have blocking artifacts: each reconstructed block decays smoothly to zero at its boundaries, avoiding discontinuities along block boundaries. In addition, the MLT has almost optimal performance for transform coding of a wide variety of signals. Because of these properties, the MLT is used in many applications, such as modern audio and video coding systems, including Dolby AC-3, MPEG-2 Layer III, and others.
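The perfect reconstruction property can be demonstrated numerically. This section does not state the MLT basis functions, so the sketch below assumes a commonly used form: a sine window h(n) = sin[(n + 1/2)π/2M] of length 2M, the scaling factor sqrt(2/M), and the phase offset (M + 1)/2. The test signal and all function names are likewise illustrative assumptions.

```python
import math

def mlt_basis(M):
    """p[n][k] for n = 0..2M-1, k = 0..M-1, with an assumed sine window."""
    p = []
    for n in range(2 * M):
        h = math.sin((n + 0.5) * math.pi / (2 * M))
        p.append([h * math.sqrt(2.0 / M)
                  * math.cos((n + (M + 1) / 2.0) * (k + 0.5) * math.pi / M)
                  for k in range(M)])
    return p

def mlt_analysis(x, M):
    """M real coefficients per hop of M samples; input blocks overlap by M."""
    p = mlt_basis(M)
    return [[sum(p[n][k] * x[start + n] for n in range(2 * M)) for k in range(M)]
            for start in range(0, len(x) - 2 * M + 1, M)]

def mlt_synthesis(blocks, M, length):
    """Inverse transform of each block followed by overlap-add."""
    p = mlt_basis(M)
    y = [0.0] * length
    for b, X in enumerate(blocks):
        for n in range(2 * M):
            y[b * M + n] += sum(p[n][k] * X[k] for k in range(M))
    return y

M = 4
x = [math.sin(0.7 * n) + 0.1 * n for n in range(5 * M)]   # arbitrary test signal
y = mlt_synthesis(mlt_analysis(x, M), M, len(x))

# Only samples covered by two overlapping blocks are fully reconstructed.
interior_error = max(abs(x[n] - y[n]) for n in range(M, len(x) - M))
```

The first and last M samples are covered by a single block and are therefore excluded from the comparison. For the interior samples, the aliasing terms of adjacent blocks cancel in the overlap-add step, so `interior_error` is at the level of rounding error.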
However, one disadvantage of the MLT for some applications is that its transform coefficients are real, and so they do not explicitly carry phase information. In some multimedia applications, such as audio processing, complex subbands are typically needed by noise reduction devices (for example, via spectral subtraction) and by acoustic echo cancellation devices. Digital audio representations are commonplace in such audio processing applications. For example, music compact discs (CDs), Internet audio clips, satellite television, digital video discs (DVDs), and telephony (wired or cellular) rely on digital audio techniques.
Digital representation of an audio signal is achieved by converting the analog audio signal into a digital signal with an analog-to-digital (A/D) converter. The digital representation can then be encoded, compressed, stored, transferred, utilized, etc. The digital signal can then be converted back to an analog signal with a digital-to-analog (D/A) converter, if desired. The A/D and D/A converters sample the analog signal periodically, usually at one of the following standard frequencies: 8 kHz for telephony, Internet, and videoconferencing; 11.025 kHz for Internet and CD-ROMs; 16 kHz for videoconferencing, long-distance audio broadcasting, Internet, and future telephony; 22.05 kHz for CD-ROMs and Internet; 32 kHz for CD-ROMs, videoconferencing, and ISDN audio; 44.1 kHz for audio CDs; and 48 kHz for studio audio production.
Typically, if the audio signal is to be encoded or compressed after conversion, raw bits produced by the A/D converter are formatted at 16 bits per audio sample. For audio CDs, for example, the raw bit rate is 44.1 kHz × 16 bits/sample = 705.6 kbps (kilobits per second). For telephony, the raw rate is 8 kHz × 8 bits/sample = 64 kbps. For audio CDs, where the storage capacity is about 700 megabytes (5,600 megabits), the raw bits can be stored, and there is no need for compression. MiniDiscs, however, can only store about 140 megabytes, and so a compression of about 4:1 is necessary to fit 30 min to 1 hour of audio in a 2.5″ MiniDisc.
For Internet telephony and most other applications, the raw bit rate is too high for most current channel capacities. As such, an efficient encoder/decoder (commonly referred to as coder/decoder, or codec) with good compression is used. For example, for Internet telephony, the raw bit rate is 64 kbps, but the desired channel rate varies between 5 and 10 kbps. Therefore, a codec needs to compress the bit rate by a factor of roughly 5 to 15, with minimum loss of perceived audio signal quality.
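The bit-rate arithmetic in the two preceding paragraphs can be reproduced in a few lines. This is a simple sketch; the function and variable names are illustrative, and the figures are the ones quoted in the text.

```python
def raw_bit_rate_kbps(sample_rate_khz, bits_per_sample):
    """Raw bit rate in kilobits per second: kHz times bits per sample."""
    return sample_rate_khz * bits_per_sample

cd_rate = raw_bit_rate_kbps(44.1, 16)       # audio CD figure from the text
telephony_rate = raw_bit_rate_kbps(8, 8)    # telephony figure from the text

# Compression factor needed to fit the telephony rate into the
# desired channel rates of 10 kbps and 5 kbps.
factor_at_10_kbps = telephony_rate / 10
factor_at_5_kbps = telephony_rate / 5
```

With these inputs, the required compression factors come out to 6.4 and 12.8, which fall inside the approximate range quoted above.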
With the recent advances in processing chips, codecs can be implemented either in dedicated hardware, typically with programmable digital signal processor (DSP) chips, or in software in a general-purpose computer. Currently, commercial systems use many different digital audio technologies. Some examples include: ITU-T standards: G.711, G.726, G.722, G.728, G.723.1, and G.729; other telephony standards: GSM, half-rate GSM, cellular CDMA (IS-733); high-fidelity audio: Dolby AC-2 and AC-3, MPEG LII and LIII, Sony MiniDisc; Internet audio: ACELP-Net, DolbyNet, PictureTel Siren, RealAudio; and military applications: LPC-10 and USFS-1016 vocoders.
It is desirable to have codecs that achieve low computational complexity and exhibit robustness to signal variations, so that the codec can handle a wider range of signals (clean speech, noisy speech, multiple talkers, music, etc.) without unduly compromising performance. Therefore, what is needed is a new audio processing system that integrates an acoustic echo cancellation device and a noise reducer with a codec for improving performance, reducing computational complexity, and reducing memory usage and processing delay. Whatever the merits of the above-mentioned systems and methods, they do not achieve the benefits of the present invention.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention is embodied in a system and method for performing spectral analysis of a digital signal having a discrete duration. The present invention performs spectral analysis by spectrally decomposing the digital signal at predefined frequencies uniformly distributed over a sampling frequency interval into complex frequency coefficients so that magnitude and phase information at each frequency is immediately available.
Namely, the system of the present invention produces a modulated complex lapped transform (MCLT) and includes real and imaginary window processors and real and imaginary transform processors. Each window processor has window functions and operators. The real window processor receives the input signal as sample blocks and applies and computes butterfly coefficients for the real part of the signal to produce resulting real vectors. The imaginary window processor receives the input signal as sample blocks and applies and computes butterfly coefficients for the imaginary part of the signal to produce resulting imaginary vectors. The real transform processor computes a spatial transform on the real vectors to produce a real transform coefficient for the MCLT. The imaginary transform processor computes a spatial transform on the imaginary vectors to produce an imaginary transform coefficient for the MCLT.
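The benefit of complex coefficients can be illustrated with a direct evaluation. This section does not give the MCLT formula, so the sketch below rests on stated assumptions: the real part uses cosine modulation with a sine window, the imaginary part uses the same window with sine modulation and a negative sign, and both share the scaling factor sqrt(2/M). The direct double loop is chosen for clarity and is not the window-processor and butterfly structure described above. All names are hypothetical.

```python
import math

def mclt_block(x, M):
    """Complex coefficients for one block of 2M samples (direct evaluation)."""
    X = []
    for k in range(M):
        re = 0.0
        im = 0.0
        for n in range(2 * M):
            h = math.sin((n + 0.5) * math.pi / (2 * M))    # assumed sine window
            arg = (n + (M + 1) / 2.0) * (k + 0.5) * math.pi / M
            scale = h * math.sqrt(2.0 / M)
            re += x[n] * scale * math.cos(arg)   # real (cosine-modulated) part
            im -= x[n] * scale * math.sin(arg)   # imaginary part, sign is an assumption
        X.append(complex(re, im))
    return X

M, k0 = 8, 3
w0 = (k0 + 0.5) * math.pi / M    # center frequency of subband k0

def tone(phase):
    """Sinusoid at the center of subband k0 with the given phase."""
    return [math.cos(w0 * (n + (M + 1) / 2.0) + phase) for n in range(2 * M)]

X_a = mclt_block(tone(0.0), M)[k0]           # tone aligned with the cosine basis
X_b = mclt_block(tone(math.pi / 2), M)[k0]   # same tone shifted by 90 degrees
```

For the shifted tone, the real part alone nearly vanishes, while the magnitude `abs(X_b)` stays close to `abs(X_a)`. This is the sense in which a complex coefficient makes both magnitude and phase available at each frequency.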
In addition, the system can include an inverse transform module for inverse transformation of the encoded output. The inverse transform module can include inverse real and imaginary transform processors and real and imaginary inverse window processors, which are the exact inverses of the real and imaginary transform processors and the real and imaginary window processors, respectively. The encoded output is received and processed by the inverse real and imaginary transform processors, and then received and processed by the real and imaginary inverse window processors to produce an output signal that substantially matches the input signal.
The foregoing and still further features and advantages of the present invention as well as a more complete understanding thereof will be made apparent from a study of the following detailed description of the invention in connection with the accompanying drawings and appended claims.