The present invention relates to digital signal processing, and more particularly to audio frequency bandwidth expansion.
Audio signals sometimes suffer from inferior sound quality because their bandwidths have been limited by the channel or media capacity of transfer and storage systems. For example, cut-off frequencies are set at about 20 kHz for CD, 16 kHz for MP3, 15 kHz for FM radio, and even lower for other audio systems whose data rate capabilities are poorer. At playback time, it is beneficial to recover the high frequency components that have been discarded in such systems. This processing is equivalent to expanding the bandwidth of an audio signal, so it can be called bandwidth expansion (BWE); see FIG. 2a. One approach to realizing BWE is to first perform a fast Fourier transform (FFT) on the band-limited signal, shift the spectrum towards high frequencies, add the high frequency portion of the shifted spectrum to the unmodified spectrum above the cut-off frequency, and then perform an inverse FFT (IFFT). The third operation can be understood as weighting the frequency-shifted spectrum with zero below the cut-off frequency and then adding it to the unmodified spectrum; see FIG. 2c. The problem with this method is that the plain frequency domain weighting causes time domain aliasing, which can lead to perceptual distortion. A possible way to ease this problem is to apply overlap-add methods. However, these methods are incapable of completely suppressing the aliasing.
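The frequency domain approach described above can be illustrated with a minimal sketch for a single block of samples. The function names (`dft`, `idft_real`, `expand_bandwidth`), the bin-index parameters, and the use of a naive DFT in place of an FFT are assumptions made for illustration only; the block length is assumed to be even, and block-to-block processing (where the time domain aliasing arises) is not shown.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (stands in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def expand_bandwidth(x, cutoff_bin, shift_bins):
    """Add a frequency-shifted copy of the spectrum above the cut-off bin.

    cutoff_bin and shift_bins are hypothetical parameters: the highest
    occupied bin of the band-limited input and the upward shift, in bins.
    """
    n = len(x)
    spec = dft(x)
    out = list(spec)                      # unmodified spectrum
    for k in range(cutoff_bin + 1, n // 2):   # bins above cut-off, below Nyquist
        src = k - shift_bins              # bin that lands on k after the shift
        if 0 < src <= cutoff_bin:
            out[k] += spec[src]           # weight 1 above the cut-off, 0 below
            out[n - k] = out[k].conjugate()   # mirror bin keeps the output real
    return idft_real(out)
```

For instance, with a 32-sample block containing a sinusoid at bin 3, calling `expand_bandwidth(x, cutoff_bin=4, shift_bins=4)` adds a copy of that component at bin 7 while leaving the original component unchanged.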
On the other hand, time domain processing for BWE has been proposed in which high frequency components are synthesized by using amplitude modulation (AM) and extracted by using a high-pass filter. This system performs the core part of high frequency synthesis in the time domain and is free of time domain aliasing. Another feature of this system is that it estimates the cut-off frequency of the input signal, so that the modulation amount and the cut-off frequency of the high-pass filter can be determined at run-time depending on the input signal. BWE algorithms work most efficiently when the cut-off frequency is known beforehand. However, the cut-off frequency varies depending on signal content, bit-rate, codec, and encoder used, and it can vary over time even within a single stream. Hence, a run-time cut-off frequency estimator, as shown in FIG. 2d, is desired in order for BWE algorithms to adaptively synthesize the high frequency components that were cut off at a time-varying frequency. To estimate the cut-off frequency, one known method applies an FFT to a section of the input signal and identifies the cut-off frequency as the highest frequency contained in the signal; namely, it seeks the highest frequency at which the spectrum crosses a predefined threshold. This method is very simple, but a small threshold is susceptible to noise and a large threshold fails for small input signals. Another problem is that, even if there is no real cut-off in the input spectrum, the simple method would identify an inappropriate frequency as the cut-off frequency. Consider the case where the spectrum gradually declines toward the Nyquist frequency and crosses the threshold at a certain frequency. BWE algorithms will then generate unwanted high frequencies over the already existing high frequency components of the input signal, which could result in audible distortion.
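The simple threshold-based estimator described above, and its sensitivity to input level, can be sketched as follows. The function names, the magnitude normalization by block length, and the use of a naive DFT instead of an FFT are assumptions for illustration only.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of bins 0..n/2 of a naive DFT, normalized by block length."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2 + 1)]

def estimate_cutoff_hz(x, sample_rate, threshold):
    """Return the highest frequency whose magnitude exceeds the threshold.

    Returns None when no bin exceeds the threshold, which is what happens
    when a fixed threshold is too large for a small input signal.
    """
    mags = magnitude_spectrum(x)
    for k in range(len(mags) - 1, -1, -1):    # search downward from Nyquist
        if mags[k] > threshold:
            return k * sample_rate / len(x)
    return None
```

With a fixed threshold, the same signal scaled down by a large factor yields no estimate at all, because the whole spectrum falls below the threshold. Conversely, a spectrum that merely declines toward the Nyquist frequency would still produce some crossing frequency, even though no real cut-off exists.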
Another bandwidth problem occurs at low frequencies: bass loudspeakers installed in electric appliances such as flat panel TVs, mini-components, multimedia PCs, portable media players, cell-phones, and so on cannot reproduce bass frequencies efficiently due to their limited dimensions relative to low frequency wavelengths. With such loudspeakers, the reproduction efficiency starts to degrade rapidly from about 100-300 Hz depending on the loudspeaker, and almost no sound is excited below 40-100 Hz; see FIG. 2f. To compensate for the degradation of the bass frequencies, various kinds of equalization techniques are widely used in practice. Although equalization can help reproduce the original bass sound, the amplifier gain for the bass frequencies may be excessively high. As a result, it could overdrive the loudspeaker, which may cause non-linear distortion. Also, the dynamic range of the equalized signal would become too wide for digital representation with finite word length. Another technique for bass enhancement is to invoke a perception of the bass frequencies by using a psycho-acoustic effect, the so-called “missing fundamental”. According to this effect, the human brain perceives the tone of a missing fundamental frequency when its higher harmonics are detected. Hence, by generating higher harmonics, one can give the perception of bass frequencies with loudspeakers that are incapable of reproducing them. The missing fundamental effect, however, gives only a “pseudo tone” of the fundamental frequency, and overuse of the effect over a wide range of frequencies leads to unnatural or unpleasant sound. As for harmonics generation, various techniques are known in the literature: rectification, clipping, polynomials, non-linear gain, modulation, multiplicative feedback loops, and so on. In most cases, since those techniques are based on non-linear operations, an envelope estimator that obtains the input signal level is desired in order to generate harmonics efficiently.
For example, when clipping a signal, the clipping threshold is critical to the amount of harmonics generated. Consider the case where the threshold is fixed regardless of the input signal. Then the amount of harmonics will be zero or insufficient for small input signals, and excessive for large input signals.
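The dependence of clipping on input level can be illustrated with a minimal sketch that compares a fixed threshold with a threshold that follows an estimated envelope. The peak-follower envelope estimator, the function names, and the `ratio` and `release` parameters are assumptions chosen for illustration, not a specific estimator from the literature.

```python
import math

def clip(x, threshold):
    """Hard-clip every sample to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, s)) for s in x]

def peak_envelope(x, release=0.999):
    """Simple peak follower: instant attack, exponential release."""
    env, level = [], 0.0
    for s in x:
        level = max(abs(s), level * release)
        env.append(level)
    return env

def adaptive_clip(x, ratio=0.5, release=0.999):
    """Clip at a fixed fraction of the estimated envelope."""
    env = peak_envelope(x, release)
    return [max(-ratio * e, min(ratio * e, s)) for s, e in zip(x, env)]

# A small and a large version of the same sinusoid.
small = [0.1 * math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
large = [1.0 * math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]

fixed_small = clip(small, 0.5)        # never reaches the threshold: no harmonics
fixed_large = clip(large, 0.5)        # clipped
adapt_small = adaptive_clip(small)    # clipped in proportion to its own level
adapt_large = adaptive_clip(large)    # same relative amount of clipping
```

With the fixed threshold, the small signal passes through unchanged and therefore gains no harmonics, whereas with the envelope-following threshold both signals are clipped by the same relative amount.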