The present invention relates to a system and method for adaptively enhancing the frequency response of a speech signal in real time. A speech signal received at a microphone and input to an audio application may be adversely impacted by slowly varying or time-invariant acoustical or electrical characteristics of the acoustical environment or the electrical audio path. For example, in a hands-free telephone system in an automobile, the in-car acoustics or the microphone characteristics can have a significant detrimental impact on the sound quality or intelligibility of a speech signal transmitted to a remote party.
Adjusting the spectral shape of a received speech signal can significantly improve the quality of the speech signal. For example, the spectral shape of a speech signal may be adjusted to compensate for excessive background noise. By boosting the signal in frequency ranges where speech content is prevalent while attenuating the signal in frequency ranges where background noise predominates, the overall sound quality or intelligibility of the signal can be significantly improved. In other applications it may be desirable to boost or attenuate different frequency ranges. For example, the ideal spectral shape for a hands-free telephone system may be significantly different from the ideal spectral shape for a speech recognition system. In the first case, it is desirable to improve both sound quality and intelligibility. In the second case, it may be more desirable to improve the intelligibility of the speech signal with little or no regard to the actual sound quality.
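By way of illustration only, the boost-and-attenuate idea described above can be sketched per frequency band. The function name `shape_spectrum`, the per-band SNR input, and the threshold, boost, and cut values are all hypothetical and chosen for the example; they are not taken from the specification.

```python
def shape_spectrum(band_levels_db, band_snr_db,
                   snr_threshold_db=6.0, boost_db=3.0, cut_db=6.0):
    """Boost bands where speech predominates, attenuate bands where noise does.

    All parameters are illustrative assumptions: a band is treated as
    speech-dominated when its estimated SNR meets snr_threshold_db.
    """
    shaped = []
    for level, snr in zip(band_levels_db, band_snr_db):
        if snr >= snr_threshold_db:
            shaped.append(level + boost_db)   # speech-dominated band: boost
        else:
            shaped.append(level - cut_db)     # noise-dominated band: attenuate
    return shaped


# Four example bands (levels and SNRs in dB, made-up values).
levels = [60.0, 55.0, 50.0, 45.0]
snrs = [2.0, 12.0, 15.0, 1.0]
print(shape_spectrum(levels, snrs))  # [54.0, 58.0, 53.0, 39.0]
```

A hard threshold is used here only to keep the sketch short. A practical system would more likely vary the gain smoothly with the SNR, and would choose different settings for telephony than for speech recognition.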
FIG. 1 shows two examples of desirable frequency responses for two different applications. The first frequency response curve 10 represents a spectral shape intended to provide optimal speech quality in an environment with a high signal-to-noise ratio (SNR). The second frequency response curve 12 shows a spectral shape intended to provide optimal speech intelligibility in a low signal-to-noise environment. FIG. 1 also shows VDA (Verband der Automobilindustrie) and ITU (International Telecommunication Union) upper and lower spectral limits 14, 16 for the frequency response in hands-free telephony systems. In some cases it may also be desirable to adjust the spectral shape of a received speech signal to conform with the VDA and ITU limits for speech frequency response.
Typically, a speech signal recorded by a microphone and input to an audio application will have an actual spectral shape significantly different from the ideal spectral shape for the application. Accordingly, adjusting the spectrum of the speech signal to more closely conform to the ideal spectral shape is desirable. A system and method for performing such an adjustment, or normalization, must be capable of taking into account the acoustic transfer function characteristics of the environment in which the speech signal is recorded, and the frequency response of the electrical audio path. Furthermore, such a system and method must also take into account acoustic and electrical changes that may occur in the environment or the audio path.
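One way such a normalization might be approached is sketched below, purely as an assumption and not as the method of the invention. A slowly updated average of the observed speech spectrum is compared, band by band, with a target spectral shape, and the difference is limited to a maximum correction. The names `update_average` and `correction_gains`, the smoothing constant `alpha`, and the limit `max_gain_db` are hypothetical.

```python
def update_average(avg_db, frame_db, alpha=0.05):
    """Leaky average of per-band speech levels (dB).

    A small alpha makes the average follow slow acoustic or electrical
    changes while ignoring frame-to-frame variation. alpha is an assumed
    tuning value.
    """
    return [(1.0 - alpha) * a + alpha * f for a, f in zip(avg_db, frame_db)]


def correction_gains(avg_db, target_db, max_gain_db=12.0):
    """Per-band gain (dB) moving the average spectrum toward the target.

    The gain is clamped to +/- max_gain_db (an assumed limit) so that a
    poor estimate cannot produce an extreme boost or cut.
    """
    return [max(-max_gain_db, min(max_gain_db, t - a))
            for a, t in zip(avg_db, target_db)]


# Made-up three-band example: the running average adapts over frames,
# then the correction toward the target shape is computed.
average = [50.0, 50.0, 50.0]
target = [55.0, 60.0, 50.0]
for frame in ([52.0, 40.0, 70.0], [51.0, 42.0, 68.0]):
    average = update_average(average, frame)
print(correction_gains(average, target))
```

Because the average is updated continuously, the correction follows gradual changes in the environment or the audio path without any explicit recalibration step.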