1. Field of the Invention
The present invention relates to electronic hearing devices and electronic systems for sound reproduction. More particularly, the present invention relates to noise reduction to preserve the fidelity of signals in electronic hearing aid devices and other electronic sound systems. According to the present invention, the noise reduction devices and methods utilize digital signal processing techniques.
The current invention can be used in any speech communication device where speech is degraded by additive noise. Without limitation, applications of the present invention include hearing aids, telephones, assistive listening devices, and public address systems.
2. The Background Art
This invention relates generally to the field of enhancing speech degraded by additive noise as well as its application in hearing aids when only one microphone input is available for processing. The speech enhancement refers specifically to the field of improving perceptual aspects of speech, such as overall sound quality, intelligibility, and degree of listener fatigue.
Background noise is usually an unwanted signal when attempting to communicate via spoken language. Background noise can be annoying, and can even degrade speech to a point where it cannot be understood. The undesired effects of interference due to background noise are heightened in individuals with hearing loss. As is known to those skilled in the art, one of the first symptoms of a sensorineural hearing loss is increased difficulty understanding speech when background noise is present.
This problem has been investigated by estimating the Speech Reception Threshold (“SRT”), which is the speech-to-noise ratio required to achieve a 50% correct recognition level, usually measured using lists of single-syllable words. In most cases, hearing impaired people require a better speech-to-noise ratio in order to understand the same amount of information as people with normal hearing, depending on the nature of the background noise.
Hearing aids, which are one of the only treatments available for the loss of sensitivity associated with a sensorineural hearing loss, traditionally offer little benefit to the hearing impaired in noisy situations. However, as is known to those skilled in the art, hearing aids have been improved dramatically in the last decade, most recently with the introduction of several different kinds of digital hearing aids. These digital hearing aids employ advanced digital signal processing technologies to compensate for the hearing loss of the hearing impaired individual.
However, as is known to those skilled in the art, most digital hearing aids still do not completely solve the problem of hearing in noise. In fact, they can sometimes aggravate hearing difficulties in noisy environments. One of the benefits of modern hearing aids is the use of compression circuitry to map the range of sound associated with normal loudness into the reduced dynamic range associated with a hearing loss. The compression circuitry acts as a nonlinear amplifier and applies more gain to soft signals and less gain to loud signals so that hearing impaired individuals can hear soft sounds while keeping loud sounds from becoming too loud and causing discomfort or pain. However, one of the consequences of this compression circuitry is to reduce the signal-to-noise ratio (“SNR”). As more compression is applied, the signal-to-noise ratio is further degraded. In addition, amplification of soft sounds may make low-level circuit noise audible and annoying to the user.
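By way of illustration only, the SNR penalty of compression can be worked through numerically. The following sketch assumes a simple static compressor curve and assumes that the compressor reacts quickly enough to act on speech peaks and on noise during speech pauses separately. The function name, threshold, ratio, gain, and input levels are all hypothetical and are not taken from any particular hearing aid.

```python
def compressor_output_db(input_db, threshold_db=40.0, ratio=2.0, gain_db=20.0):
    """Static input/output curve of a simple compressor (all levels in dB).

    Below the threshold the device applies a fixed gain; above it, each
    additional dB of input yields only 1/ratio dB of additional output.
    """
    if input_db <= threshold_db:
        return input_db + gain_db
    return threshold_db + gain_db + (input_db - threshold_db) / ratio


# Hypothetical input levels: speech peaks and noise during speech pauses.
speech_in_db = 65.0
noise_in_db = 45.0

snr_in_db = speech_in_db - noise_in_db
snr_out_db = compressor_output_db(speech_in_db) - compressor_output_db(noise_in_db)

print(snr_in_db, snr_out_db)  # prints: 20.0 10.0
```

Under these assumptions a 2:1 compression ratio halves the level difference between speech and noise, and a higher ratio would shrink it further.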
As is known to those skilled in the art, the general field of noise reduction, i.e., the enhancement of speech degraded by additive noise, has received considerable attention in the literature since the mid-1970s. The main objective of noise reduction is ultimately to improve one or more perceptual aspects of speech, such as overall quality, intelligibility, or degree of listener fatigue.
Noise reduction techniques can be divided into two major categories, depending on the number of input signal sources. Noise reduction using multi-input signal sources requires using more than one microphone or other input transducer to obtain the reference input for speech enhancement or noise cancellation. However, use of multi-microphone systems is not always practical in hearing aids, especially small, custom devices that fit in or near the ear canal. The same is true for many other small electronic audio devices such as telephones and assistive listening devices.
Noise reduction using only one microphone is more practical for hearing aid applications. However, it is very difficult to design a noise reduction system with high performance, since the only information available to the noise reduction circuitry is the noisy speech contaminated by the additive background noise. To further aggravate the situation, the background noise may itself be speech-like, such as in an environment with competing speakers (e.g., a cocktail party).
Various noise reduction schemes have been investigated, such as spectral subtraction, Wiener filtering, maximum likelihood, and minimum mean square error processing. Spectral subtraction is computationally efficient and robust as compared to other noise reduction algorithms. As is known to those skilled in the art, the fundamental idea of spectral subtraction entails subtracting an estimate of the noise power spectrum from the noisy speech power spectrum. Several publications concerning spectral subtraction techniques based on short-time spectral amplitude estimation have been reviewed and compared in Jae S. Lim and Alan V. Oppenheim, “Enhancement and Bandwidth Compression of Noisy Speech,” PROC. IEEE, Vol. 67, No. 12, pp. 1586-1604, December 1979.
However, as is known to those skilled in the art, there are drawbacks to these spectral subtraction methods, in that a very unpleasant residual noise remains in the processed signal (in the form of musical tones), and in that speech is perceptually distorted. Since the review of the literature mentioned above, some modified versions of spectral subtraction have been investigated in order to reduce the residual noise. This is described in SAEED V. VASEGHI, ADVANCED SIGNAL PROCESSING AND DIGITAL NOISE REDUCTION (John Wiley and Sons Ltd., 1996).
According to these modified approaches, the noisy received audio signal may be modeled in the time domain by the equation:
x(t)=s(t)+n(t),
where x(t), s(t) and n(t) are the noisy signal, the original signal, and the additive noise, respectively. In the frequency domain, the noisy signal can be expressed as:
X(ƒ)=S(ƒ)+N(ƒ),
where X(ƒ), S(ƒ), and N(ƒ) are the Fourier transforms of the noisy signal, of the original signal, and of the additive noise, respectively. Then, the equation describing spectral subtraction techniques may be generalized as:
|Ŝ(ƒ)|=|H(ƒ)|·|X(ƒ)|,
where |Ŝ(ƒ)| is an estimate of the original signal spectrum |S(ƒ)|, and |H(ƒ)| is a spectral gain or weighting function for adjustment of the noisy signal magnitude spectrum. As is known to those skilled in the art, the magnitude response |H(ƒ)| is defined by:
|H(ƒ)|=G(R(ƒ))=[1−μ(R(ƒ))^α]^β,
R(ƒ)=|N̂(ƒ)|/|X(ƒ)|,
where N̂(ƒ) is the estimated noise spectrum. Throughout this document, the signal-to-noise ratio (“SNR”) is defined as the reciprocal of R(ƒ). For magnitude spectral subtraction techniques, the exponents used in the above set of equations are α=1, β=1, μ=1, and for power spectral subtraction techniques, the exponents used are α=2, β=0.5, μ=1. The parameter μ controls the amount of noise subtracted from the noisy signal. For full noise subtraction, μ=1, and for over-subtraction, μ>1.
The spectral subtraction technique yields an estimate only for the magnitude of the speech spectrum S(ƒ), and the phase is not processed. That is, the estimate for the spectral phase of the speech is obtained from the noisy speech, i.e., arg[Ŝ(ƒ)]=arg[X(ƒ)].
Due to the random variations in the noise spectrum, spectral subtraction may produce negative estimates of the power or magnitude spectrum. In addition, very small variations in SNR close to 0 dB may cause large fluctuations in the spectral subtraction amount. In fact, the residual noise introduced by the variation or erroneous estimates of the noise magnitude can become so annoying that one might prefer the unprocessed noisy speech signal over the spectrally subtracted one.
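The generalized gain function and the reuse of the noisy phase described above may be sketched as follows. This is a minimal illustration and not a definitive implementation: the function names are hypothetical, the spectrum is assumed to be supplied as a list of complex bins for one analysis frame, and clamping negative estimates to zero (half-wave rectification) is an assumption made here to deal with the negative estimates noted above.

```python
def spectral_gain(noise_mag, noisy_mag, alpha=2.0, beta=0.5, mu=1.0):
    """Return |H(f)| = [1 - mu * R(f)**alpha]**beta for one frequency bin.

    Defaults correspond to power spectral subtraction (alpha=2, beta=0.5,
    mu=1); alpha=1, beta=1 gives magnitude spectral subtraction.
    """
    if noisy_mag <= 0.0:
        return 0.0
    r = noise_mag / noisy_mag            # R(f) = |N^(f)| / |X(f)|
    base = 1.0 - mu * r ** alpha
    # Assumption: a negative estimate is clamped to zero before the root.
    return max(base, 0.0) ** beta


def subtract_frame(noisy_spectrum, noise_mags, **params):
    """Apply the gain to every bin of one frame.

    Multiplying a complex bin by a real, non-negative gain changes only
    its magnitude, so the output keeps the phase of the noisy input.
    """
    return [spectral_gain(n, abs(x), **params) * x
            for x, n in zip(noisy_spectrum, noise_mags)]
```

For example, `subtract_frame([3 + 4j], [2.5], alpha=1.0, beta=1.0)` halves the magnitude of the single bin (R = 0.5) and leaves its phase unchanged.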
To reduce the effect of residual noise, various methods have been investigated. For example, Berouti et al. (in M. Berouti, R. Schwartz, and J. Makhoul, “Enhancement of Speech Corrupted by Additive Noise,” in Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, pp. 208-211, April 1979) suggested the use of a “noise floor” to limit the amount of reduction. Using a noise floor is equivalent to keeping the magnitude of the transfer function or gain above a certain threshold. Boll (in S. F. Boll, “Reduction of Acoustic Noise in Speech Using Spectral Subtraction,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, pp. 113-120, April 1979) suggested magnitude averaging of the noisy speech spectrum. Soft-decision noise reduction filtering (see, e.g., R. J. McAulay and M. L. Malpass, “Speech Enhancement Using a Soft Decision Noise Reduction Filter,” IEEE Trans. on Acoust., Speech, Signal Proc., vol. ASSP-28, pp. 137-145, April 1980) and optimal Minimum Mean-Square Error (“MMSE”) estimation of the short-time spectral amplitude (see, e.g., Y. Ephraim and D. Malah, “Speech Enhancement Using a Minimum Mean-square Error Short-time Spectral Amplitude Estimator,” IEEE Trans. on Acoust., Speech, Signal Proc., vol. ASSP-32, pp. 1109-1121, December 1984) have also been introduced for this purpose.
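The equivalence between a noise floor and a lower bound on the gain can be illustrated with a short sketch. The function names and the default floor value are hypothetical, and the sketch shows only the gain-threshold view stated above, not the specific procedure of any of the cited publications.

```python
import math


def floor_gain(gain, floor=0.1):
    """Keep a spectral gain at or above `floor` (a linear amplitude ratio)."""
    return max(gain, floor)


def max_attenuation_db(floor):
    """Largest attenuation, in dB, that a given gain floor permits."""
    return -20.0 * math.log10(floor)
```

With the hypothetical floor of 0.1, no frequency bin is attenuated by more than about 20 dB, so deep spectral nulls that vary randomly from frame to frame are avoided.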
In 1994, Walter Etter (see Walter Etter and George S. Moschytz, “Noise Reduction by Noise-Adaptive Spectral Magnitude Expansion,” J. Audio Eng. Soc., Vol. 42, No. 5, May 1994) proposed a different weighting function for spectral subtraction, which is described by the following equation:
G(R(ƒ))=[A(ƒ)·R(ƒ)]^(1−σ(ƒ)).
The underlying idea of this technique is to adapt the crossover point of the spectral magnitude expansion in each frequency channel based on the noise and gain scale factor A(ƒ), so this method is also called noise-adaptive spectral magnitude expansion. As in the methods described above, the gain is post-processed by averaging or by using a low-pass smoothing filter to reduce the residual noise.
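The expansion-type weighting function may be sketched for a single frequency channel as follows. The function name and default parameter values are hypothetical, and limiting the gain to unity above the crossover point (where A·R = 1) is an assumption of this sketch rather than something stated in the equation above.

```python
def expansion_gain(r, a=2.0, sigma=2.0):
    """G(R) = (A * R) ** (1 - sigma) for one frequency channel.

    r     : R(f), ratio of estimated noise magnitude to noisy magnitude
    a     : A(f), scale factor that sets the crossover point A * R = 1
    sigma : expansion exponent; sigma > 1 attenuates low-SNR input
    """
    if r <= 0.0:
        return 1.0                       # no noise estimate: pass through
    gain = (a * r) ** (1.0 - sigma)
    # Assumption: the gain never exceeds unity above the crossover point.
    return min(gain, 1.0)
```

With these hypothetical defaults, an input at R = 1 (noise estimate equal to the noisy magnitude) is attenuated by a factor of two, while inputs with R at or below 0.5 pass unchanged.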
U.S. Pat. No. 5,794,187 (issued to D. Franklin) discloses another gain or weighting function for spectral subtraction in a broad-band time domain. In that document, the gain transfer function is modeled as:
G=X_rms/(X_rms+α),
where X_rms is the RMS value of the input noisy signal, and α is a constant.
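The behavior of this broad-band gain can be seen in a small sketch. The function names, the block-based RMS calculation, and the default value of the constant are assumptions made for illustration and are not taken from the cited patent.

```python
import math


def rms(samples):
    """Root-mean-square value of one block of time-domain samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def broadband_gain(samples, alpha=0.05):
    """G = X_rms / (X_rms + alpha) for one block of the noisy input."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive in this sketch")
    x_rms = rms(samples)
    return x_rms / (x_rms + alpha)
```

The gain approaches unity when the input RMS level is large compared with the constant and falls toward zero as the input level drops, so low-level input is attenuated more than high-level input.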
Recently, a psychoacoustic masking model has been incorporated in spectral subtraction to reduce residual noise or distortion by finding the best tradeoff between noise reduction and speech distortion. For further information, see N. Virag, “Speech Enhancement Based on Masking Properties of the Auditory System,” Proc. ICASSP, pp. 796-799, 1995; Stefan Gustafsson, Peter Jax and Peter Vary, “A Novel Psychoacoustically Motivated Audio Enhancement Algorithm Preserving Background Noise Characteristics,” Proc. ICASSP, pp. 397-400, 1998; and T. F. Quatieri and R. A. Baxter, “Noise Reduction Based on Spectral Change,” IEEE workshop on Applications of Signal Processing to Audio and Acoustics, 1997.
It is well-known that a human listener will not perceive any additive signals as long as their power spectral density lies completely below the auditory masking threshold. Therefore, complete removal of noise is not necessary in most situations. Referring to the publications mentioned above, N. Virag attempted to adjust the parameters α, β and μ adaptively in the spectral subtraction equation so that the noise was reduced to the masking threshold. Stefan Gustafsson suggested that a perceptually complete removal of noise is neither necessary nor desirable in most situations. In a telephone application, for example, a retained low-level natural sounding background noise will give the far end user a feeling of the atmosphere at the near end and will also avoid the impression of an interrupted transmission. Therefore, noise should only be reduced to an expected amount. In his noise-spectrum subtraction method, the weighting function is chosen in such a way that the difference between the desired and the actual noise level lies exactly at the masking threshold.
Applications of noise reduction in hearing aids have been investigated. Hearing aids are very sensitive to power consumption. Thus, the most challenging problem of noise reduction in hearing aids is the compromise between performance and complexity. In addition, a hearing aid inherently has its own gain adjustment function for hearing loss compensation. Cummins (in U.S. Pat. No. 4,887,299) developed a gain compensation function for both noise reduction and hearing loss compensation, which is a function of the input signal energy envelope. The gain consists of three piecewise linear sections in the decibel domain, including a first section providing expansion up to a first knee point for noise reduction, a second section providing linear amplification, and a third section providing compression to reduce the effect of over-range signals and minimize loudness discomfort to the user. Finally, U.S. Pat. No. 5,867,581 discloses a hearing aid that implements noise reduction by selectively turning on or off the output signal or noisy bands.
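A gain curve of the three-section kind attributed to Cummins above may be sketched in the decibel domain as follows. The function name, knee points, linear gain, and expansion and compression ratios are all hypothetical values chosen for illustration and are not taken from the cited patent.

```python
def three_section_gain_db(level_db, knee1_db=40.0, knee2_db=70.0,
                          linear_gain_db=20.0, expansion_ratio=2.0,
                          compression_ratio=3.0):
    """Gain in dB as a function of the input envelope level in dB.

    Section 1 (below knee1): expansion, so gain falls as the level drops.
    Section 2 (knee1 to knee2): constant gain (linear amplification).
    Section 3 (above knee2): compression, so gain falls as the level rises.
    """
    if level_db < knee1_db:
        return linear_gain_db - (expansion_ratio - 1.0) * (knee1_db - level_db)
    if level_db <= knee2_db:
        return linear_gain_db
    return linear_gain_db - (1.0 - 1.0 / compression_ratio) * (level_db - knee2_db)
```

With these hypothetical values, a 30 dB input (likely noise) receives 10 dB of gain, a 50 dB input receives the full 20 dB, and a 100 dB input receives no gain at all.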
Spectral subtraction for noise reduction is very attractive due to its simplicity, but the residual noise inherent to this technique can be unpleasant and annoying. Hence, various gain or weighting functions G(ƒ), as well as noise estimation methods in spectral subtraction, have been investigated to solve this problem. It appears that the methods which incorporate auditory masking models have been the most successful. However, these algorithms are too complicated to be suitable for application in low-power devices, such as hearing aids. Hence, a new multi-band spectral subtraction scheme is proposed, which differs in its multi-band filter architecture, noise and signal power detection, and gain function. According to the present invention, spectral subtraction is performed in the dB domain. The circuitry and method of the present invention are relatively simple, but still maintain high sound quality.
Thus, it is an object of the present invention to provide a simple spectral subtraction noise reduction technique suitable for use in low-power applications that still maintains high sound quality. These and other features and advantages of the present invention will be presented in more detail in the following specification of the invention and the associated figures.
A multi-band spectral subtraction scheme is proposed, comprising a multi-band filter architecture, noise and signal power detection, and gain function for noise reduction. In one embodiment, the gain function for noise reduction consists of a gain scale function and a maximum attenuation function providing a predetermined amount of gain as a function of signal to noise ratio (“SNR”) and noise. In one embodiment, the gain scale function is a three-segment piecewise linear function, and the three piecewise linear sections of the gain scale function include a first section providing maximum expansion up to a first knee point for maximum noise reduction, a second section providing less expansion up to a second knee point for less noise reduction, and a third section providing minimum or no expansion for input signals with high SNR to minimize distortion. According to embodiments of the present invention, the maximum attenuation function can either be a constant or equal to the estimated noise envelope. The disclosed noise reduction techniques can be applied to a variety of speech communication systems, such as hearing aids, public address systems, teleconference systems, voice control systems, or speaker phones. When used in hearing aid applications, the noise reduction gain function according to aspects of the present invention is combined with the hearing loss compensation gain function inherent to hearing aid processing.
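One possible reading of the three-segment gain scale function, for a single band, is sketched below. This is an illustration under stated assumptions and not the disclosed implementation: the function names, the knee points, and the scale value at the first knee are hypothetical, the maximum attenuation function is taken to be a constant, and forming the gain as the product of the gain scale and the maximum attenuation in the dB domain is an assumption of the sketch.

```python
def gain_scale(snr_db, knee1_db=6.0, knee2_db=18.0, scale_at_knee1=0.4):
    """Three-segment piecewise linear gain scale, from 1.0 down to 0.0.

    Segment 1 (0 dB to knee1): steep slope, i.e. the most expansion.
    Segment 2 (knee1 to knee2): shallower slope, i.e. less expansion.
    Segment 3 (above knee2): zero, i.e. no expansion at high SNR.
    """
    # Assumption: SNR is the noisy-signal level relative to the noise
    # estimate, so values below 0 dB are treated as 0 dB.
    snr_db = max(snr_db, 0.0)
    if snr_db <= knee1_db:
        return 1.0 - (1.0 - scale_at_knee1) * snr_db / knee1_db
    if snr_db <= knee2_db:
        return scale_at_knee1 * (knee2_db - snr_db) / (knee2_db - knee1_db)
    return 0.0


def noise_reduction_gain_db(snr_db, max_attenuation_db=18.0):
    """Noise reduction gain in dB for one band (zero or negative)."""
    # Assumption: the gain scale multiplies a constant maximum attenuation.
    return -gain_scale(snr_db) * max_attenuation_db
```

Under these assumptions a band at 0 dB SNR is attenuated by the full 18 dB, a band at the first knee point by about 7.2 dB, and a band above the second knee point is left unchanged.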