The present invention relates to the field of audio processing and more specifically to systems, devices, and methods to compensate for noise in a listener's environment.
Many systems apply filtering to suppress noise in an audio signal. In most cases, these systems address noise that is present in the audio signal at its origin or that is introduced into the signal through processing and transmission. Various forms of filtering may be applied which suppress the noise signal in whole or in part. Generally, these systems have adverse impacts upon the quality of the original signal. Moreover, these systems do not address noise in the environment of the listener, which cannot be filtered out of the signal.
Conversely, systems for the suppression of noise in the listener's environment also exist. These systems generally use noise cancellation to remove the disrupting external signal by adding sound, projected through headphones, which has the effect of countering the sound waves produced by the noise. In this case, the noise is completely canceled and the listener is generally unaware of the existence of the external noise, a result which can reduce the listener's awareness of potential dangers in the environment.
In some prior art systems, dynamic volume compensation may be used to raise the volume of a source signal of interest over ambient background noise. However, these systems increase the gain in a spectrally uniform manner, raising the volume of all frequency components equally. This effect can distort the perception of music and speech due to the non-linear behavior of the human ear with respect to frequency and volume.
Microphones and mechanical systems (e.g., computer software) can measure dBSPL; to a microphone or mechanical system, a sound (e.g., 40 dBSPL) at a particular frequency (e.g., 1 kHz) sounds just as loud as a sound at the same level (e.g., 40 dBSPL) at a different frequency (e.g., 4 kHz). However, human hearing is affected by the mechanical construction of the outer ear and/or the slow variation in sensitivity across the basilar membrane due to fluid damping of the incident waves in the cochlear fluid. The variable sensitivity of human hearing is reflected in the Fletcher-Munson equal-loudness contours and the equal-loudness contours of the ISO 226:2003 revision (Phons). The equations of the systems of the present invention utilize conversions from dBSPL to Phons and from Phons to dBSPL: incoming sound levels are converted from dBSPL to Phons for use in the equations, and the resulting Phons are subsequently converted to dBSPL for output to speakers and headphones. Conversion from dBSPL to Phons and from Phons to dBSPL is in accordance with the Fletcher-Munson equal-loudness contours and the equal-loudness contours of the ISO 226:2003 revision.
Since the human ear dynamically adjusts to sound intensity levels, the presence of background noise alters the threshold at which sounds begin to be perceived. As a result, ambient noise at a given frequency may render imperceptible sounds at that frequency that would otherwise be perceptible. In order for the sound to be heard, it must be amplified over the background noise. The volume of the ambient noise therefore represents a degree of hearing impairment, or baseline threshold elevation, over which the sound must be amplified to be perceived.
This effect varies according to the spectral composition of the noise; that is, components of the source that are sufficiently far in frequency from the spectral components of the noise will remain perceptible. Consequently, using the total intensity of the background noise to raise the intensity of the source uniformly will overly amplify bands which are not affected, possibly raising the volume to damaging levels. In order to amplify only those components which need compensation, the gains applied to the source signal must vary by spectral band, according to the spectral composition of the noise.
Moreover, due to the nonlinear response of the human ear, using the spectral intensity of the background noise at a particular band as the gain for the source at that band will produce excessive amplification. In order to compute the correct gain, a nonlinear psychoacoustic model must be used to compute an appropriate gain for each frequency. The intensity of the background noise and the intensity of the source signal at a given frequency are the inputs to this model, and the output is a desired gain for the source signal at that frequency.
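The per-band interface described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the model of the present invention: the names `per_band_gains`, `db_to_linear`, and `placeholder_gain_model` are hypothetical, and the placeholder rule inside `placeholder_gain_model` is an arbitrary stand-in chosen only so that the sketch runs. Any psychoacoustic model that maps a source level and a noise level to a gain could be passed in its place.

```python
from typing import Callable, List, Sequence

# A gain model maps (source level, noise level) for one band to a gain in dB.
GainModel = Callable[[float, float], float]


def placeholder_gain_model(source_db: float, noise_db: float) -> float:
    """Hypothetical stand-in, NOT the psychoacoustic model of the text.

    Gives no boost while the noise is at least 10 dB below the source,
    and half of the shortfall otherwise.
    """
    return max(0.0, 0.5 * (noise_db - source_db + 10.0))


def per_band_gains(source_db: Sequence[float],
                   noise_db: Sequence[float],
                   model: GainModel) -> List[float]:
    """Evaluate the model independently in every spectral band."""
    if len(source_db) != len(noise_db):
        raise ValueError("source and noise must have the same number of bands")
    return [model(s, n) for s, n in zip(source_db, noise_db)]


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to an amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    source = [60.0, 60.0, 60.0]   # source level per band (illustrative values)
    noise = [20.0, 60.0, 70.0]    # ambient noise level per band (illustrative)
    for s, n, g in zip(source, noise,
                       per_band_gains(source, noise, placeholder_gain_model)):
        print(f"source={s} noise={n} gain_db={g} multiplier={db_to_linear(g):.3f}")
```

In this sketch the band with little noise receives no gain, while the noisier bands receive progressively more, which is the spectrally varying behavior the text calls for in contrast to a uniform volume adjustment.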
The perceived loudness of sound can be modeled by a harmonic oscillator with a spring constant that varies according to the mean power of vibration. This model is called the Earspring model. The Earspring equation [10, 11] is written $\left[\frac{d^2}{dt^2} + 2\beta\frac{d}{dt} + k\left(1 + \gamma\langle y^2\rangle\right)\right] y(t) = \eta F(t)$, where $t$ = time, $y(t)$ = amplitude of vibration, $F(t)$ = driving force in terms of Phons amplitude or sound intensity, $\langle y^2\rangle$ = mean power of vibration or $S(P)$, identified as the Sones power in the following equations, $\beta$ = damping constant, $k$ = spring constant, $\gamma$ = coefficient of power dependence of the spring constant, and $\eta$ = the scale factor. Thus the resonant frequency of the Earspring varies with the amplitude of the $\langle y^2\rangle$ term. Since $\langle y^2\rangle$ is a function of $y$, this equation is nonlinear.
For a particular driving force $F(t)$, which we will consider to be a sinusoid at a given frequency and with an amplitude of P in dBPhons, a solution $y(t)$ to the Earspring equation can be found for a particular set of boundary conditions; this solution is the steady-state response to the forcing function. Transforming into the frequency domain, we can obtain the mean power of vibration $\langle y^2\rangle = \tfrac{1}{2}|Y|^2$, which can be rewritten as $S(P)$.
A possible set of boundary parameters comprises (i) a Sones ratio between the hearing threshold level and the reference level of 40 dBSPL at 1 kHz, and (ii) the detuning (flattening) of tones by about 75-80 cents as tones range in intensity from ~40 dBSPL to ~90 dBSPL near 1 kHz. This equation can be solved numerically, though in practice it is more efficient to use a computational model to estimate the solution.
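The steady-state response described above can be sketched as a short derivation. This is a minimal sketch under assumptions not stated in the text: a complex sinusoidal drive $F(t) = F_0 e^{i\omega t}$, a response of the form $y(t) = Y e^{i\omega t}$, the mean power $\langle y^2\rangle$ held constant over the averaging period, and a drive at the small-amplitude resonance $\omega^2 = k$. The symbols $F_0$, $\omega$, and $Y$ are introduced here for illustration only.

```latex
% Substitute y(t) = Y e^{i\omega t} and F(t) = F_0 e^{i\omega t} into the
% Earspring equation, holding \langle y^2 \rangle fixed:
\left[-\omega^2 + 2i\beta\omega + k\left(1 + \gamma\langle y^2\rangle\right)\right] Y = \eta F_0

% Squared magnitude of the steady-state response:
|Y|^2 = \frac{\eta^2 F_0^2}
             {\left[k\left(1 + \gamma\langle y^2\rangle\right) - \omega^2\right]^2 + 4\beta^2\omega^2}

% With \langle y^2\rangle = \tfrac{1}{2}|Y|^2 and the assumed drive frequency \omega^2 = k:
\langle y^2\rangle = \frac{\eta^2 F_0^2}
                          {2k^2\left[4\hat{\beta}^2 + \left(\gamma\langle y^2\rangle\right)^2\right]},
\qquad \hat{\beta} = \beta/\sqrt{k}
```

Under these assumptions the mean power appears on both sides of the final relation, so it is implicit in $\langle y^2\rangle$ and has to be solved self-consistently rather than evaluated directly.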
We can derive an equation for S(P),

$$S(P) = \frac{4\hat{\beta}^2 + \Gamma_{40}^2}{4\hat{\beta}^2 + \left(\Gamma_{40}\, S(P)\right)^2}\; 10^{(P-40)/10}$$

where $S(P)$ is the perceived loudness in Sones which the listener experiences, $P$ is the sound intensity impinging on the ear in dBPhons, $\hat{\beta} = \beta/\sqrt{k}$, and $\Gamma_{40}$ represents the mean power of vibration at the resonant frequency for a ~40 dBSPL amplitude driving force. The constants $\hat{\beta}$ and $\Gamma_{40}$ are found by solving the Earspring equation according to experimental boundary conditions. Note that this formula is independent of frequency, although the sound intensity $P$ of any given signal may vary as a function of frequency.
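Because S(P) appears on both sides of the expression above, it has to be solved for rather than evaluated directly. The following is a minimal sketch of one way to do so, assuming bisection as the root finder. The values of `BETA_HAT` and `GAMMA_40` are hypothetical placeholders chosen only so that the sketch runs (the text obtains the actual constants from experimental boundary conditions), and the function names `sones` and `phons` are illustrative.

```python
import math

# Hypothetical placeholder constants; NOT the fitted values from the text.
BETA_HAT = 0.5
GAMMA_40 = 1.0


def sones(p_phons: float,
          beta_hat: float = BETA_HAT,
          gamma_40: float = GAMMA_40) -> float:
    """Solve S = (a + G^2) / (a + (G*S)^2) * 10**((P - 40) / 10) for S,
    where a = 4 * beta_hat**2 and G = gamma_40."""
    a = 4.0 * beta_hat ** 2
    drive = (a + gamma_40 ** 2) * 10.0 ** ((p_phons - 40.0) / 10.0)

    def residual(s: float) -> float:
        # Rearranged form of the equation: s * (a + (G*s)^2) - drive = 0.
        # For s >= 0 and a > 0 this is strictly increasing, so the root is unique.
        return s * (a + (gamma_40 * s) ** 2) - drive

    lo, hi = 0.0, 1.0
    while residual(hi) < 0.0:      # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):           # bisection; far more steps than needed
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phons(s: float,
          beta_hat: float = BETA_HAT,
          gamma_40: float = GAMMA_40) -> float:
    """Inverse mapping, available in closed form because P only appears
    in the exponential factor."""
    a = 4.0 * beta_hat ** 2
    return 40.0 + 10.0 * math.log10(
        s * (a + (gamma_40 * s) ** 2) / (a + gamma_40 ** 2))


if __name__ == "__main__":
    for level in (20.0, 40.0, 60.0, 80.0):
        print(level, sones(level))
```

Whatever constants are used, the formula gives S(40) = 1, since the fraction and the exponential factor both reduce to one at P = 40 and S = 1; this provides a quick sanity check on any implementation.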
The present invention features systems for dynamically adjusting audio signals by applying a gain to the signal in a spectrally varying manner to compensate for a noise source in the environment of the listener, such that the sound is perceived by the listener to be unchanged in loudness and spectral composition. The system obtains a threshold elevation for each frequency component by analyzing the spectral composition of the ambient noise. This threshold elevation is then used by a psychoacoustic model of hearing to determine an appropriate gain adjustment for the corresponding frequency component of the source signal which will make that component perceived by the human ear to be just as loud as if the noise were not present. After applying the gains to the source signal, the resulting signal is output to the speaker. The system allows a listener to hear without distortion, over ambient noise, by applying a gain to the source that varies according to the spectral composition of the noise, rather than cancelling the noise, or applying a uniform volume adjustment to the source. The perceived spectral composition of the source is thus adjusted without the removal of the noise signal. Systems may be incorporated into apparatuses including but not limited to mobile phones and music players.
Any feature or combination of features described herein is included within the scope of the present invention, provided that the features included in any such combination are not mutually inconsistent, as will be apparent from the context, this specification, and the knowledge of one of ordinary skill in the art. Additional advantages and aspects of the present invention are apparent in the following detailed description.