I. Field of the Invention
The present invention relates to an improved electroacoustic speech processing method and apparatus with applicability in hearing aids and all types of electroacoustic assistive communication devices.
II. Background of the Invention
The most fundamental ability adversely affected by impaired hearing is the ability to communicate through speech, primarily due to an inability to sense weak sounds. The improvement of speech intelligibility for a hearing impaired listener who experiences difficulty when using a hearing prosthesis has long been recognized as a technical challenge. Many techniques have been utilized to alleviate the problem. Early methods for improving hearing were limited to systems which increased the loudness of sound and/or altered the frequency characteristics of sound in a fixed manner. In terms of electroacoustic apparatus, this means that linear and time-invariant amplifying and filtering systems were employed. Such systems increase the amplitude of sound and may introduce greater gain at some frequencies than at others. Many currently believe that hearing impairment can be substantially corrected by 1) determining the threshold of sensitivity at various frequencies, 2) comparing those threshold levels to those of normal hearing persons, and 3) supplying a hearing aid which amplifies and filters incoming sounds in a linear and time-invariant fashion such that the measured loss at those various frequencies, or some part thereof, is compensated. This type of system, however, only partially addresses the needs of hearing impaired persons. The complexities involved which limit the effectiveness of the above-described testing and fitting method derive from both the complex acoustic nature of speech and also from the physiological and psychological characteristics of hearing loss.
People who are considered to possess normal hearing have the ability to perceive speech sounds in the frequency range of about 125 Hz to about 6 kHz at 20 dB SPL or weaker and, depending upon the frequency content of the sound, can tolerate sounds as loud as 105 dB SPL. ("0 dB SPL" or "0 decibels Sound Pressure Level" is defined as 0.0002 dynes/cm².) Note that threshold of sensitivity for pure tones in sound field is dependent upon tone frequency and is affected significantly by the age of the listener. Measured thresholds of sensitivity may also be affected by testing method. (Two references discussing normal hearing thresholds are 1) "International Organization for Standardization Recommendation R226: Normal Equal-Loudness Contours for Pure Tones and Normal Threshold of Hearing Under Free Field Listening Conditions" and 2) Sivian, L. J. and White, S. D., "On Minimum Audible Sound Fields", Journal of the Acoustical Society of America, 1933, volume 4.) Those with the best hearing are able to perceive sounds as weak as -4 dB SPL at some frequencies. Sounds with amplitude below about 20 dB SPL, therefore, may not be perceptible by many, and sounds above 105 dB SPL cause discomfort in many listeners. Sounds above 120 dB SPL can actually cause pain (see FIG. 2). At 1000 Hz, for example, the threshold of hearing sensitivity for a normal individual may be 20 dB SPL and the threshold of discomfort may be 105 dB SPL. The thresholds of hearing sensitivity for those with impaired hearing are, by definition, at greater sound pressure levels than for those with normal hearing; however, the thresholds of hearing discomfort are most often not similarly elevated. In fact, in most cases, the thresholds of hearing discomfort for those with hearing impairments are at lesser sound pressure levels than for normals.
In hearing impaired persons, elevated thresholds of sensitivity in conjunction with thresholds of discomfort that are the same as or lower than those of normals result in a reduced dynamic range of amplitudes over which the hearing impaired person can usefully perceive acoustic information. Whereas normal hearing individuals typically have a dynamic range of hearing of 85 dB, and some have a dynamic range as large as 110 dB or more, the range of usable sound amplitude is reduced in virtually all hearing impaired persons. Speech consists of louder sounds, such as vowels, and softer sounds such as the consonants \t\ and \f\. For speech to be easily understandable, the softer sounds must be heard and the louder sounds must not be so loud as to either cause discomfort or to interfere with the perception of weaker following sounds, a phenomenon known as "forward masking". Thus, there is a level of speech for every individual which can be called the "Preferred Listening Level" or PLL. For normal hearing persons, the PLL is around 70 dB Sound Pressure Level, and although a hearing impaired individual may have thresholds of sensitivity which are elevated significantly as compared to normal, the PLL for mild or moderately hearing impaired persons, when not wearing a hearing aid, is similar to or only moderately elevated in comparison to that of a normal. Reduced dynamic range along with an unaided PLL near normal are primary reasons why linear amplification does not satisfy the needs of the hearing impaired.
Consider a simple linear amplification system with a gain of 30 dB fitted on an individual whose threshold of sensitivity is 50 dB SPL and whose threshold of discomfort is 95 dB SPL. In this case, sounds at 20 dB SPL, a normal threshold of sensitivity, would be amplified to 50 dB SPL by the hearing aid before presentation to this individual and would therefore be perceptible. A sound near the level of discomfort, say at 80 dB SPL, however, would be introduced to this hearing aid user at 110 dB SPL and this level is unacceptably loud. Moreover, a sound at a conversational level (Preferred Listening Level) of 70 dB SPL, for example, would be introduced to this hearing aid user at 100 dB SPL which would be perceived as being abnormally loud and is, in fact, above the threshold of discomfort. Therefore, this linear amplification system is not desirable. When fitted with a linear hearing aid, a hearing impaired user generally does not set the loudness control to a level which corrects his or her thresholds of sensitivity because, with that amount of gain, 30 dB in this example, normal conversational levels will contain sounds which are uncomfortably loud.
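The arithmetic of this example can be checked with a short sketch. The function and constant names are illustrative assumptions and do not describe any particular apparatus; the levels are those given in the example above.

```python
# Illustrative sketch only: names are hypothetical, levels come from the example above.
GAIN_DB = 30.0                    # fixed (linear) gain of the hearing aid
THRESHOLD_OF_SENSITIVITY = 50.0   # dB SPL, impaired listener
THRESHOLD_OF_DISCOMFORT = 95.0    # dB SPL, impaired listener


def linear_output(input_db_spl, gain_db=GAIN_DB):
    """A linear aid adds the same gain at every input level."""
    return input_db_spl + gain_db


def classify(output_db_spl):
    """Place an output level relative to the listener's usable range."""
    if output_db_spl < THRESHOLD_OF_SENSITIVITY:
        return "inaudible"
    if output_db_spl > THRESHOLD_OF_DISCOMFORT:
        return "uncomfortable"
    return "audible"


# Normal threshold, conversational level, and a loud sound.
for level in (20.0, 70.0, 80.0):
    out = linear_output(level)
    print(f"{level:.0f} dB SPL in -> {out:.0f} dB SPL out ({classify(out)})")
```

With 30 dB of fixed gain, the 20 dB SPL input just reaches the listener's threshold of sensitivity, while both the 70 dB and 80 dB SPL inputs land above the 95 dB SPL threshold of discomfort.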
The intensity range or dynamic range of speech is approximately 27 dB when the talker is one meter from the listener, that is, the difference in loudness between the strongest and weakest phonemic elements of speech is about 27 dB. Due to both the absorptive and dispersive nature of sound propagation in air and the presence of acoustic reflections, when the distance between the talker and listener increases, the average amplitude of speech is lowered and the dynamic range of speech may actually increase. For women talkers, the average Sound Pressure Level of speech at one meter distance is about 65 dB and for men the average is about 70 dB. Moreover, when the full array of listening situations is considered, from quiet speech to shouting, the range of average speech Sound Pressure Level is from about 40 dB to over 90 dB. It is understood that the primary purpose of a hearing aid device is to enhance the ability of the hearing impaired person to understand speech, and such a device must have utility over a broad range of listening situations.
Reduced dynamic range of hearing interferes with the perceived loudness relationships between phonemes in hearing impaired listeners. This is referred to as "loudness distortion". For example, in the word "ball", the stronger phoneme, the vowel \o\ (17 in FIG. 1), may be at a level of approximately 85 dB SPL while the \b\ (14 in FIG. 1), which is the weakest phoneme of the word, would be at a level of 66 dB SPL, a difference of 19 dB. Note that the normal listener has a dynamic range of about 85 dB while the hearing impaired listener, depending on the severity of the loss, may have a dynamic range of only 40 dB. Assuming that speech is amplified (linearly) such that the \b\ is audible by the hearing impaired listener, the difference in level between the weakest and loudest sounds of the word "ball" spans a much greater portion of the dynamic range of the hearing impaired listener, 19 dB of a total range of 40 dB, as compared to the normal listener, 19 dB of a total range of 85 dB. Thus, the hearing impaired listener perceives a much greater difference in loudness between these two sounds because the \b\ is close to his or her threshold of sensitivity, which is abnormally high in comparison to a normal hearing listener, and the \o\ is close to his or her threshold of discomfort. Phonemic loudness distortion is extremely detrimental to speech intelligibility.
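The comparison can be expressed as the fraction of each listener's dynamic range occupied by the word. The following is a minimal sketch using the levels stated above; the function and variable names are hypothetical.

```python
# Illustrative sketch only: names are hypothetical, levels come from the "ball" example.
VOWEL_DB_SPL = 85.0       # strongest phoneme of "ball"
CONSONANT_DB_SPL = 66.0   # weakest phoneme of "ball"


def fraction_of_dynamic_range(span_db, dynamic_range_db):
    """Share of a listener's usable range taken up by a given level span."""
    return span_db / dynamic_range_db


span_db = VOWEL_DB_SPL - CONSONANT_DB_SPL                       # 19 dB
normal_fraction = fraction_of_dynamic_range(span_db, 85.0)      # normal listener
impaired_fraction = fraction_of_dynamic_range(span_db, 40.0)    # impaired listener

print(f"span: {span_db:.0f} dB")
print(f"normal listener:   {normal_fraction:.1%} of range")
print(f"impaired listener: {impaired_fraction:.1%} of range")
```

The same 19 dB span occupies roughly 22 percent of the normal listener's range but 47.5 percent of the impaired listener's range.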
In an effort to eliminate the adverse effects of loud sounds being presented to a hearing aid user at levels near or greater than the threshold of discomfort, many early hearing aid designs incorporate limiting circuitry which clips any signal waveform whenever a set clipping threshold near the user's threshold of discomfort is exceeded. This technique introduces severe harmonic distortion. Although sounds of excessive loudness are prevented by waveform clipping, the resultant distortion can interfere greatly with the ability to understand speech. In addition, phonemic loudness distortion is not fully addressed because the loudness relationships amongst all phonemes that do not exceed the clipping threshold are not altered.
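The harmonic distortion introduced by waveform clipping can be illustrated numerically. The sketch below is an idealized model, not a description of any particular limiting circuit: it clips one period of a unit-amplitude sine wave at half its peak value (an assumed clipping level) and measures the third-harmonic content with a single-bin discrete Fourier sum. All names are hypothetical.

```python
import math

# Illustrative sketch only: an idealized symmetric hard clipper applied to a pure tone.
N = 1024  # samples in one period of the fundamental


def clip(samples, limit):
    """Hard-limit every sample to the range [-limit, +limit]."""
    return [max(-limit, min(limit, x)) for x in samples]


def harmonic_amplitude(samples, harmonic):
    """Amplitude of one harmonic of a single-period waveform (single-bin DFT)."""
    n = len(samples)
    s = sum(x * math.sin(2 * math.pi * harmonic * k / n) for k, x in enumerate(samples))
    c = sum(x * math.cos(2 * math.pi * harmonic * k / n) for k, x in enumerate(samples))
    return 2.0 * math.hypot(s, c) / n


sine = [math.sin(2 * math.pi * k / N) for k in range(N)]
clipped = clip(sine, 0.5)  # assumed clipping level: half the peak amplitude

print("third harmonic, unclipped:", harmonic_amplitude(sine, 3))
print("third harmonic, clipped:  ", harmonic_amplitude(clipped, 3))
```

The unclipped tone contains essentially no third harmonic, whereas the clipped tone contains a substantial one, which is the kind of distortion product referred to above.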
An alternative to waveform clipping is a technique called "compression limiting". Compression limiting also prevents loud sounds from being presented to the user, but rather than clipping the waveform, the gain of an amplifier is reduced when loud sounds are present above the compression limiting threshold, set near the user's threshold of discomfort. A benefit of compression limiting is that the high level of harmonic distortion introduced by waveform clipping is not present. However, phonemic loudness distortion is not fully addressed by this technique either, as the loudness relationships amongst all phonemes that do not exceed the compression limiting threshold are not altered. In addition, a phenomenon known as "pumping" may occur in compression limiting systems in the presence of high levels of background noise. Pumping occurs because, in the absence of speech or louder signals above the compression limiting threshold, this hearing aid will exhibit its highest gain. Thus, amplified background noise levels increase during the normal gaps in speech and are perceived as a fluttering or pumping sound.
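The pumping effect can be illustrated with a static model of an idealized compression limiter. Every number in the sketch below (30 dB maximum gain, a 90 dB SPL output limit, 55 dB SPL noise and 70 dB SPL speech) is an assumption chosen for illustration, as are the names; attack and recovery times are ignored, and the speech is assumed to dominate the level detector whenever it is present.

```python
# Illustrative sketch only: all values and names are assumptions, not taken from the text.
MAX_GAIN_DB = 30.0      # gain applied when no limiting occurs
OUTPUT_LIMIT_DB = 90.0  # limiting threshold, set near the threshold of discomfort
NOISE_DB_SPL = 55.0     # steady background noise
SPEECH_DB_SPL = 70.0    # speech level when the talker is speaking


def limiter_gain(input_db_spl):
    """Full gain until the output would exceed the limit, then reduced gain."""
    return min(MAX_GAIN_DB, OUTPUT_LIMIT_DB - input_db_spl)


gain_during_speech = limiter_gain(SPEECH_DB_SPL)  # gain is pulled down by the speech
gain_in_gap = limiter_gain(NOISE_DB_SPL)          # gain returns to its maximum

noise_out_during_speech = NOISE_DB_SPL + gain_during_speech
noise_out_in_gap = NOISE_DB_SPL + gain_in_gap

print(f"noise during speech: {noise_out_during_speech:.0f} dB SPL")
print(f"noise in speech gap: {noise_out_in_gap:.0f} dB SPL")
```

Under these assumed values the background noise is presented 10 dB louder in the gaps than during speech, which is the level fluctuation heard as pumping.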
Another aspect of hearing impairment involves abnormality in the loudness relationship between various frequencies. For example, many hearing impaired listeners have greater loss of sensitivity at high frequencies such as 4000 Hz as compared with lower frequencies such as 500 Hz, as measured with pure tone standard hearing tests (see FIG. 12). Hearing aid designers have attempted to correct for this particular and common abnormality by including a simple filter in a system with a linear amplifier such that higher frequencies are amplified more than lower frequencies. This technique is generally inadequate because the hearing impaired person's thresholds of sensitivity as a function of frequency differ from his or her frequency response (as derived from loudness balance testing) at PLL. When studies of pure tone sound pressure levels which produce equal perception of loudness for normal listeners are compared to similar studies for hearing impaired listeners, it is clear that the distortion of loudness as a function of frequency caused by hearing impairment is level dependent. Simple filtering circuitry does not compensate for this phenomenon.
Consider an individual with the following characteristics of impaired hearing: a) threshold of sensitivity at 1000 Hz is 50 dB SPL, b) threshold of sensitivity at 4000 Hz is 70 dB SPL, c) at PLL, 80 dB, the loudness balance response is reasonably flat, d) threshold of discomfort at 1000 Hz is 95 dB SPL and e) threshold of discomfort at 4000 Hz is also 95 dB SPL. In an attempt to fit this particular individual such that he or she exhibits normal thresholds of sensitivity, say 20 dB SPL at 1000 Hz and 15 dB SPL at 4000 Hz, he or she is fitted with a hearing aid providing 30 dB of gain at 1000 Hz and 55 dB of gain at 4000 Hz. After fitting, normal thresholds of sensitivity result. However, with incoming signals at both 1000 Hz and 4000 Hz at a level, for example, of 50 dB SPL, the hearing aid's output at 4000 Hz would be 105 dB SPL, which is above the level of discomfort, and the output at 1000 Hz would be 80 dB SPL, only 15 dB below the level of discomfort. Clearly, the loudness balance relation between 1000 Hz and 4000 Hz as presented to the hearing impaired listener in this situation is not normal, although his or her sensitivity at both 1000 Hz and 4000 Hz has been corrected. It is desirable for hearing aid systems to compensate for level-dependent frequency response distortions of this nature.
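The figures in this example follow from simple addition and can be verified with the sketch below. The dictionary and function names are hypothetical; the thresholds and the 50 dB SPL input level are those of the example.

```python
# Illustrative sketch only: names are hypothetical, values come from the example above.
IMPAIRED_THRESHOLD = {1000: 50.0, 4000: 70.0}  # dB SPL, by frequency in Hz
NORMAL_THRESHOLD = {1000: 20.0, 4000: 15.0}    # dB SPL, target after fitting
DISCOMFORT = {1000: 95.0, 4000: 95.0}          # dB SPL

# Gain needed at each frequency to restore a normal threshold of sensitivity.
gain_db = {f: IMPAIRED_THRESHOLD[f] - NORMAL_THRESHOLD[f] for f in IMPAIRED_THRESHOLD}


def output_db_spl(freq_hz, input_db_spl):
    """Output of a linear, frequency-shaped aid at one frequency."""
    return input_db_spl + gain_db[freq_hz]


def headroom_db(freq_hz, input_db_spl):
    """Distance below the threshold of discomfort (negative means above it)."""
    return DISCOMFORT[freq_hz] - output_db_spl(freq_hz, input_db_spl)


for freq in (1000, 4000):
    print(f"{freq} Hz: gain {gain_db[freq]:.0f} dB, "
          f"output {output_db_spl(freq, 50.0):.0f} dB SPL, "
          f"headroom {headroom_db(freq, 50.0):+.0f} dB")
```

For a 50 dB SPL input, the 1000 Hz output has 15 dB of headroom while the 4000 Hz output exceeds the threshold of discomfort by 10 dB, so the two frequencies are no longer in balance.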
As previously mentioned, forward masking is a significant problem in speech perception of hearing impaired individuals. When a soft sound above the listener's threshold of sensitivity is presented immediately following a loud sound, the listener's perceived loudness of the soft sound is reduced. This phenomenon may be due to the hearing system's self-protecting mechanism and its inability to rapidly recover from the presence of the loud sound. (See e.g., Luscher and Zwislocki "The Decay of Sensation and the Remainder of Adaptation after Short Pure-tone Impulses on the Ear", Acta Otolaryngology, 1947, volume 35, pp. 428-445.) In normal hearing persons, this phenomenon is slight enough not to interfere with speech perception; however, the hearing impaired listener exhibits an increased difficulty in recognizing soft phonemes which are preceded by loud phonemes. This may in part be caused by the impaired ear's inability to recover from a loud sound as rapidly as a normal ear in combination with the fact that the difference in perceived loudness of a strong phoneme in comparison to a weak phoneme is greater in hearing impaired individuals due to their reduced dynamic range. Examples of soft phonemes immediately following loud phonemes abound in speech, such as the \th\ in "bath" preceded by the \a\. It is desirable for hearing aid systems to be designed in a manner which lessens forward masking of soft phonemes by preceding loud phonemes.
Finally, the difficulty which many hearing impaired individuals experience understanding speech in high level noisy environments, where the noise level is greater than 65 dB, should be addressed. It is well known that noise is predominantly low-frequency energy. Some hearing aid designs therefore incorporate a user-operated switch which can be activated to attenuate low frequency energy. The user may use this switch to improve his or her understanding in noisy environments. It is desirable in many cases for a hearing aid system to provide automatic low-frequency cut in highly noisy environments.
U.S. Pat. No. 3,229,049 of the present inventor introduced the concept of "progressive range" or "log-linear" compression in which incoming acoustic signals covering a substantial portion of the dynamic range of normal hearing individuals are compressed such that the dynamic range of the hearing aid's acoustic output signal falls within the range between the hearing impaired person's threshold of sensitivity and threshold of discomfort. According to this invention, the gain of an amplifier reduces as the incoming signal sound pressure level increases above a compression threshold or knee. A hearing aid which fits this description amplifies incoming sounds below 35 dB SPL linearly with a gain of 30 dB and compresses incoming sounds between 35 dB and 110 dB SPL into an output range of 65 dB to 95 dB SPL. In this case, as Sound Pressure Levels rise above 35 dB, there is an average 1 dB increase of output signal amplitude for every 2.5 dB increase of input signal amplitude. This corresponds to a compression ratio of 2.5.
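The input/output relationship just described can be written as a piecewise function. The sketch below is a static model under the stated figures (knee at 35 dB SPL, 30 dB of linear gain, compression ratio of 2.5); the function name and parameters are hypothetical, and no time constants are modeled.

```python
# Illustrative sketch only: a static log-linear curve; names are hypothetical.
def log_linear_output(input_db_spl, knee_db=35.0, linear_gain_db=30.0, ratio=2.5):
    """Linear gain below the knee; 1 dB out per `ratio` dB in above the knee."""
    if input_db_spl <= knee_db:
        return input_db_spl + linear_gain_db
    output_at_knee = knee_db + linear_gain_db
    return output_at_knee + (input_db_spl - knee_db) / ratio


# The compression ratio follows from the stated input and output ranges.
ratio_from_ranges = (110.0 - 35.0) / (95.0 - 65.0)

for level in (20.0, 35.0, 60.0, 110.0):
    print(f"{level:.0f} dB SPL in -> {log_linear_output(level):.1f} dB SPL out")
print("compression ratio from ranges:", ratio_from_ranges)
```

The 75 dB input span from 35 dB to 110 dB SPL maps onto the 30 dB output span from 65 dB to 95 dB SPL, which gives the ratio of 2.5.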
Some of the key shortcomings of the invention of the '049 patent result from the influence of background room noise on the perception of speech.
Background room noise may typically be at 50 dB SPL average. Using the same gain and compression described above, this noise is presented to the hearing aid user at 71 dB average. Consider the word "ball" containing the vowel \o\ 17 and the consonant \b\ 14, spoken at a level such that the \o\ sound is at 75 dB SPL and the \b\ at a level near 56 dB SPL. As processed by the log-linear compressor, the \b\ would then be presented to the user at 73.4 dB. Recognizing that background noise varies in intensity and contains transient sound which exceeds its average level, and noting that the \b\ is presented only 2.4 dB above the average background noise level, it is apparent that this hearing aid user will likely have difficulty in reliably perceiving the \b\ and other weak phonemes in speech with such room noise present. Also, the difference in amplitude between vowels and consonants as presented to the user may be large enough that forward masking of weak consonants will occur. Additionally, in the absence of speech or louder signals, including the gaps in normal speech, this hearing aid will exhibit its highest gain, causing the previously-discussed pumping sound heard in compression limiting systems.
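The output levels quoted in this example can be reproduced from the log-linear curve described for the '049 patent (knee at 35 dB SPL, 30 dB linear gain, compression ratio of 2.5). The following is a static sketch with hypothetical names; the vowel output level and the vowel-to-consonant difference are computed from that curve rather than quoted from the text.

```python
# Illustrative sketch only: static curve with the '049 figures; names are hypothetical.
def log_linear_output(input_db_spl, knee_db=35.0, linear_gain_db=30.0, ratio=2.5):
    """Linear gain below the knee; 1 dB out per `ratio` dB in above the knee."""
    if input_db_spl <= knee_db:
        return input_db_spl + linear_gain_db
    return knee_db + linear_gain_db + (input_db_spl - knee_db) / ratio


noise_out = log_linear_output(50.0)      # average room noise
consonant_out = log_linear_output(56.0)  # the weak consonant of "ball"
vowel_out = log_linear_output(75.0)      # the vowel of "ball"

print(f"noise:     {noise_out:.1f} dB SPL")
print(f"consonant: {consonant_out:.1f} dB SPL")
print(f"vowel:     {vowel_out:.1f} dB SPL")
print(f"consonant above noise: {consonant_out - noise_out:.1f} dB")
print(f"vowel above consonant: {vowel_out - consonant_out:.1f} dB")
```

The consonant is presented only 2.4 dB above the average noise, whereas at the input it was 6 dB above the noise; compression narrows the margin between weak phonemes and the noise floor.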
Killion (U.S. Pat. No. 4,170,720) discloses a log-linear compression system which operates over the range of 30 dB to 90 dB input SPL. A distinguishing feature of this system is that the amplification is linear above about 90 dB input SPL; however, the difficulties associated with the '049 patent also apply to this system.
A detrimental artifact of the invention of the '049 patent, and other similar designs, involves the saturation of the first amplifying stage of the system whenever extremely high level input signals are present. This reduces the dynamic range of acoustic input signals that can be processed by the device without distortion. In an embodiment of Killion, a field effect transistor located at the input and controlled by the output signal is included to attenuate extremely high level input signals.
In co-pending application Ser. No. 07/722,344, a compression system is disclosed with the knee placed at an input sound pressure level of, for example, 55 dB SPL and with a higher compression ratio of, for example, 5. The user of such a system would be unlikely to turn up the loudness sufficiently to correct his or her threshold because usual speech would then be presented at too high a level. This system provides better speech perception in the presence of noise at the expense of not fully correcting threshold response.
Kryter (U.S. Pat. No. 3,894,195) describes a complex system in which "automatic nonlinear-linear gain control" is utilized along with a plurality of frequency filtering paths to discriminate between noise and speech and reduce the aforementioned pumping effect. In a later patent (U.S. Pat. No. 4,630,302), Kryter discloses a method and apparatus which utilizes two fast attack/slow recovery time automatic gain control sections for compressing the wide amplitude range of speech followed by a slow attack/fast recovery automatic gain control section for noise suppression.
Graupe, et al., (U.S. Pat. Nos. 4,025,721 and 4,185,168) have explored the use of adaptive filtering for removal of near-stationary noise from speech. Graupe's largely digital systems involve the continual analysis of the spectral and temporal characteristics of the incoming signal to determine whether or not a desired information-bearing signal is present.
Steeger (U.S. Pat. No. 4,508,940) presented a hearing aid with multi-channel compression which avoids some of the space and power requirements of all-digital schemes through the use of sampled-data analog techniques. Franklin (U.S. Pat. No. 4,461,025) discloses another means of background noise suppression which utilizes a slow attack/fast recovery automatic gain control circuit to determine whether an incoming signal contains no speech information and reduces the gain of the hearing aid at such times; however, Franklin does not address phonemic loudness distortion in his invention. Fukuyama, et al., (U.S. Pat. No. 4,476,230) describe a hearing aid which automatically decreases the maximum output sound level when necessary to prevent sounds of uncomfortable loudness from being introduced to the user, and this system also fails to consider phonemic loudness distortion. Hotvet's automatic gain control (U.S. Pat. No. 4,718,099) provides a compressor with a variable recovery time that adapts to the various types of signals encountered by the hearing aid user.