The present invention is directed to the processing of acoustic signals, and more particularly, but not exclusively, relates to the localization and extraction of acoustic signals emanating from different sources.
The difficulty of extracting a desired signal in the presence of interfering signals is a long-standing problem confronted by acoustic engineers. This problem impacts the design and construction of many kinds of devices such as systems for voice recognition and intelligence gathering. Especially troublesome is the separation of desired sound from unwanted sound with hearing aid devices. Generally, hearing aid devices do not permit selective amplification of a desired sound when contaminated by noise from a nearby source—particularly when the noise is more intense. This problem is even more severe when the desired sound is a speech signal and the nearby noise is also a speech signal produced by multiple talkers (e.g. babble). As used herein, “noise” refers not only to random or nondeterministic signals, but also to undesired signals and signals interfering with the perception of a desired signal.
One attempted solution to this problem has been the application of a single, highly directional microphone to enhance directionality of the hearing aid receiver. This approach has only a very limited capability. As a result, spectral subtraction, comb filtering, and speech-production modeling have been explored to enhance single microphone performance. Nonetheless, these approaches still generally fail to improve intelligibility of a desired speech signal, particularly when the signal and noise sources are in close proximity.
Another approach has been to arrange a number of microphones in a selected spatial relationship to form a type of directional detection beam. Unfortunately, when limited to a size practical for hearing aids, beam forming arrays also have limited capacity to separate signals that are close together, especially if the noise is more intense than the desired speech signal. In addition, in the case of one noise source in a less reverberant environment, the noise cancellation provided by the beam-former varies with the location of the noise source in relation to the microphone array. R. W. Stadler and W. M. Rabinowitz, On the Potential of Fixed Arrays for Hearing Aids, 94 Journal of the Acoustical Society of America 1332 (September 1993), and W. Soede et al., Development of a Directional Hearing Instrument Based on Array Technology, 94 Journal of the Acoustical Society of America 785 (August 1993) are cited as additional background concerning the beam forming approach.
Still another approach has been the application of two microphones displaced from one another to provide two signals to emulate certain aspects of the binaural hearing system common to humans and many types of animals. Although certain aspects of biologic binaural hearing are not fully understood, it is believed that the ability to localize sound sources is based on evaluation by the auditory system of binaural time delays and sound levels across different frequency bands associated with each of the two sound signals. The localization of sound sources with systems based on these interaural time and intensity differences is discussed in W. Lindemann, Extension of a Binaural Cross-Correlation Model by Contralateral Inhibition—I. Simulation of Lateralization for Stationary Signals, 80 Journal of the Acoustical Society of America 1608 (December 1986). The localization of multiple acoustic sources based on input from two microphones presents several significant challenges, as does the separation of a desired signal once the sound sources are localized. For example, the system set forth in Markus Bodden, Modeling Human Sound-Source Localization and the Cocktail-Party-Effect, 1 Acta Acustica 43 (February/April 1993) employs a Wiener filter including a windowing process in an attempt to derive a desired signal from binaural input signals once the location of the desired signal has been established. Unfortunately, this approach results in significant deterioration of desired speech fidelity. Also, the system has only been demonstrated to suppress noise of equal intensity to the desired signal at an azimuthal separation of at least 30 degrees. A more intense noise emanating from a source spaced closer than 30 degrees from the desired source continues to present a problem. Moreover, the proposed algorithm of the Bodden system is computationally intense—posing a serious question of whether it can be practically embodied in a hearing aid device.
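By way of illustration only, the interaural time difference cue described above can be estimated by cross-correlating the two microphone signals and selecting the lag that gives the highest score. The following Python sketch is not the Lindemann or Bodden model; it is a broadband simplification that ignores level differences and analysis by frequency band. The function names, the 0.15 m microphone spacing, the 343 m/s speed of sound, and the free-field, far-field geometry are all assumptions introduced here for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def estimate_delay_samples(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means `right` is a delayed copy of `left`.
    Both inputs are equal-length sequences of samples.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    n = len(left)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # skip products that fall outside the window
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_azimuth_deg(delay_s, mic_spacing_m=0.15,
                         speed_of_sound=SPEED_OF_SOUND_M_S):
    """Map a time delay to an azimuth, assuming a far-field source
    in a free field, so that sin(azimuth) = c * delay / spacing."""
    ratio = speed_of_sound * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding and noise
    return math.degrees(math.asin(ratio))
```

Dividing the returned lag by the sampling rate gives a delay in seconds, which `delay_to_azimuth_deg` converts to an angle. A single broadband estimate of this kind identifies only one dominant delay, which is one reason the localization of multiple simultaneous sources from two microphones is challenging.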
Another example of a two microphone system is found in D. Banks, Localisation and Separation of Simultaneous Voices with Two Microphones, IEE Proceedings-I, 140 (1993). This system employs a windowing technique to estimate the location of a sound source when there are nonoverlapping gaps in its spectrum compared to the spectrum of interfering noise. This system cannot perform localization when wide-band signals lacking such gaps are involved. In addition, the Banks article fails to provide details of the algorithm for reconstructing the desired signal. U.S. Pat. No. 5,479,522 to Lindemann et al.; U.S. Pat. No. 5,325,436 to Soli et al.; U.S. Pat. No. 5,289,544 to Franklin; and U.S. Pat. No. 4,773,095 to Zwicker et al. are cited as sources of additional background concerning dual microphone hearing aid systems.
Effective localization is also often hampered by ambiguous positional information that results above certain frequencies related to the spacing of the input microphones. This problem was recognized in R. M. Stern, A. S. Zeiberg, and C. Trahiotis, Lateralization of Complex Binaural Stimuli: A Weighted-Image Model, 84 Journal of the Acoustical Society of America 156 (1988).
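As a rough illustration of why this ambiguity depends on microphone spacing, consider a free-field, far-field source lying on the axis through two microphones spaced a distance d apart. The delay between the microphones is then d/c, where c is the speed of sound, and the corresponding phase difference exceeds half a cycle once the frequency passes c/(2d). Above that frequency, more than one delay is consistent with the observed phase. The Python sketch below computes this limit; the function name and the default speed of sound are assumptions made here and are not taken from the cited reference.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def ambiguity_frequency_hz(mic_spacing_m, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Return the frequency above which the inter-microphone phase
    difference can exceed half a cycle for some source direction,
    assuming a free-field, far-field source: f = c / (2 * d)."""
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    return speed_of_sound / (2.0 * mic_spacing_m)
```

For an assumed spacing of 0.15 m, comparable to the width of a human head, the limit is approximately 1.14 kHz, so much of the speech band lies in the ambiguous region. Halving the spacing doubles the limit.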
Thus, a need remains for more effective localization and extraction techniques, especially for use with binaural systems. The present invention meets this need and offers other significant benefits and advantages.