In general, any audio signal can be described as the sum of one or more desired signals, plus one or more noise sources, plus any reverberation associated with each of these desired signals and noise sources. For example, consider a person talking in a room with a radio playing, as well as other noise sources such as a computer or an air conditioning unit. If we place a microphone at some location in the room we will capture a mixture of all of the sound sources in the room. In many situations, the relative mixture of these various components of the signal is not suitable for a given application. However, once the sound sources have been mixed together and picked up by the microphone, it is extremely difficult to extract a desired signal while suppressing the other sound sources. We therefore seek a means of altering the mixture of the components in order to make the signal more suitable to the application.
There are many situations where it is desirable to be able to extract a desired audio signal from a mixture of audio signals. In the above scenario we may wish to be able to isolate the sound of the talker while removing the other sounds as well as the reverberation. For example, in surveillance and security applications it is desirable to be able to isolate the sound of the talker in order to increase the intelligibility of what is being said. One way to better isolate the talker's voice is to somehow place the microphone closer to the talker; however, this may not be practical or possible in many cases. Another approach is to use a directional microphone. Directional microphones are more sensitive to sounds arriving from some directions than from others. A highly directional (shotgun) microphone or an array of microphones can be used to zoom in on the desired talker (and extract the talker's voice) from a distance. While this can work very well in certain situations, these types of microphones tend to be large and bulky, and therefore not easily concealed. Therefore, it is desirable to have a system that provides the same signal extraction capabilities as a highly directional microphone but can be very small in size. Most microphones are not able to adequately separate sounds arriving from nearby sound sources from those arriving from sound sources that are farther away from the microphone. It is desirable to have a system that is able to select or suppress sound sources based on their distance from the microphone.
Moving-picture cameras such as camcorders record sound along with the image. This also applies to some security and surveillance cameras, as well as to certain still-picture cameras. In most cameras the user can adjust the amount of optical zoom in order to focus the image onto the desired target. It is desirable to also have a corresponding audio zoom that would pick up only the sound sources associated with the image. Some cameras do offer this ability by employing a microphone system with variable directivity but, unless the system is rather large in size, it may be very limited in the degree to which it can zoom in. Therefore, such systems are often inadequate in their ability to select the desired sounds while rejecting unwanted sounds. Also, these microphone systems can be very susceptible to wind noise, causing the recorded signal to become distorted. It is desirable to have a small audio zoom system that matches the abilities of the optical zoom, thereby eliminating unwanted sounds and reducing reverberation. It is also desirable for this system to reduce the noise due to the camera itself.
In hearing aids, sounds are picked up by a microphone and the resulting signal is then highly amplified and played into the user's ear. One common problem with hearing aids is that they do not discriminate between the desired signal and other sound sources. In this case noise sources are also highly amplified into the user's ear. To partially alleviate this problem some hearing aids include a noise reduction circuit based on a signal processing method known as spectral subtraction. Typically such noise reduction circuits are only effective at removing steady noises such as an air conditioner, and do not work well at suppressing noises that are dynamically changing. A key limitation of the spectral subtraction noise reduction method is that it often distorts the desired signal and creates audible artifacts in the noise-reduced signal. Furthermore, while this approach may reduce the perceived loudness of the noise, it does not tend to provide any improvement in speech intelligibility, which is very important to hearing aid users.
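The spectral subtraction idea described above can be sketched as follows. This is a minimal, idealized illustration and not the circuit used in any particular hearing aid. The function names (`dft`, `idft`, `spectral_subtract`), the `floor` parameter, and the use of a single short frame are all assumptions made for the example. The noise magnitude estimate is assumed to be known exactly, which is only plausible for steady noise; with changing noise the estimate is wrong and the subtraction distorts the desired signal.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a short frame (hypothetical helper)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude from each frequency bin.

    The phase of the noisy signal is kept unchanged. `floor` (an assumed
    parameter) limits how far a bin may be attenuated, as a fraction of
    its original magnitude.
    """
    cleaned = []
    for bin_value, n_mag in zip(dft(frame), noise_mag):
        mag = abs(bin_value)
        new_mag = max(mag - n_mag, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(bin_value)))
    return idft(cleaned)


# Example: a desired tone plus a steady interfering tone, 16-sample frame.
N = 16
desired = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
steady_noise = [0.3 * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
noisy = [d + n for d, n in zip(desired, steady_noise)]

# Assumption: the noise magnitude spectrum was measured while no speech was present.
noise_mag = [abs(v) for v in dft(steady_noise)]
restored = spectral_subtract(noisy, noise_mag)
```

In this contrived case the noise occupies bins that the desired tone does not, so the subtraction recovers the tone almost exactly. Real speech and noise overlap in frequency, which is where the artifacts noted above arise.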
Another method used to reduce unwanted noises in hearing aids is to use a directional microphone. In the hearing aid application a microphone with a cardioid directional pattern might be used. The cardioid microphone is less sensitive to sounds arriving from behind as compared to sounds arriving from the front. Therefore, if the hearing aid user is facing the desired sound source then the cardioid microphone will reduce any unwanted sound sources arriving from behind. This will help increase the level of the desired signal relative to the level of the unwanted noise sources. Another advantage of the directional microphone is that it reduces the amount of reverberation that is picked up. Excessive reverberation is known to reduce speech intelligibility. In hearing aids a directional microphone pattern is usually derived by processing the output signals from two omnidirectional (i.e., non-directional) microphones. This limits how selective the directional microphone can be. That is, it is limited in how much it can zoom in on the desired signal and in how much the unwanted noises can be suppressed in comparison to the desired signal, thereby making this approach less effective in higher noise environments. A more selective directional microphone pattern could be obtained by using more than two omnidirectional microphones; however, this is not typically practical due to the physical size limitations of the hearing aid. So, while a directional microphone can be advantageous, its benefit is limited and may not be adequate in many situations. A traditional directional microphone will also tend to amplify the user's own voice into the hearing aid, which is not desirable.
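One common way to derive a cardioid pattern from two omnidirectional microphones is to delay the rear microphone signal and subtract it from the front microphone signal. The sketch below computes the resulting sensitivity for a plane wave arriving from a given angle. It is an illustration under stated assumptions, not necessarily the processing used in any given hearing aid: the function name, the 10 mm spacing, the 500 Hz test frequency, and the far-field plane-wave model are all chosen for the example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed room-temperature value


def delay_and_subtract_response(theta_deg, freq_hz, spacing_m):
    """Magnitude response of a two-microphone delay-and-subtract array.

    theta_deg = 0 is the front (look) direction, 180 is directly behind.
    The rear microphone is delayed by spacing / c before subtraction,
    which places the null at 180 degrees.
    """
    omega = 2.0 * math.pi * freq_hz
    internal_delay = spacing_m / SPEED_OF_SOUND
    # Extra travel time from the front microphone to the rear microphone.
    acoustic_delay = spacing_m * math.cos(math.radians(theta_deg)) / SPEED_OF_SOUND
    return abs(1.0 - cmath.exp(-1j * omega * (internal_delay + acoustic_delay)))


front = delay_and_subtract_response(0.0, 500.0, 0.01)
side = delay_and_subtract_response(90.0, 500.0, 0.01)
rear = delay_and_subtract_response(180.0, 500.0, 0.01)
```

With a small spacing relative to the wavelength, the normalized response is approximately (1 + cos θ)/2: full sensitivity at the front, about half at the side, and a null at the rear. With only two microphones the pattern cannot be made much narrower than this, which is the selectivity limit described above.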
One common problem with traditional directional microphones is that they can be very susceptible to wind noise, causing the desired signal to be distorted and unintelligible.
Another common problem in hearing aids is that of acoustic howling due to the very high amounts of amplification between the microphone and earpiece. This acoustic howling is very disturbing and painful to the hearing aid user. A carefully chosen directional microphone may help mitigate this problem to some extent, but typically some form of adaptive echo canceling circuit is also required. However, such circuits often fail to completely eliminate the acoustic howling.
Therefore, in hearing aid applications we would like a means of selectively amplifying desired signals while suppressing undesired noises and reverberation. The method should be able to suppress all types of unwanted sounds and should have significantly better selectivity than is possible with traditional directional microphones. It would be very helpful if this method could also help to reduce acoustic howling. We would also like the new method to be relatively insensitive to wind noise. Furthermore, we would like a means of suppressing the hearing aid user's own voice.
Headsets are widely used in many applications for two-way voice communications. The headset includes a microphone to pick up and transmit the user's voice. However, there are many situations where the microphone also picks up other sounds, which is undesirable. In call centers there can be numerous operators talking in close proximity to each other, and the microphone can pick up the sound of the other talkers. Headsets are becoming increasingly popular for cell phone use since they allow the user's hands to be free to do other things. The headset can be connected to the cell phone via a wire, or through a wireless technology such as BLUETOOTH. In this application, the headset is used in a broad variety of acoustic environments including cars, restaurants, outdoors, airports, boats, and offices. These varying acoustic environments introduce various types and levels of noise, as well as reverberation, all of which are picked up by the headset microphone. Two general approaches have traditionally been employed to try to reduce the level of the noise picked up by the headset microphone. One approach is to place the microphone on a boom so that it is positioned as close as possible to the user's mouth. While this approach can help to reduce the level of the noise and reverberation, it may not be adequate in higher noise (or highly reverberant) environments. For example, this approach would not sufficiently remove the noise picked up when the headset is used in a car. Moreover, the boom can be very disturbing to the user. Another approach is to use a traditional directional microphone, which is also inadequate in higher noise environments. Also, the traditional directional microphone is highly susceptible to wind noise, making it unusable in many situations.
Adaptive noise canceling microphones have been tried on communications headsets in high-noise environments (such as military or industrial settings). This approach uses two or more microphones and tries to cancel out the background noise while preserving the desired speech signal. This approach is limited to providing about 10 dB of noise reduction, which is not adequate in many situations. It requires knowledge beforehand of the location of the desired speech signal. Due to its adaptive nature, its performance can be variable and has been found to deteriorate in situations where there are multiple noise sources.
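One classic form of adaptive noise canceling uses a second (reference) microphone that picks up mostly noise, and an adaptive filter that learns how that noise appears in the primary microphone. The sketch below uses a normalized LMS update. It is an assumption-laden illustration rather than the method of any particular headset: the function names, the filter length, the step size, and the synthetic signals are all hypothetical, and the reference is assumed to contain no speech at all. With a single noise source and a clean reference this idealized case performs far better than practical systems, whose performance degrades with multiple noise sources as noted above.

```python
import math
import random


def nlms_noise_canceller(primary, reference, num_taps=4, step=0.05, eps=1e-8):
    """Return the primary signal with the predicted noise removed.

    An adaptive FIR filter estimates the noise in `primary` from
    `reference`; the prediction error is the cleaned output.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent reference samples, newest first
    output = []
    for p, r in zip(primary, reference):
        history = [r] + history[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, history))
        error = p - noise_estimate
        energy = sum(x * x for x in history) + eps
        # Normalized LMS weight update.
        weights = [w + step * error * x / energy
                   for w, x in zip(weights, history)]
        output.append(error)
    return output


def demo(num_samples=4000, seed=0):
    """Compare noise power before and after cancelling (last 1000 samples)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    speech = [0.5 * math.sin(2 * math.pi * 0.05 * t) for t in range(num_samples)]
    # Assumed acoustic path from the noise source to the primary microphone.
    leaked = [0.8 * noise[t] + (0.3 * noise[t - 1] if t > 0 else 0.0)
              for t in range(num_samples)]
    primary = [s + n for s, n in zip(speech, leaked)]
    cleaned = nlms_noise_canceller(primary, noise)
    tail = range(num_samples - 1000, num_samples)
    noise_before = sum(leaked[t] ** 2 for t in tail) / 1000
    noise_after = sum((cleaned[t] - speech[t]) ** 2 for t in tail) / 1000
    return noise_before, noise_after


noise_before, noise_after = demo()
```

Because the filter keeps adapting, the residual noise level fluctuates over time, which is one source of the variable performance described above.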
The audio quality of cell phones often deteriorates quickly in the presence of background noise. This problem is aggravated by the user's desire to have a cell phone that is as small as possible. The result is that the microphone is located farther away from the user's mouth. Directional microphones can be used to help alleviate this problem but they are often inadequate for the task. Noise reduction circuits based on spectral subtraction can be used, but they often do not provide sufficient noise reduction and can cause annoying artifacts in the processed speech signal. Therefore, there is a need for a system for adequately removing noise and reverberation from the speech signal on cell phones.
So-called handsfree phones are often used for conference calls where there are multiple talkers in the same room. Handsfree phones are increasingly being used in cars for safety reasons. One key problem with typical handsfree phones is that they pick up not only the desired talker, but also various noises and reverberation. In a car application, the level of the noise can be quite severe, and may include wind noise. Also, when there are several talkers in the room or car, the handsfree phone will typically pick up all of the talkers. This may not always be desirable. For example, in a car it may be desirable to pick up only the driver's voice. A directional microphone can be used, or the microphone can be placed closer to the talker. However, this may not always be practical or desirable, and in most cases will not sufficiently reduce the noise and reverberation. Another potential problem with handsfree phones is that echo and howling can occur when the sound from the loudspeaker is picked up by the microphone. To address these problems an improved method is required for isolating the desired talker's voice while significantly attenuating all other sounds.
Speech signals are frequently processed in many ways. For example, in cell phones the speech signal is processed by a sophisticated codec in order to compress the amount of data being transmitted and received over the phone network. Similarly, in VoIP (voice over Internet protocol) applications, speech signals are also compressed by a codec in order to be transmitted over the Internet. In order to maximize the amount of compression while maintaining acceptable audio quality, special codecs are used that are highly tuned to the properties of speech. These codecs work best when the speech signal is relatively free from noise and reverberation. Similarly, the performance of speech recognition (speech-to-text) systems and voice recognition systems (for security purposes) often deteriorates quickly in the presence of background noise and reverberation. These systems are often used in conjunction with a desktop or laptop computer, which can itself be the source of significant noise. To help alleviate these problems, users are often forced to find some way of placing the microphone very close to their mouth. This may not be convenient in many situations, and in highly noisy or reverberant environments it may still be inadequate, so the speech processing system may not work as well as intended. In numerous applications, a method is needed to remove unwanted noises and reverberation in order to clean up the speech signal prior to some further processing.
In karaoke applications, the user sings along to a recording of the instrumental version of the song. Processing is often applied to the singer's voice in order to improve its quality and to correct the singer's pitch. To operate correctly, these processors rely upon having a clean version of the singer's voice. Any leakage of the recorded instruments into the microphone can cause the voice processor to incorrectly process the singer's voice. A directional microphone can be used to help reduce this leakage, but its performance is often inadequate. A better method of capturing the singer's voice while rejecting the recorded instruments is required.
Public address (PA) systems are used to amplify sounds for an audience. PA systems are used in a broad range of applications including churches, live music, karaoke, and all forms of public gatherings. A PA system works by picking up the desired sound with a microphone and then amplifying that sound through loudspeakers. A common problem with PA systems occurs when the amplified sound is picked up by the microphone and then further amplified. This can cause the PA system to become unstable, resulting in very disturbing howling. This problem can be reduced to a certain extent by using traditional directional microphones such as a cardioid microphone. However, this may not work in many cases due to the relative placement of the microphone and the loudspeakers. Therefore, the reduction in howling due to a traditional directional microphone is not adequate in many situations. It is highly desirable to have a microphone system that could effectively eliminate howling in all situations.
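The instability described above can be illustrated with a very simple model in which the microphone signal is the source plus a delayed, scaled copy of its own earlier output. This is a sketch under assumptions, not a model of any real PA system: the function name, the single fixed delay, and the single frequency-independent loop gain are simplifications made for the example. In this model, a loop gain below 1 lets the recirculating sound die away, while a loop gain above 1 makes it grow without bound, which is heard as howling.

```python
def simulate_feedback_loop(loop_gain, delay_samples=10, num_samples=500):
    """Microphone signal for a single impulse at t = 0 with acoustic feedback.

    loop_gain combines amplifier gain and the loudspeaker-to-microphone
    path loss (assumed constant across frequency for simplicity).
    """
    mic = [0.0] * num_samples
    for t in range(num_samples):
        source = 1.0 if t == 0 else 0.0
        feedback = loop_gain * mic[t - delay_samples] if t >= delay_samples else 0.0
        mic[t] = source + feedback
    return mic


stable = simulate_feedback_loop(loop_gain=0.8)
unstable = simulate_feedback_loop(loop_gain=1.2)

# Peak level over the final 100 samples of each simulation.
stable_peak = max(abs(v) for v in stable[-100:])
unstable_peak = max(abs(v) for v in unstable[-100:])
```

A directional microphone lowers the loop gain only for loudspeaker directions that fall near its null, which is why its benefit depends on the relative placement of the microphone and the loudspeakers.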
When making musical recordings of singers and acoustical instruments, traditional directional microphones are frequently used in order to emphasize certain parts of a sound field, suppress certain other parts of a sound field, or control the amount of reverberation that is picked up. This approach is limited since the relative amounts of emphasis, suppression, and reverberation cannot be arbitrarily controlled simultaneously. In general there is a desire to have a microphone system that can arbitrarily emphasize certain parts of a sound field while simultaneously suppressing other parts.
Traditional directional microphones permit sound sources located at specific angles to be suppressed, but they do not do well at separating sound sources that are nearby from those that are farther away. In many of the applications described above it would be extremely beneficial to be able to distinguish between sound sources based on their position and distance with respect to the microphone. Moreover, traditional directional microphones work better at removing a particular sound source than at extracting and isolating a given source from within a mixture of sounds. In general, there is a need to be able to isolate and separate sound sources into different signal streams based on their direction and distance. The individual signal streams could then be altered and recombined as desired in order to meet the specific requirements of a given application.