1. Field of the Invention
The present invention relates generally to the field of communication devices and, more specifically, to speakerphones.
2. Description of the Related Art
Speakerphones are used in many types of telephone calls, and particularly are used in conference calls where multiple people are located in a single room. Speakerphones may have more than one microphone to pick up voices of in-room participants, and at least one speaker to audibly present voices from offsite participants. While speakerphones may allow several people to participate in a conference call on each end of the conference call, speakerphones may have several disadvantages.
For example, sound from the speaker on the speakerphone may be detected by the microphones on the speakerphone. This feedback path may result in uncontrolled oscillation. Some speakerphones may incorporate an acoustic echo cancellation (AEC) algorithm to reduce the closed-loop gain of the combined microphone/speaker system to less than one and thus reduce the likelihood of uncontrolled oscillation. For example, acoustic echo cancellation may include filters such as a Generalized Sidelobe Canceller (GSC) structure. Such a filter may model the fixed composite frequency response of a speaker, including a direct coupling path between the speaker and a microphone. Other AEC filter implementations may model time-varying characteristics of an indirect coupling path between the speaker and microphone. These and other filters may be used to cancel acoustic echo and prevent uncontrolled oscillation.
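As an illustration only, the following sketch shows one common textbook way an adaptive filter may estimate and subtract the speaker-to-microphone echo. It uses a normalized least-mean-squares (NLMS) update, which is not necessarily the filter structure referred to above. The function names, tap count, step size, and the four-tap coupling path are hypothetical values chosen for the example.

```python
import random


def simulate_echo(far_end, path):
    """Convolve the far-end signal with a hypothetical speaker-to-microphone path."""
    return [
        sum(path[k] * far_end[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(far_end))
    ]


def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo from the microphone signal.

    Returns the residual signal and the final filter weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate
        # Normalizing by the input energy keeps the update stable
        # regardless of the speaker signal level.
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Assumed test signal: white noise played through an assumed coupling path.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_path = [0.6, 0.3, -0.2, 0.1]
mic = simulate_echo(far_end, true_path)

residual, weights = nlms_echo_canceller(far_end, mic)
early_energy = sum(e * e for e in residual[:100])
late_energy = sum(e * e for e in residual[-100:])
```

In this idealized, noise-free case the residual energy in the last 100 samples may be expected to be far smaller than in the first 100, which also illustrates that such a filter needs a number of samples to settle.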
These methods may require a minimum amount of settling time. This may be a problem, especially if sound sources (e.g., people) or microphones move around relative to the system. The finite settling time may also be a problem in multi-channel audio output systems where a signal for each channel is allowed to pan between multiple loudspeakers.
In addition, these AEC algorithms operate with non-zero buffer sizes, which imply a time delay between when the signal is acquired by the microphone and when a correction signal may be applied to the speaker output. If an AEC algorithm is operating at least partially in the frequency domain, the spacing between resolvable frequencies may be inversely proportional to the amount of data in the input buffer (and thus inversely proportional to the time delay), so that a longer buffer gives finer frequency resolution at the cost of a longer delay. If the time delay is sufficiently long, it may be impossible to control the acoustic echo because the acoustic delay may be shorter than the overall loop delay in the AEC (e.g., a control system group delay). If the buffer size is too small, the frequency resolution may be too coarse to permit the AEC to operate effectively. While the system's sampling rate may be increased in order to increase the number of audio samples available per second, this approach may significantly increase computational overhead.
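The trade-off between buffer delay and frequency resolution may be made concrete with a short calculation. The sketch below assumes a block-based transform whose bin spacing equals the sampling rate divided by the buffer length. The 16 kHz sampling rate, 256-sample buffer, 10 cm speaker-to-microphone spacing, and 343 m/s speed of sound are hypothetical example values, not figures from this description.

```python
def buffer_tradeoff(sample_rate_hz, buffer_samples):
    """Return (buffer delay in ms, frequency bin spacing in Hz)."""
    delay_ms = 1000.0 * buffer_samples / sample_rate_hz
    bin_width_hz = sample_rate_hz / buffer_samples
    return delay_ms, bin_width_hz


def acoustic_delay_ms(distance_m, speed_of_sound_m_s=343.0):
    """Time for sound to travel from the speaker to the microphone."""
    return 1000.0 * distance_m / speed_of_sound_m_s


# Assumed example: 16 kHz sampling with a 256-sample buffer.
delay_ms, bin_width_hz = buffer_tradeoff(16000, 256)

# Assumed example: speaker and microphone 10 cm apart.
direct_path_ms = acoustic_delay_ms(0.10)
```

Under these assumptions the buffer delay is 16 ms with a bin spacing of 62.5 Hz, while the direct acoustic path takes well under 1 ms. Halving the buffer halves the delay but doubles the bin spacing, which is the tension described above.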
Certain AEC algorithms may require a minimum physical spacing between the speaker and the microphone system in a full-duplex speakerphone system. The minimum distance may be dependent on the available computational resources, but may also be influenced by factors such as open-loop gain in the signal path between the speaker and the microphone system. The open-loop gain may be affected by the acoustic characteristics of the system. These characteristics may include the frequency response and directional pattern of the microphones and speakers. In particular, transducers (e.g., microphones and speakers) may not have a completely frequency-independent directional pattern. For example, a cardioid microphone is a unidirectional microphone with a null pointed directly opposite its primary pickup direction (i.e., 180 degrees away from the front of the microphone). However, this null may only be effective over a limited frequency range. The directional pattern may degrade to an omnidirectional pattern at frequencies below approximately 200 Hz and above approximately 8 kHz.
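The ideal cardioid pattern mentioned above may be illustrated with the standard first-order directivity model, in which the response is a weighted sum of an omnidirectional (pressure) term and a cosine (pressure-gradient) term. The function name and the `pressure_weight` parameter are hypothetical, and the model is frequency-independent, so it does not capture the degradation at low and high frequencies described above.

```python
import math


def first_order_pattern(theta_deg, pressure_weight):
    """Ideal first-order response at angle theta (degrees from the front).

    pressure_weight = 1.0 gives an omnidirectional pattern,
    pressure_weight = 0.5 gives a cardioid,
    pressure_weight = 0.0 gives a figure-of-eight pattern.
    """
    theta = math.radians(theta_deg)
    return pressure_weight + (1.0 - pressure_weight) * math.cos(theta)


cardioid_front = first_order_pattern(0.0, 0.5)
cardioid_rear = first_order_pattern(180.0, 0.5)
omni_rear = first_order_pattern(180.0, 1.0)
figure_eight_side = first_order_pattern(90.0, 0.0)
```

In this idealized model the cardioid has unity response at the front and a null at 180 degrees, the omnidirectional pattern has unity response at every angle, and the figure-of-eight pattern has its nulls at the sides.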
Regarding the speaker, the physical structure used to maintain directional response is typically larger than that required to generate the same response pattern for a microphone, due to the amount of air which must be moved in order to generate a sound that is audible from some distance away. In addition, constraining a loudspeaker directional pattern may have an adverse effect on the perceived sound quality for the desired output (i.e., the speaker may sound different depending on the directional angle between the user and the front of the loudspeaker). If the microphone system has a unidirectional response pattern, there may be some angles where the pickup of external sounds will exhibit different frequency response characteristics than others, which may provide a non-uniform coloration to the sound pickup in the so-called “off-axis” directions. From a computational perspective, it is much easier to generate a directional “beam” from an array of microphones which exhibit very little difference in their directional pattern over a wide frequency range than to accomplish the same effect with a set of unidirectional microphones which have a non-uniform directional pattern with respect to frequency.
Typically, only those transducers which exhibit purely pressure (omnidirectional) or purely pressure-gradient (so-called “bidirectional” or “figure of eight”) response will maintain a uniform response pattern over wide frequency ranges. However, a purely pressure-gradient microphone will exhibit a very different frequency-dependent sensitivity in the near field than it does in the far field (the so-called “proximity effect”), so it can sound quite different in those two cases. Similarly, a purely pressure-gradient speaker system can suffer from a weak low-frequency response, depending on the size of the baffle (the surface which separates the front of the speaker driver from the rear).
Purely pressure (omnidirectional) transducer systems do not suffer these particular disadvantages. Thus, an omnidirectional pattern may be the most effective (and natural sounding) for both the microphone and the speaker system, since an omnidirectional pattern may be the closest to frequency-independent operation. Many transducers may be naturally omnidirectional except at frequencies where the size of the radiator is much larger than the wavelength. For microphones this may not present a problem except at the highest frequencies. For speakers, however, it may be more challenging to produce a transducer that is small in size (for wide dispersion) and able to displace enough air to generate an acceptable acoustic output. While omnidirectional patterns for speakers and microphones may be desirable, such patterns may be challenging in a speaker/microphone coupling, since this configuration is the one which presents the largest closed-loop gain in an otherwise uncorrected microphone-speaker system.
Another concern with such microphone-speaker systems is in how to best approximate the signal presented at the microphone input by the output of the speaker. In the traditional AEC-corrected system, this microphone-speaker transfer function is approximated internally by the AEC algorithm. However, the non-linearity of the system cannot be completely corrected by a linear model. Thus, the AEC algorithm performance (the amount of closed-loop gain that it can effectively cancel) can suffer if the speaker system produces a highly non-linear output from a given input. Since speaker systems are more linear at lower excursions (volume levels), this can limit the maximum output level that the speaker can produce without incurring unwanted feedback.
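The limitation of a linear model may be illustrated numerically. The sketch below assumes, purely for illustration, that the speaker behaves like a `tanh` soft-clipping function, fits the best single linear gain to its output, and reports the fraction of output energy that the linear fit cannot explain. The `tanh` model, the function name, and the drive amplitudes are hypothetical and do not describe any particular speaker.

```python
import math


def linear_fit_residual_ratio(amplitude, num_samples=1000):
    """Fraction of output energy left after the best linear-gain fit.

    The input is one cycle of a sine wave at the given amplitude; the
    output is a hypothetical tanh soft-clipping model of the speaker.
    """
    x = [amplitude * math.sin(2.0 * math.pi * n / num_samples)
         for n in range(num_samples)]
    y = [math.tanh(v) for v in x]
    # Least-squares gain for the model y ~ gain * x.
    gain = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    residual_energy = sum((b - gain * a) ** 2 for a, b in zip(x, y))
    return residual_energy / sum(b * b for b in y)


low_level_ratio = linear_fit_residual_ratio(0.1)   # small excursion
high_level_ratio = linear_fit_residual_ratio(2.0)  # large excursion
```

Under this assumed model the unexplained energy is negligible at the low drive level and much larger at the high drive level, consistent with the observation that a linear canceller may perform worse as the speaker is driven harder.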
Another problem that can be associated with a closely coupled microphone-speaker system is that the speaker is typically much closer to the microphone (in the sense of spatial distance) than the sound sources (conference participants) present in the outside room. This, coupled with the inverse-square radiation law, means that the speaker produces a much more intense sound field at the microphone input than an equally loud sound source located in the surrounding environment. However, it is desirable for the microphone system to be able to respond to a wide range of loudness levels from external sound sources, so in the case where such an external source is quiet, the microphone system sensitivity may need to be quite high. This typically involves a large gain in the microphone preamplifier stage of the system. However, in that case, the high gain of the system may cause unwanted overloading of the microphone preamplifier stage due to the excessively strong response in this microphone system from the output of the speaker system.
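The effect of the inverse-square law may be quantified with a short calculation. The sketch below assumes idealized point sources of equal output radiating into free space, so that sound pressure falls by about 6 dB for each doubling of distance. The function name and the example distances (a speaker 10 cm from the microphone and a participant 2 m away) are hypothetical.

```python
import math


def level_difference_db(near_distance_m, far_distance_m):
    """Level of the nearer source relative to the farther one, in dB.

    Assumes two equally loud point sources in free field, where
    intensity falls off with the square of the distance.
    """
    return 20.0 * math.log10(far_distance_m / near_distance_m)


# Assumed example: speaker at 0.1 m, conference participant at 2 m.
speaker_advantage_db = level_difference_db(0.1, 2.0)
```

With these assumed distances the speaker arrives at the microphone roughly 26 dB louder than the participant, which indicates how much headroom the preamplifier stage may need in order to handle both signals.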
Powering a speakerphone may also present several problems. For example, power cords to the speakerphone may be cumbersome. In addition, batteries may be expensive and short-lived, which could cause problems in long conference calls.