Voice communication devices such as cell phones, wireless phones, and other wireless devices have become ubiquitous; they show up in almost every environment. These systems and devices and their associated communication methods are referred to by a variety of names, including but not limited to cellular telephones, cell phones, mobile phones, wireless telephones, and devices such as Personal Data Assistants (PDAs) that include a wireless or cellular telephone communication capability. Such devices are used at home, in the office, inside a car or train, at the airport, at the beach, in restaurants and bars, on the street, and in almost any other location. As is to be expected, such diverse environments have relatively higher or lower levels of background, ambient, or environmental noise. For example, there is generally less noise in a quiet home than in a crowded bar or nightclub. If ambient noise at sufficient levels is picked up by a microphone, the intended voice communication degrades and, though possibly unknown to the users of the communication device, consumes more bandwidth or network capacity than is necessary, especially during non-speech segments of a two-way conversation when a user is not speaking.
A cellular network is a radio network made up of a number of radio cells (sometimes referred to as “cells”), each served by a fixed transmitter commonly known as a base station. The cells cover different geographical areas in order to provide coverage over a wider geographical area than that of a single cell. Cellular networks are inherently asymmetric, with a set of fixed main transceivers each serving a cell and a set of distributed (generally, but not always, mobile) transceivers which provide services to the network's users.
The primary requirement for a cellular network is that each of the distributed stations must be able to distinguish signals from its own transmitter from signals from other transmitters. There are two common solutions to this requirement: Frequency Division Multiple Access (FDMA) and Code Division Multiple Access (CDMA). FDMA works by using a different frequency for each neighboring cell. By tuning to the frequency of a chosen cell, the distributed stations can avoid the signals from other neighbors. The principle of CDMA is more complex, but achieves the same result; the distributed transceivers can select one cell and listen to it. Other available methods of multiplexing such as Polarization Division Multiple Access (PDMA) and Time Division Multiple Access (TDMA) cannot be used to separate the signals of one cell from those of another, since the effects of both vary with position, which makes signal separation practically impossible. Orthogonal Frequency Division Multiplexing (OFDM), in principle, consists of frequencies orthogonal to each other. TDMA, however, is used in combination with either FDMA or CDMA in a number of systems to give multiple channels within the coverage area of a single cell.
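The code-division idea described above can be illustrated with a toy sketch. It assumes short Walsh-Hadamard codes, ideal chip synchronization, and a noiseless channel; the function names are hypothetical, and no particular standard assigns codes in exactly this way. Two transmitters share the same channel, and a receiver recovers either one by correlating against the chosen code.

```python
def walsh_codes(order):
    """Build 2**order mutually orthogonal +/-1 Walsh-Hadamard codes."""
    codes = [[1]]
    for _ in range(order):
        # Each row r yields two rows: (r, r) and (r, -r).
        codes = ([row + row for row in codes]
                 + [row + [-c for c in row] for row in codes])
    return codes


def spread(bits, code):
    """Replace every +/-1 data bit with the code multiplied by that bit."""
    return [b * c for b in bits for c in code]


def despread(chips, code):
    """Correlate each code-length block with the code and take the sign."""
    n = len(code)
    bits = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], code))
        bits.append(1 if corr > 0 else -1)
    return bits


codes = walsh_codes(2)          # four codes of length four
bits_a = [1, -1, 1]             # data from transmitter A (illustrative)
bits_b = [-1, -1, 1]            # data from transmitter B (illustrative)

# Both transmitters occupy the same band at the same time; signals add.
channel = [a + b for a, b in zip(spread(bits_a, codes[1]),
                                 spread(bits_b, codes[2]))]

recovered_a = despread(channel, codes[1])
recovered_b = despread(channel, codes[2])
```

Because the codes are orthogonal, the contribution of the unwanted transmitter sums to zero in each correlation, which is the sense in which a receiver can "select one cell and listen to it".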
Wireless communication includes, but is not limited to, two communication schemes: time based and code based. In the cellular mobile environment these techniques are known as TDMA (Time Division Multiple Access), which comprises, but is not limited to, the following standards: GSM, GPRS, EDGE, IS-136, PDC, and the like; and CDMA (Code Division Multiple Access), which comprises, but is not limited to, the following standards: CDMA One, IS-95A, IS-95B, CDMA 2000, CDMA 1×EvDv, CDMA 1×EvDo, WCDMA, UMTS, TD-CDMA, TD-SCDMA, OFDM, WiMax, WiFi, and others.
For the code division based standards or the orthogonal frequency division, as the number of subscribers grows and average minutes per month increase, more and more mobile calls typically originate and terminate in noisy environments. The background or ambient noise degrades the voice quality.
For the time based schemes, such as GSM, GPRS, and EDGE, improving the end user's signal-to-noise ratio (SNR) improves the listening experience for users of existing TDMA based networks. This is done by improving the received speech quality through background noise reduction or cancellation at the sending or transmitting device.
Significantly, in an ongoing cell phone call or other communication from an environment having relatively higher environmental noise, it is sometimes difficult for the party at the receiving end of the conversation to hear what the party in the noisy environment is saying. That is, the ambient or environmental noise often “drowns out” the cell phone user's voice, so that the other party cannot hear what is being said or, even if they can hear it with sufficient volume, cannot understand the speech. This problem may exist even when the conversation uses a high data rate on the communication network.
Attempts to solve this problem have largely been unsuccessful. Both single microphone and two microphone approaches have been attempted. For example, U.S. Pat. No. 6,415,034 to Hietanen et al. (the “Hietanen patent”) describes the use of a second background noise microphone located within an earphone unit or behind an ear capsule. Digital signal processing is used to create a noise canceling signal which enters the speech microphone. Unfortunately, the effectiveness of the method disclosed in the Hietanen patent is compromised by acoustical leakage, that is, where the ambient or environmental noise leaks past the ear capsule and into the speech microphone. The Hietanen patent also relies upon complex, power consuming, and expensive digital circuitry that may generally not be suitable for small portable battery powered devices such as pocket cellular telephones.
Another example is U.S. Pat. No. 5,969,838 (the “Paritsky patent”), which discloses a noise reduction system utilizing two fiber optic microphones that are placed side by side. Unfortunately, the Paritsky patent discloses a system using light guides and other relatively expensive and/or fragile components not suitable for the rigors of cell phones and other mobile devices. Neither Paritsky nor Hietanen addresses the need to increase capacity in cell phone-based communication systems.
U.S. Pat. No. 5,406,622 to Silverberg et al. (the “Silverberg patent”) uses two adaptive filters: one driven by the handset transmitter to subtract speech from a reference value to produce an enhanced reference signal, and a second driven by the enhanced reference signal to subtract noise from the transmitter. The Silverberg patent requires accurate detection of speech and non-speech regions. Any incorrect detection will degrade the performance of the system.
Previous approaches in noise cancellation have included passive expander circuits used in the electret-type telephonic microphone. These, however, suppress only low level noise occurring during periods when speech is not present. Passive noise-canceling microphones are also used to reduce background noise. These have a tendency to attenuate and distort the speech signal when the microphone is not in close proximity to the user's mouth; and further are typically effective only in a frequency range up to about 1 kHz.
Active noise-cancellation circuitry to reduce background noise has been suggested which employs a noise-detecting reference microphone and adaptive cancellation circuitry to generate a continuous replica of the background noise signal that is subtracted from the total background noise signal before it enters the network. Most such arrangements are still not effective. They are susceptible to cancellation degradation because of a lack of coherence between the noise signal received by the reference microphone and the noise signal impinging on the transmit microphone. Their performance also varies depending on the directionality of the noise; and they also tend to attenuate or distort the speech.
Thus, there is a need in the art for a method of noise reduction or cancellation that is robust, suitable for mobile use, and inexpensive to manufacture. The increased traffic in cellular telephone based communication systems has created a need in the art for means to provide a clear, high quality signal with a high signal-to-noise ratio. The requirements of a noise reduction system for speech enhancement include, but are not limited to, intelligibility and naturalness of the enhanced signal, improvement of the signal-to-noise ratio, short signal delay, and computational simplicity.
There are several methods for performing noise reduction, but all can be categorized as types of filtering. In the related art, speech and noise are mixed into one signal channel, where they reside in the same frequency band and may have similar correlation properties. Consequently, filtering will inevitably have an effect on both the speech signal and the background noise signal. Distinguishing between voice and background noise signals is a challenging task. Speech components may be perceived as noise components and may be suppressed or filtered along with the noise components.
Even with the availability of modern signal-processing techniques, a study of single-channel systems shows that significant improvements in SNR are not obtained using a single channel or a one microphone approach. Surprisingly, most noise reduction techniques use a single microphone system and suffer from the shortcoming discussed above.
One way to overcome the limitations of a single microphone system is to use multiple microphones where one microphone may be closer to the speech signal than the other microphone. Exploiting the spatial information available from multiple microphones has led to substantial improvements in voice clarity or SNR in multi-channel systems. However, the current multi-channel systems use separate front-end circuitry for each microphone, and thus increase hardware expense and power consumption.
Hence, there is room in the art for new means and methods of increasing SNR in hand-held devices that capture sound with multiple microphones but use the circuitry or hardware of a single channel system. Adaptive noise cancellation is one such powerful speech enhancement technique. It is based on the availability of an auxiliary channel, known as the reference path, where a correlated sample or reference of the contaminating noise is present. This reference input is filtered by an adaptive algorithm, and the output of this filtering process is subtracted from the main path, where noisy speech is present.
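The adaptive noise cancellation structure just described can be sketched as follows. This is a minimal illustration that assumes a least-mean-squares (LMS) update, which the text does not specify; the function names, tap count, step size, test signals, and two-tap acoustic path are all hypothetical choices made for the example. The reference input is filtered, the filter output is subtracted from the primary (main path) input, and the difference both forms the enhanced output and drives the weight update.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=8, step=0.01):
    """Subtract an adaptively filtered reference from the primary input.

    Returns the error signal, which serves as the enhanced output.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps      # recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - noise_estimate      # main path minus filtered reference
        # LMS update: move the weights along the instantaneous gradient.
        weights = [w + 2.0 * step * e * h for w, h in zip(weights, history)]
        output.append(e)
    return output


def demo(n=4000, seed=0):
    """Compare residual noise power before and after cancellation."""
    rng = random.Random(seed)
    speech = [0.5 * math.sin(2.0 * math.pi * 0.01 * k) for k in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumed acoustic path from the noise source to the primary microphone.
    leaked = [0.8 * noise[k] + 0.3 * (noise[k - 1] if k > 0 else 0.0)
              for k in range(n)]
    primary = [s + v for s, v in zip(speech, leaked)]
    cleaned = lms_noise_canceller(primary, noise)

    tail = n // 2                   # measure after the filter has adapted

    def residual_power(signal):
        diffs = [(a - b) ** 2 for a, b in zip(signal[tail:], speech[tail:])]
        return sum(diffs) / len(diffs)

    return residual_power(primary), residual_power(cleaned)


before, after = demo()
```

In this idealized setup the reference contains noise only, so the residual noise power after adaptation is much lower than before. When speech also leaks into the reference, part of the speech is cancelled as well, and the improvement shrinks accordingly.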
As with any system, two microphone systems also suffer from several shortfalls. The first shortfall is that, in certain instances, the available reference input to an adaptive noise canceller may contain low-level signal components in addition to the usual correlated and uncorrelated noise components. These signal components will cause some cancellation of the primary input signal. The maximum signal-to-noise ratio obtained at the output of such a noise cancellation system is equal to the noise-to-signal ratio present on the reference input.
The second shortfall is that, for a practical system, both microphones should be worn on the body. This reduces the extent to which the reference microphone can be used to pick up the noise signal alone. That is, the reference input will contain both signal and noise. Any decrease in the noise-to-signal ratio at the reference input will reduce the signal-to-noise ratio at the output of the system. The third shortfall is that an increase in the number of noise sources or in room reverberation will reduce the effectiveness of the noise reduction system.
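The bound stated above can be written compactly. The symbols below are introduced here for illustration and do not appear in the original text: SNR_out is the signal-to-noise ratio at the canceller output, and SNR_ref and NSR_ref are the signal-to-noise and noise-to-signal ratios at the reference input.

```latex
\[
  \mathrm{SNR}_{\mathrm{out}}^{\max}
  \;=\; \mathrm{NSR}_{\mathrm{ref}}
  \;=\; \frac{1}{\mathrm{SNR}_{\mathrm{ref}}},
  \qquad
  \mathrm{SNR}_{\mathrm{out}}^{\max}\,[\mathrm{dB}]
  \;=\; -\,\mathrm{SNR}_{\mathrm{ref}}\,[\mathrm{dB}].
\]
```

As a worked example under these assumptions, if speech leaks into the reference microphone at 20 dB below the noise (a reference SNR of -20 dB), the output SNR can be at most 20 dB. If the leakage rises to 10 dB below the noise, the bound drops to 10 dB.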