1. Field of the Invention
The present invention relates generally to audio communication systems, and more specifically, to techniques for separating speech from ambient acoustic noise.
2. Background of the Invention
The problem of separating the speech of one or more persons speaking in a room or other environment is central to the design and operation of systems such as hands-free telephone systems, speakerphones and other teleconferencing systems. Further, the separation of speech from other sounds in an ambient acoustic environment, such as noise, reverberation and other undesirable sounds such as other speakers, can be usefully applied in other non-duplex communication or non-communication environments such as digital dictation devices, computer voice command systems, hearing aids and other applications in which reduction of sounds other than desired speech provides an improvement in performance.
Processing systems that separate desired speech from undesirable background sounds and noise may use a single microphone, or two or more microphones forming a microphone array. In single-microphone applications, the processing typically relies entirely on source-attribute filtering algorithms that attempt to isolate the speech (source) algorithmically, for example computational auditory scene analysis (CASA). In some implementations, two or more microphones have been used to estimate the direction of desired speech. These algorithms rely on separating sounds received by the one or more microphones into types of sounds, and in general are concerned with filtering the background sound and noise from the received information.
However, when practical, a microphone array can be used to provide information about the relative strength and arrival times of sounds at different locations in the acoustic environment, including the desired speech. The algorithm that receives input from the microphone array is typically a beam-forming processing algorithm in which a directivity pattern, or beam, is formed through the frequency band of interest to reject sounds emanating from directions other than that of the speaker whose speech is being captured. Since the speaker may be moving within the room or other environment, the direction of the beam is adjusted periodically to track the location of the speaker.
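The beam-forming operation described above can be illustrated with a minimal delay-and-sum sketch in Python with NumPy. It is not taken from any particular system: the far-field plane-wave model, the straight-line array geometry, the speed-of-sound constant and the function names `steering_delays`, `delay_and_sum` and `estimate_direction` are all assumptions made for illustration only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_delays(mic_positions, angle_rad, c=SPEED_OF_SOUND):
    """Assumed model: a far-field plane wave from angle_rad (measured from
    array broadside) reaches a microphone at position x (meters, along a
    straight line) with a delay of x * sin(angle) / c seconds."""
    return np.asarray(mic_positions, dtype=float) * np.sin(angle_rad) / c


def delay_and_sum(signals, mic_positions, angle_rad, fs):
    """Form a beam toward angle_rad by undoing each channel's modeled arrival
    delay and averaging the channels. signals: (num_mics, num_samples)."""
    num_samples = signals.shape[1]
    delays = steering_delays(mic_positions, angle_rad)
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    # Advancing a channel by tau is a phase ramp exp(+j*2*pi*f*tau).
    # Done via the FFT, the shift is circular over the block.
    aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(aligned.mean(axis=0), n=num_samples)


def estimate_direction(signals, mic_positions, candidate_angles, fs):
    """Pick the candidate angle whose beam output has the highest power."""
    powers = [np.mean(delay_and_sum(signals, mic_positions, a, fs) ** 2)
              for a in candidate_angles]
    return candidate_angles[int(np.argmax(powers))]
```

Sounds arriving from the steered direction add coherently after alignment, while sounds from other directions add with mismatched phases and are attenuated. Re-running a search such as `estimate_direction` on successive blocks of samples is one simple way the periodic beam adjustment could be realized.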
Beam-forming speech processing systems also typically apply post-filtering algorithms to further suppress background sounds and noise that are still present at the output of the beam-former. However, until recently, the source-attribute processing techniques were not used in beam-forming speech processing systems. The typical filtering algorithms employed are fast Fourier transform (FFT) algorithms that attempt to isolate the speech from the background, and these algorithms have relatively high latency for a given signal processing capacity.
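As a rough illustration of why FFT-based post-filtering carries latency, the following Python sketch applies a frame-based spectral-subtraction gain to a beam-former output. Spectral subtraction is used here only as an assumed, commonly cited example of an FFT post-filter; the function names, the frame length, the periodic Hann window, the 50% overlap and the gain floor are all hypothetical choices, not details of any particular system.

```python
import numpy as np


def spectral_subtraction(noisy, noise_mag, frame_len=256, floor=0.05):
    """Assumed example post-filter: per-frame magnitude spectral subtraction
    with overlap-add. noise_mag is a per-bin noise magnitude estimate of
    shape (frame_len // 2 + 1,) for windowed frames."""
    hop = frame_len // 2
    # Periodic Hann window: copies offset by frame_len / 2 sum to exactly 1,
    # so overlap-add reconstructs the input wherever two frames overlap.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        frame = noisy[start:start + frame_len] * window
        spec = np.fft.rfft(frame)
        mag = np.abs(spec)
        # Attenuate bins where the noise estimate dominates, down to a floor.
        gain = np.maximum(1.0 - noise_mag / np.maximum(mag, 1e-12), floor)
        out[start:start + frame_len] += np.fft.irfft(spec * gain, n=frame_len)
    return out


def buffering_latency_ms(frame_len, fs):
    """A full frame must be collected before its FFT can be computed, so the
    buffering delay is one frame, ignoring computation time."""
    return 1000.0 * frame_len / fs
```

Under these assumptions, a 256-sample frame at a 16 kHz sampling rate implies 16 ms of buffering delay before any filtered output is available. Shorter frames reduce that delay but coarsen the frequency resolution available to the filter.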
Since source-attribute filtering techniques such as CASA rely on detecting and determining types of the various sounds in the environment, inclusion of a beam-former having a beam directed only at the source runs counter to the detection concept. For the above reason, combined source-attribute filtering and location-based techniques typically use a wideband multi-angle beam-former that separates the scene being analyzed by angular location, but still permits analysis of the entire ambient acoustic environment. The wideband multi-angle beam-formers employed do not attempt to cancel all signals other than the direct signal from the speech source, as a narrow-beam beam-former would, and therefore lose some signal-to-noise ratio improvement by not providing the highest possible selectivity through the directivity of a single primary beam.
Therefore, it would be desirable to provide improved techniques for separating speech from other sounds and noise in an acoustic environment. It would further be desirable to combine source-attribute filtering with narrow-band, source-tracking beam-forming to obtain the benefits of both. It would further be desirable to provide such techniques with a relatively low latency.