Any discussion of the background art throughout the specification should in no way be considered as an admission that such art is widely known or forms part of common general knowledge in the field.
In voice conferencing applications with multiple talkers talking around a single endpoint, the ability to capture the voice of an individual is important to the intelligibility and quality of the conferencing experience.
In an example arrangement, as illustrated in FIG. 1, an audio conference 1 is carried out, where a series (e.g., set) of participants 2, 3, 4 are positioned around a Conferencing Audio Input/Output Device 6. The Device 6 is interconnected to a networking environment 7 for the transmission of the audio conversation.
Typically, the conferencing audio input/output device 6 includes one or more microphones, e.g., 9. Where multiple microphones (e.g., an array of microphones) are provided, there exist opportunities for improving the voice capture through beamforming or beamsteering of the microphones.
Beamforming is the process by which a signal or signals are captured by multiple microphones and, in order to capture the best quality signal for a given source or sound of interest, some linear combination of the microphone signals is selected in order to maximize the signal-to-noise ratio. Traditionally, beamforming aims to optimize for a current talker. It virtually steers a directional beam towards the most salient talker at a particular instant in time in the hope that this will improve the quality and clarity of pickup. In voice beamforming applications, this is typically achieved by looking for the direction which contains the most energy.
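By way of illustration only, the energy-based steering described above can be sketched as follows. The sketch assumes a delay-and-sum beamformer with integer sample delays, which is just one possible linear combination of the microphone signals; the function names, the two-microphone setup, and the candidate delay sets are hypothetical and are not taken from any particular device.

```python
import random


def delay_and_sum(channels, delays):
    """Average the channels after advancing each one by an integer sample delay.

    Assumption: delay-and-sum is used here as one simple example of a
    linear combination of microphone signals.
    """
    n = len(channels[0])
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay
            if 0 <= j < n:  # samples shifted outside the frame count as zero
                out[i] += channel[j]
    return [value / len(channels) for value in out]


def energy(signal):
    """Sum of squared samples."""
    return sum(value * value for value in signal)


def steer_to_max_energy(channels, candidate_delays):
    """Return (index of the candidate with the most output energy, all energies)."""
    energies = [energy(delay_and_sum(channels, d)) for d in candidate_delays]
    best = max(range(len(energies)), key=energies.__getitem__)
    return best, energies


# Hypothetical test signal: a noise-like source that reaches the second
# microphone three samples later than the first.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]
lag = 3
mic_a = source
mic_b = [0.0] * lag + source[:-lag]

# Each candidate is one "look direction", expressed as per-microphone delays.
candidates = [(0, 0), (0, 3), (0, -3)]
best, energies = steer_to_max_energy([mic_a, mic_b], candidates)
```

In this sketch the candidate (0, 3) re-aligns the two microphone signals so that they add coherently, and it is therefore the candidate selected as containing the most energy. A practical system would typically use fractional delays or frequency-domain weights rather than integer shifts.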
Since instantaneous estimates (or estimates from small frames of speech) are typically noisy, the estimate can be smoothed with a low-pass filter to stabilize it.
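As a minimal sketch of such smoothing, a one-pole low-pass filter (exponential smoothing) can be applied to a sequence of per-frame estimates. The function name and the smoothing coefficient `alpha` are hypothetical, and the sketch assumes a scalar estimate that does not wrap around, such as the energy measured for one candidate direction; an angle near a wrap-around point would need additional handling.

```python
def smooth_estimates(estimates, alpha=0.1):
    """One-pole low-pass filter: y[n] = (1 - alpha) * y[n-1] + alpha * x[n].

    Assumption: `alpha` lies in (0, 1]; smaller values smooth more heavily
    but react more slowly to a genuine change.
    """
    smoothed = []
    state = None
    for x in estimates:
        # Initialise the filter state with the first observation.
        state = x if state is None else (1.0 - alpha) * state + alpha * x
        smoothed.append(state)
    return smoothed


# Hypothetical noisy per-frame estimates that jump from about 0 to about 1.
noisy = [0.1, -0.1, 0.05, 1.1, 0.9, 1.05, 0.95]
stable = smooth_estimates(noisy, alpha=0.5)
```

The choice of `alpha` reflects a trade-off: heavier smoothing gives a more stable estimate, but lengthens the time needed to follow a change of talker.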
While beamforming offers benefits in single-talker pickup, the reality is that the majority of conferences contain multiple talkers who talk occasionally and sometimes simultaneously. This greatly impacts the quality of the beamformed signal, often resulting in a person being relatively inaudible for brief periods of time until the beamformer determines a correct course of action.
Beamforming can be seen to have two benefits when considered in a room or reverberant environment. One aspect of beamforming is to improve the isolation of the desired sound from undesired audio and noise coming from other directions. The beam selection process focuses on the desired sound object using the linear combination of the microphones suited to the pattern of signal response that object creates at the microphones.
In addition to noise, a critical problem in rooms and internal spaces is reverberation. This is effectively a later arrival of sound at the microphone from a wide range of directions. In such a situation, there is a direction that can be identified for the early sound energy, and steering a beam in this direction is advantageous as the diffuse reverberant energy is decreased. The ideas behind beamforming for selective source capture and dereverberation are generally known in the art.