Humans process sounds received at two ears, which function similarly to microphones. The sounds are processed to distinguish different sound sources, both by the type of each sound source and by its location. The human brain performs this processing, which may be referred to as binaural processing, in an efficient and effective manner.
Computer science, on the other hand, has not yet mastered binaural processing. Instead of two ears, computational techniques rely on many more microphones to estimate the location of a sound source. When multiple sound sources are present, the problem becomes increasingly difficult. In addition, where a loudspeaker is present, as is likely the case in videoconferencing environments, the loudspeaker is often located very close to the microphones.
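One common way such microphone-based localization works is to estimate the time difference of arrival (TDOA) of a sound between microphone pairs via cross-correlation. The sketch below is illustrative only and is not described in the text above; the sample rate, delay, and signal are hypothetical values chosen to show the technique.

```python
import numpy as np

def estimate_tdoa(sig_a, sig_b, sample_rate):
    """Estimate the time difference of arrival (TDOA), in seconds, of
    sig_b relative to sig_a by locating the cross-correlation peak."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    # Peak index maps to the lag (in samples) of sig_b behind sig_a.
    lag = (len(sig_b) - 1) - np.argmax(corr)
    return lag / sample_rate

# Hypothetical setup: the same source signal reaches the second
# microphone 20 samples later than the first.
rate = 16000
delay = 20
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
mic_a = np.concatenate([source, np.zeros(delay)])
mic_b = np.concatenate([np.zeros(delay), source])

tdoa = estimate_tdoa(mic_a, mic_b, rate)
print(round(tdoa * rate))  # recovered delay in samples
```

With more than two microphones, TDOA estimates from several pairs can be combined to triangulate a source position, which is why arrays with many microphones are used in practice.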
The loudspeaker is a constant noise source because it plays room noise and comfort noise from the far-end participants. This noise may not be audible to the local participants because they are likely located a couple of meters or more from the constant noise source. The microphones, however, may be only a few centimeters from it. Thus, the constant noise source may significantly impact calculations that estimate the location of the local sound source (i.e., the local participants) based on the microphone outputs.
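The severity of this proximity effect can be seen with a back-of-the-envelope free-field calculation: sound pressure falls off roughly as 1/r, so a source a few centimeters from the microphones enjoys a large level advantage over one a couple of meters away. The 5 cm and 2 m distances below are illustrative, chosen to match the rough distances mentioned above.

```python
import math

def relative_level_db(d_near_m, d_far_m):
    """Free-field level advantage, in dB, of a source at d_near_m over
    an equally loud source at d_far_m, assuming 1/r pressure decay."""
    return 20 * math.log10(d_far_m / d_near_m)

# Hypothetical geometry: loudspeaker 5 cm from the microphones,
# local talker 2 m away.
advantage_db = relative_level_db(0.05, 2.0)
print(round(advantage_db, 1))  # → 32.0
```

A roughly 32 dB advantage means the nearby loudspeaker can dominate the microphone signals even when its noise is inaudible to participants across the room, skewing any localization estimate that does not account for it.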