Any discussion of the background art throughout the specification should in no way be considered as an admission that such art is widely known or forms part of common general knowledge in the field.
Recently, the utilisation of audio conferencing systems has become increasingly popular. These systems are adapted to provide multi-party audio conferences in which many participants take part through interaction with an audio conferencing server.
Increasingly, such systems have been utilised in conjunction with an Internet VoIP-type environment to provide complex distributed audio conferencing facilities.
With any such audio conferencing system, a number of assumptions normally exist, including having a series of listeners or participants at geographically dispersed locations, with each participant having audio input facilities such as microphones or the like, in addition to audio output facilities such as speakers, headphones or the like for listening to other participants. The audio input and output devices are normally interconnected by means of an electronic audio signalling path and often, although not necessarily, a central server. The central server is responsible for managing the incoming audio from each of the endpoints and then creating a suitable mix or combination of audio streams to return to each endpoint. That mix will generally include all, or the most relevant, of the other incoming audio, excluding the audio generated by the particular endpoint for which the mix is created. An alternative to a server may be a system of distributed or allocated hosting of the logic and mixing, in order to achieve the same outcome of a suitable audio mix being sent to each client.
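By way of illustration only, the per-endpoint mix described above can be sketched as follows. This is a minimal sketch and not a description of any particular conferencing server: the function name `mix_minus`, the representation of each stream as a list of floating-point samples, and the assumption that all frames have equal length are hypothetical choices made for the example. It also includes every other endpoint in each mix, whereas a practical system may select only the most relevant streams.

```python
def mix_minus(frames):
    """Return, for each endpoint, the sum of all OTHER endpoints' audio.

    frames: dict mapping an endpoint id to a list of float samples.
    All lists are assumed (for this sketch) to have the same length.
    """
    if not frames:
        return {}

    length = len(next(iter(frames.values())))

    # Sum every incoming stream once.
    total = [0.0] * length
    for samples in frames.values():
        for i, sample in enumerate(samples):
            total[i] += sample

    # Each endpoint's return mix is the total minus its own contribution,
    # so that no endpoint is sent its own audio back.
    return {
        endpoint: [t - s for t, s in zip(total, samples)]
        for endpoint, samples in frames.items()
    }


if __name__ == "__main__":
    incoming = {
        "A": [1.0, 2.0],
        "B": [10.0, 20.0],
        "C": [100.0, 200.0],
    }
    for endpoint, mix in mix_minus(incoming).items():
        print(endpoint, mix)
```

Subtracting each endpoint's own samples from a single running total avoids recomputing a separate sum for every endpoint, although a real mixer would also need to handle gain control, clipping and stream selection.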
A general assumption of such systems is that each endpoint is acoustically isolated, and therefore there is no sense that any endpoint can hear or be heard by another endpoint by local acoustic path. This is typically satisfied in conference systems where users join the meeting from different geographic locations or even from separate rooms within the same office environment.
Unfortunately, such systems are prone to a number of problems when the assumption of participants being acoustically isolated is not met, such as when different participants join the conference from proximal cubicles in an open plan office. In these circumstances, there is a propensity for audio coupling between two or more endpoints, involving the local activity, the output audio of the conferencing system, or both. This can lead to various problems, including the proximal participants hearing multiple streams of the same or similar audio with different delays or latency. It is very difficult for a user to understand speech that consists of the direct, or intended, stream and one or more delayed copies of the same signal that overlap in time with the original. Very short delays, where the additional signals are at a significantly lower level than the original, can be tolerated, as is the case for reverberant signals. However, a particular problem arises when multiple participants, each with their own microphone and speaker facilities, are in close proximity to one another. For example, in such an arrangement, a first participant is likely to hear the direct acoustic emission from a closely spaced second conference participant, in addition to receiving the same audio, but delayed, via the audio server. In such conferencing systems, the delay between the direct and mediated audio is typically of the order of 100 ms to 500 ms, which is particularly problematic and distracting to the user.
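The overlap of direct and server-mediated audio described above can be illustrated with a simple model. This is a sketch under stated assumptions only: the function name `heard_at_listener`, the gain parameters and the single fixed delay are hypothetical, and the model ignores room acoustics, codec effects and delay variation.

```python
def heard_at_listener(speech, sample_rate_hz, delay_ms,
                      direct_gain=1.0, mediated_gain=1.0):
    """Model what a nearby listener hears: the direct acoustic signal
    plus a delayed copy of the same signal returned via the server.

    speech: list of float samples of the talker's speech.
    delay_ms: delay of the mediated path relative to the direct path.
    """
    delay_samples = int(round(sample_rate_hz * delay_ms / 1000.0))
    heard = [0.0] * (len(speech) + delay_samples)

    for i, sample in enumerate(speech):
        heard[i] += direct_gain * sample                    # direct acoustic path
        heard[i + delay_samples] += mediated_gain * sample  # path via the server
    return heard


if __name__ == "__main__":
    # An impulse at 1 kHz sampling with a 3 ms mediated delay.
    print(heard_at_listener([1.0, 0.0], 1000, 3))
```

At an assumed sampling rate of 8 kHz, a 100 ms delay corresponds to 800 samples and a 500 ms delay to 4000 samples, so the delayed copy falls well within ongoing speech rather than blending into the original in the manner of short reverberation.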
Four possible paths of secondary or duplicate audio that may cause problems are illustrated in FIG. 1 and described below:
Path 1: From the mouth of User A to the ears of User B
Path 2: From the mouth of User A to the microphone of User B
Path 3: From the speaker of User A to the ears of User B
Path 4: From the speaker of User A to the microphone of User B
All of the above paths can also be present in the return from User B to User A.
Such arrangements often lead to an unnatural and disconcerting conference experience where the participants find it difficult to communicate efficiently.