Current audio conferencing systems generally operate on one or two sound channels and do not provide a sense of presence. In a multi-point conference in particular, the sounds from all sources are mixed together and, as a result, the clarity of the sound declines.
In the prior art, the audio streams of an audio conference are subjected to 3D audio processing. That is, the gains of the left and right sound channels of each audio stream are adjusted according to the sound image position allocated to that stream and the spatial relationship between audio streams at different sound image positions, so as to create a stereo effect.
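The left/right gain adjustment described above can be illustrated with a minimal sketch. The source does not specify a gain formula, so the constant-power (sine/cosine) pan law used here is an assumption, as are the function names `pan_gains` and `mix_to_stereo` and the convention that a sound image position is an azimuth between -90 degrees (full left) and +90 degrees (full right). The sketch also ignores the spatial relationship between streams and pans each stream independently.

```python
import math


def pan_gains(azimuth_deg: float) -> tuple[float, float]:
    """Return (left_gain, right_gain) for an azimuth in [-90, 90] degrees.

    Assumed constant-power pan law; not taken from the source.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be within [-90, 90]")
    # Map -90..+90 degrees onto 0..pi/2 so that left^2 + right^2 == 1.
    theta = (azimuth_deg + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)


def mix_to_stereo(streams: list[tuple[float, list[float]]]) -> list[tuple[float, float]]:
    """Mix mono streams, each given as (azimuth_deg, samples), into stereo frames."""
    length = max((len(samples) for _, samples in streams), default=0)
    left = [0.0] * length
    right = [0.0] * length
    for azimuth_deg, samples in streams:
        gain_left, gain_right = pan_gains(azimuth_deg)
        for i, sample in enumerate(samples):
            left[i] += gain_left * sample
            right[i] += gain_right * sample
    return list(zip(left, right))
```

For example, `mix_to_stereo([(-45.0, talker_a), (45.0, talker_b)])` would place two hypothetical mono talkers to the left and right of centre, so that a listener can separate them even though both are carried in the same stereo signal.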
The prior art provides a distributed network structure for 3D audio conferencing, in which every terminal receives the conference data from all other terminals and performs 3D positioning on all the audio data, so that the user perceives the different audio streams as coming from different positions. As shown in FIG. 1, terminal 2 receives the conference data of terminal 1 and terminal 3 and then performs 3D positioning on the audio data to determine the positions of terminal 1 and terminal 3. Another prior-art solution adopts centralized networking. The conferencing system shown in FIG. 2 includes one server and multiple terminals. All terminals send their audio data to the server; the server performs 3D positioning on the audio streams destined for each terminal and then sends the processed audio streams to the corresponding terminals.
During the implementation of the present invention, the inventor found at least the following weaknesses in the prior art. In the distributed 3D audio conferencing solution, because audio data is processed on the distributed terminals, many transmission channels are required; the solution is therefore applicable only to small conferencing systems with a few conference sites. In the centralized 3D audio conferencing solution, because all data processing is carried out on the server, the server must know the player configuration of all terminals in advance, and a terminal cannot freely determine the sound image positions of the other terminals.
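The scaling concern for the distributed solution can be made concrete with a rough count of audio streams. The sketch below assumes one unicast stream per sender and receiver pair in the distributed case, and one uplink plus one downlink per terminal in the centralized case; these assumptions and the function names are illustrative and are not stated in the source.

```python
def distributed_streams(terminals: int) -> int:
    """Directed streams when every terminal receives from all other terminals."""
    return terminals * (terminals - 1)


def centralized_streams(terminals: int) -> int:
    """Directed streams when each terminal has one uplink and one downlink to the server."""
    return 2 * terminals


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, distributed_streams(n), centralized_streams(n))
```

Under these assumptions the distributed count grows quadratically with the number of terminals (90 streams for 10 terminals), while the centralized count grows linearly (20 streams for 10 terminals), which is consistent with the distributed solution being limited to a few conference sites.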