In the context of audio conferences, a number of audio stream exchange configurations can be envisaged:
centralized multipoint connections where a sending entity in the network, generally called the “bridge”, manages audio streams sent to the receiver terminals of the participants;
distributed multipoint connections where each terminal receives and processes audio streams from the terminals of other participants and sends its own audio stream to them;
point-to-point connections where only two terminals are in communication, each playing the role of sender entity and receiver terminal.
At present, the terminals of audio conference participants can provide a number of spatial reproduction formats, for example:
the binaural format for 3D spatial listening via headphones;
the stereo dipole format for 3D spatial listening via two loudspeakers;
the surround sound format for 2D or 3D spatial listening via several loudspeakers;
multichannel (5.1, 7.1, etc.) audio formats for 2D spatial listening via several loudspeakers.
During an audio conference the sender entity, such as the bridge of a centralized multipoint connection, spatially encodes the received audio streams to generate a virtual audio sound scene in two dimensions, in the horizontal plane, or in three dimensions, in space. To this end, the sender entity uses a particular mode of spatial coding of the audio data, which can be the binaural, stereo dipole, etc. coding mode. The audio streams spatially coded in this way by the sender entity are then transmitted to the receiver terminals via audio coders-decoders (codecs) chosen as the result of a standard VoIP negotiation procedure. At present monophonic and stereophonic audio codecs are available.
It is therefore possible to transmit binaural or stereo dipole spatial coding using a stereophonic audio codec. However, the receiver terminal cannot identify the spatial content of the data, for example whether it is binaural or stereo dipole. At present, spatially coded audio data carried by existing audio codecs is processed in the same way as an ordinary mono or stereo audio stream with no spatial information.
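The limitation described above can be pictured with a minimal sketch. It assumes a purely hypothetical model in which the receiver learns only the codec name and the channel count; the names `StreamDescriptor`, `describe_for_receiver` and the codec label `"stereo-codec"` are illustrative and do not correspond to any real codec or signaling protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamDescriptor:
    """What the receiver learns about a stream (hypothetical model)."""
    codec: str
    channels: int


def describe_for_receiver(spatial_mode: str) -> StreamDescriptor:
    """Return the descriptor seen by the receiver for a given spatial mode.

    The spatial coding mode is known to the sender but, in this model,
    is not signaled: only the stereophonic codec parameters are passed on.
    """
    _ = spatial_mode  # deliberately unused: the mode is lost in transit
    return StreamDescriptor(codec="stereo-codec", channels=2)


binaural = describe_for_receiver("binaural")
dipole = describe_for_receiver("stereo dipole")
plain = describe_for_receiver("plain stereo")

# All three descriptors compare equal, so the receiver cannot tell them apart.
indistinguishable = binaural == dipole == plain
```

In this sketch the three streams yield identical descriptors, which is why the receiver can only treat them as ordinary stereo.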
US patent application 2002/0138108 clearly illustrates this situation where a sender entity consisting of a bridge takes account only of how the participants in the audio conference are to be distributed in space, in audio images specific to each participant or common to them all, but without knowing which spatial coding format (binaural, stereo dipole, surround sound, etc.) to use.
The problems that then arise are shown in FIG. 1, which is a diagram of the streams exchanged over a prior art centralized multipoint connection.
FIG. 1 shows a bridge serving as the sender entity and three receiver terminals, each of which transmits its own audio stream to the bridge using monophonic audio coding. The bridge decodes the received audio streams and then effects spatial coding, for example binaural spatial coding.
The terminals 1 and 2 are provided with respective headsets appropriate for listening to binaural signals and the terminal 3 is equipped with two loudspeakers for reproducing the stereo dipole format.
In this configuration, the binaural spatial coding mode arbitrarily selected by the bridge is not directly compatible with the stereo dipole reproduction format used by the terminal 3. The terminal 3 must therefore convert the binaural coding to stereo dipole coding, which increases the load on its central processor unit (CPU) and adds delay.
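The FIG. 1 scenario can be summarized in a short sketch that lists which terminals must convert the stream they receive. The names `BRIDGE_MODE`, `TERMINAL_FORMATS` and `terminals_needing_conversion` are assumptions introduced for illustration only; the sketch compares format labels and performs no audio processing.

```python
# Spatial coding mode chosen arbitrarily by the bridge, as in FIG. 1.
BRIDGE_MODE = "binaural"

# Reproduction format of each receiver terminal in FIG. 1:
# terminals 1 and 2 use headsets, terminal 3 uses two loudspeakers.
TERMINAL_FORMATS = {1: "binaural", 2: "binaural", 3: "stereo dipole"}


def terminals_needing_conversion(bridge_mode, terminal_formats):
    """Return the terminals whose reproduction format differs from the
    spatial coding mode used by the bridge, in ascending order."""
    return sorted(
        terminal
        for terminal, fmt in terminal_formats.items()
        if fmt != bridge_mode
    )


needs_conversion = terminals_needing_conversion(BRIDGE_MODE, TERMINAL_FORMATS)
print(needs_conversion)
```

Had the bridge chosen stereo dipole coding instead, terminals 1 and 2 would bear the conversion cost, so any single arbitrary choice penalizes at least one participant in this example.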
Clearly, the problems described with reference to FIG. 1 generally arise because a receiver terminal, such as a personal computer (PC), can have a number of peripherals (a headset, any number of loudspeakers, etc.) associated with different reproduction formats, while the remote applications installed on a sender entity connected to the terminal via the network are obliged to make arbitrary choices for spatial coding, independently of the reproduction format configured by the user of the receiver terminal.