Modern conferencing systems facilitate communications among multiple participants over telephone lines, Internet protocol (IP) networks, and other data networks. In a typical conferencing session, a participant enters the conference by using an access number. During the session, a mixer receives audio and/or video streams from the participants, determines the N loudest speakers, mixes the audio streams from those speakers, and sends the mixed media back to the participants.
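The mixing step described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular conferencing product. It assumes each participant contributes one frame of floating-point samples in the range [-1.0, 1.0], that loudness is measured as root-mean-square (RMS) energy, and that the same mix is returned to every participant. The function names (`rms`, `loudest_speakers`, `mix_frames`) are hypothetical.

```python
import math
from typing import Dict, List


def rms(frame: List[float]) -> float:
    """Root-mean-square energy of one audio frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def loudest_speakers(frames: Dict[str, List[float]], n: int) -> List[str]:
    """Return the IDs of the n participants with the highest RMS energy."""
    ranked = sorted(frames, key=lambda pid: rms(frames[pid]), reverse=True)
    return ranked[:n]


def mix_frames(frames: Dict[str, List[float]], n: int) -> List[float]:
    """Sum the frames of the n loudest speakers and clip to [-1.0, 1.0]."""
    speakers = loudest_speakers(frames, n)
    if not speakers:
        return []
    # Assumption: frames may differ in length, so mix only the common prefix.
    length = min(len(frames[pid]) for pid in speakers)
    mixed = [sum(frames[pid][i] for pid in speakers) for i in range(length)]
    # Clip so the summed signal stays within the valid sample range.
    return [max(-1.0, min(1.0, s)) for s in mixed]


if __name__ == "__main__":
    frames = {
        "alice": [0.5, 0.5],
        "bob": [0.1, 0.1],
        "carol": [-0.4, 0.4],
    }
    print(loudest_speakers(frames, 2))
    print(mix_frames(frames, 2))
```

A production mixer would typically smooth the loudness estimate over several frames to avoid rapid switching between speakers, and might omit each speaker's own audio from the mix that speaker receives; both refinements are left out here for brevity.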
One drawback of existing conferencing systems is that the participants often cannot observe each other's body language. By its nature, body language communication is bidirectional; that is, it allows a listener to convey his feelings while he is listening to a speaker. For audio-only participants, this means that a speaker is unable to see the facial expressions, head nods, arm gesticulations, etc., of the other participants.