Multipoint videoconferencing is a natural extension of point-to-point videoconferencing. Multipoint videoconferencing usually includes a multipoint video bridge that combines the video signals from multiple videoconference endpoints to provide a single output video signal which can be displayed to and shared by all the participants. When there are a large number of participants in the videoconference, multipoint systems have difficulty maintaining an accurate perspective of the entire videoconference. Ideally, a participant should be able to view all other participants at the other endpoints. However, because of limited display space and a potential for a large number of participants, it is not always possible to display the video images of all participants at an image size that is useful to the viewers.
To address this problem, designers have relied on many different methods. One prior art method is to limit the number of participants displayed at any one endpoint such that each image is large enough to be beneficial to the participants viewing it. As a participant speaks, her image is displayed at the other endpoints, replacing an existing image of a different participant. While this method has the advantage of displaying video of participants at an image size that is useful to other participants, it creates other problems. Because participants are not able to see all other participants at one time, a speaker must frequently address someone she cannot see. A speaker will often ask the person she is addressing to speak as a way of “tricking” the view switching algorithm, which may be based on audio activity, into switching the video image to the addressee.
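The audio-driven view switching described above can be illustrated with a minimal sketch. It is not taken from any particular prior art system: the class name `VoiceSwitchedView`, the `slots` and `threshold` parameters, and the rule of evicting the least recently active participant are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class VoiceSwitchedView:
    """Hypothetical voice-activated switcher showing at most `slots` participants."""

    def __init__(self, slots, threshold=0.5):
        self.slots = slots          # assumed number of images an endpoint can show
        self.threshold = threshold  # assumed audio level that counts as "speaking"
        # Keys are displayed participant ids, ordered from least to most recently active.
        self.displayed = OrderedDict()

    def on_audio(self, levels):
        """Update the view from a mapping of participant id -> audio level in [0, 1]."""
        if not levels:
            return list(self.displayed)
        speaker = max(levels, key=levels.get)
        if levels[speaker] < self.threshold:
            return list(self.displayed)  # nobody is loud enough; keep the current view
        if speaker in self.displayed:
            self.displayed.move_to_end(speaker)  # mark as most recently active
        else:
            if len(self.displayed) >= self.slots:
                self.displayed.popitem(last=False)  # evict the least recently active
            self.displayed[speaker] = True
        return list(self.displayed)


view = VoiceSwitchedView(slots=2)
view.on_audio({"ann": 0.9, "bob": 0.1, "cat": 0.0})  # ann speaks and is shown
view.on_audio({"ann": 0.1, "bob": 0.8, "cat": 0.0})  # bob speaks and is shown
view.on_audio({"ann": 0.2, "bob": 0.1, "cat": 0.3})  # cat is quiet and stays hidden
view.on_audio({"ann": 0.0, "bob": 0.1, "cat": 0.7})  # cat speaks up and replaces ann
```

Under these assumptions, a quiet participant never appears on screen until she produces enough audio to cross the threshold, which is why a speaker may prompt the addressee to say something before addressing her.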
Another prior art method used to deal with a large number of participants is to display the images of all participants. This “Hollywood Squares” approach, while giving participants the opportunity to see everyone in a videoconference, has its own problems. As the number of participants increases, the size of the individual images decreases, making it more difficult for a participant to figure out who among the sea of faces is actually speaking.
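The shrinking of individual images can be made concrete with a short calculation. The sketch below assumes a near-square grid of equal tiles on a 1920 by 1080 pixel display; the function name `tile_size` and the grid-selection rule are illustrative assumptions, not a layout prescribed by any particular system.

```python
import math


def tile_size(participants, width=1920, height=1080):
    """Return (columns, rows, tile_width, tile_height) for a near-square grid.

    Assumes every participant gets an equal tile and the grid has
    ceil(sqrt(n)) columns, which is one common but not universal choice.
    """
    if participants < 1:
        raise ValueError("at least one participant is required")
    cols = math.isqrt(participants - 1) + 1  # integer ceil(sqrt(n))
    rows = -(-participants // cols)          # integer ceil(n / cols)
    return cols, rows, width // cols, height // rows


for n in (4, 9, 16, 25):
    cols, rows, w, h = tile_size(n)
    print(f"{n:2d} participants: {cols}x{rows} grid, {w}x{h} pixels per image")
```

With these assumptions, each image is 960 by 540 pixels for four participants but only 384 by 216 pixels for twenty-five, so the area available to each face falls roughly in proportion to the number of participants.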
While the current methods provide some level of perspective in a multipoint videoconference, they do not create the perception that all participants are in the same room, and they leave speakers and audience members searching their displays for the right image.
Therefore, what is desired is a system and method that overcome challenges found in the art, including a method for creating a more realistic videoconference environment that allows all participants to see each other at the same time and make eye contact with other participants without creating a “Hollywood Squares” effect.