Videoconferencing entails the exchange of audio, video, and other information between at least two participants. Generally, a videoconferencing endpoint at each participant location will include a camera for capturing images of the local participant and a display device for displaying images of remote participants. The videoconferencing endpoint can also include additional display devices for displaying digital content. In scenarios where more than two endpoints participate in a videoconferencing session, a multipoint control unit (MCU) can be used as a conference controlling entity. The MCU and endpoints typically communicate over a communication network, with the MCU receiving video, audio, and data channels from, and transmitting them to, the endpoints.
FIG. 1 depicts an exemplary multipoint videoconferencing system 100. System 100 can include network 110, one or more multipoint control units (MCUs) 106, and a plurality of endpoints 1-5 101-105. Network 110 can be, but is not limited to, a packet switched network, a circuit switched network, or a combination of the two. Endpoints 1-5 101-105 may send and receive both audio and video data. Communications over the network can be based on communication protocols such as H.320, H.324, H.323, SIP, etc., and may use compression standards such as H.263, H.264, etc. MCU 106 can initiate and manage videoconferencing sessions between two or more endpoints. Generally, MCU 106 can mix audio data received from one or more endpoints to generate mixed audio data, and send the mixed audio data to appropriate endpoints. Additionally, MCU 106 can receive video streams from one or more endpoints. One or more of these video streams may be combined by MCU 106 into combined video streams. Video streams, combined or otherwise, may be sent by MCU 106 to appropriate endpoints to be displayed on their respective display screens. As an alternative, MCU 106 can be located at any one of the endpoints 1-5 101-105.
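By way of illustration only, the audio mixing performed by MCU 106 might be sketched as follows. The sketch assumes a "mix-minus" policy, in which each endpoint receives the sum of the audio from every other endpoint but not its own, and assumes equal-length frames of 16-bit PCM samples. Neither assumption comes from the description above, and the function name mix_minus is hypothetical.

```python
from typing import Dict, List

PCM_MIN, PCM_MAX = -32768, 32767  # assumed 16-bit signed PCM sample range


def mix_minus(frames: Dict[int, List[int]]) -> Dict[int, List[int]]:
    """Return, for each endpoint, the clipped sum of all OTHER endpoints' samples.

    `frames` maps an endpoint identifier (e.g. 101) to one frame of samples.
    All frames are assumed to have the same length.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))

    # Sum every endpoint's samples once.
    total = [0] * length
    for samples in frames.values():
        for i, sample in enumerate(samples):
            total[i] += sample

    # Subtract each endpoint's own contribution, then clip to the PCM range.
    mixed = {}
    for endpoint, samples in frames.items():
        mixed[endpoint] = [
            max(PCM_MIN, min(PCM_MAX, t - s)) for t, s in zip(total, samples)
        ]
    return mixed
```

Computing the full sum once and subtracting each endpoint's own samples avoids re-summing the other streams separately for every endpoint. An actual MCU may instead mix only the loudest speakers or apply gain control.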
Combining the video streams is typically based on a specified layout. A layout can be specified for various states and configurations of the video call. For example, the near-end display layout for a 2-way call can include the video stream of the sole far-end videoconferencing device; for a 3-way video call, however, the near-end display may include various permutations and combinations of the two far-end video streams. Historically, the layouts generated by the MCU for various call scenarios have been either hard-coded into the software running on the MCU or configured by a system administrator of the MCU. In some cases, a layout is maintained regardless of the roster count (the number of sites on a call). In many cases, the administrator's configuration may be inconsistent with what a user would desire to see in a particular scenario. Historically, changes to the layouts have been cumbersome or impossible for a user to make.
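As a purely hypothetical example of a layout that depends on the number of video streams to be combined, the following sketch divides a display into a near-square grid of equal regions. The grid rule, the function name grid_layout, and the default 1920x1080 display size are illustrative assumptions, not a description of any particular MCU.

```python
import math
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def grid_layout(n_streams: int, width: int = 1920, height: int = 1080) -> List[Region]:
    """Return one display region per far-end stream, arranged in a near-square grid."""
    if n_streams < 1:
        return []
    cols = math.isqrt(n_streams - 1) + 1  # smallest cols with cols * cols >= n_streams
    rows = -(-n_streams // cols)          # ceiling division
    cell_w, cell_h = width // cols, height // rows

    # Fill the grid left to right, top to bottom.
    return [
        ((i % cols) * cell_w, (i // cols) * cell_h, cell_w, cell_h)
        for i in range(n_streams)
    ]
```

Under these assumptions, a 2-way call (one far-end stream) yields a single full-screen region, while a 3-way call (two far-end streams) yields two side-by-side regions. A fixed rule of this kind illustrates the hard-coded behavior described above, in which the user has no say in the resulting arrangement.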
Moreover, whatever user-configurable layout changes were available were not persistent, whether within a call, across calls made on the same device, or across calls made on different devices throughout a particular system (for example, all videoconferencing MCUs belonging to an organization). For example, users may have been able to configure certain layout variables such as dual monitor emulation (DME). Often this was done by toggling through existing layouts. Unfortunately, these selections would be lost when another site joined the call. Alternatively, in a bridge call, users might be able to use a far-end camera control feature or a touch screen to manually select the current layout, but the selection would not scale with the roster count. Additionally, whatever user-configurable layout parameters were available were device-specific, i.e., were stored locally only on the endpoint and/or MCU currently being used by the user. Thus, there has been no way for an administrator to create a layout policy or for a user's layout preferences to follow the user from system to system.