This invention is related to multimedia communications systems, and in particular to a method and apparatus for using Far End Camera Control (FECC) messages to implement participant selection, layout selection, and/or participant-to-participant camera control in a multipoint videoconference, e.g., over a packet network, such as a network using IP.
Multimedia multipoint conferences that include audio and video, commonly called multimedia teleconferences and videoconferences, are becoming more and more widespread. A multipoint videoconference allows three or more participants at a plurality of locations to establish bi-directional multimedia communication including audio and video, while sharing the audio-visual environment, in order to give the impression that the participants are all at the same place.
Packet networks, in particular IP-based packet networks, are increasingly popular for multimedia conferences. Recommendation H.323 titled “Packet-based multimedia communications systems” (International Telecommunication Union, Geneva, Switzerland) describes the technical requirements for multimedia communications services in a packet-switched network. The packet-switched networks may include local area networks (LANs), wide area networks (WANs), public networks and internetworks such as the Internet, and point-to-point dial-up connections over PPP or some other packet-switched protocol. The invention is described herein using International Telecommunication Union (ITU, ITU-T) Recommendation H.323. The invention, however, is not limited to H.323.
In a multipoint videoconference, a generally desirable feature is the ability to simultaneously view more than one site other than the viewer's site, optionally including the viewer's site. This feature is referred to herein as video mixing without regard to how many sites are visible at a time. A continuous presence videoconference is one that includes mixing, i.e., one in which the user has the capability to simultaneously view more than one site other than the viewer's site on a terminal's display.
In a continuous presence videoconference with video mixing, it is desirable for individuals at the participating terminals to be able to control the layout of their displays, and to select, from the set of active participants, the other participant or participants being communicated with, e.g., for display or another purpose. It also is desirable for a participant to be able to control the video camera of a selected participant, whether at a remote site or at the local site, so that the participant can have control of the view. This ability is called Far End Camera Control (FECC). Typical controls include pan, tilt, and zoom controls, e.g., “pan-right,” “pan-left,” “tilt-up,” “tilt-down,” “zoom-in,” and “zoom-out.” It further is desirable for the participant to select which camera, e.g., which participant's camera, is being controlled.
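By way of illustration only, the controls named above and the selection of whose camera is being controlled can be sketched as a small data model. The sketch below is not the H.281 message format; the class names, field names, and string values are assumptions introduced solely for this example.

```python
from dataclasses import dataclass
from enum import Enum


class CameraAction(Enum):
    """The six controls named in the text; the string values are illustrative."""
    PAN_LEFT = "pan-left"
    PAN_RIGHT = "pan-right"
    TILT_UP = "tilt-up"
    TILT_DOWN = "tilt-down"
    ZOOM_IN = "zoom-in"
    ZOOM_OUT = "zoom-out"


@dataclass(frozen=True)
class CameraControlRequest:
    # Hypothetical fields: who issues the request, whose camera is
    # targeted, and which action is requested.
    requester: str
    target: str
    action: CameraAction


def describe(request: CameraControlRequest) -> str:
    """Render a request as text, noting whether the camera is local or remote."""
    where = "local" if request.requester == request.target else "remote"
    return f"{request.requester} -> {request.target} ({where}): {request.action.value}"


if __name__ == "__main__":
    print(describe(CameraControlRequest("A", "B", CameraAction.PAN_LEFT)))
```

The explicit target field reflects the point made above that a participant should be able to select which participant's camera is being controlled, including the camera at the local site.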
By layout control, we mean the selection by a particular participant of the manner of switching, i.e., how many participants are being displayed and the layout on the particular participant's video screen. By participant selection, we mean the selection by a particular participant of one of the videoconference participants. The participant selection may be in order to select a layout feature for the selected participant such as full screen or an enlarged view for the selected participant. The participant selection may also be for participant-to-participant camera control in order to control the camera at the site of the selected participant.
Extensions to participant-to-participant camera control also are known. For example, there may be some extensions to Annex Q of H.323 that are supported by one terminal but not by another. Consider a videoconference with terminals A, B, and C centrally controlled over a packet network. Suppose terminal A has a microphone in the camera, and this capability can be signaled and controlled using Annex Q version 2, which both terminal A and the MCU support. Suppose further that terminal B does not support the extensions to Annex Q of H.323 that permit the microphone of a camera to be controlled. There is a need in the art for a mechanism that provides for a terminal of a multipoint conference to control the microphone, e.g., to mute the microphone, of another participant of the multipoint conference.
In this description, we include extended camera control such as controlling, e.g., muting, the microphone of another participant in the term “participant-to-participant camera control.”
Limited participant and layout selection is known for prior art multipoint videoconferences over ISDN networks. ITU-T Recommendation H.320 titled “Narrow-band visual telephone systems and terminal equipment” is known for ISDN videoconferences, including multipoint videoconferences. ITU-T Recommendation H.243 for use in H.320 systems defines procedures commonly known as “Chair Control” that allow one person in a multipoint videoconference to control who is being shown at the other participants' displays and to select one of a few available layouts for display. Thus, H.320 provides some limited means of participant and layout selection for a multipoint videoconference. However, in the case of packet networks, typical videoconference terminals, e.g., for videoconferences conforming to H.323, are not conference aware. That is, the terminal is not aware that it is in communication with an MCU rather than with another terminal. Thus, there is no standard way to address the other conference participants. Furthermore, there is no standard way to carry out participant selection.
Note that H.323 uses ITU-T Recommendation H.245 titled “Control protocol for multimedia communication” for control messaging, and H.245 does provide a mechanism for a terminal that is conference aware to obtain, e.g., from an MCU, a list of the labels of the terminals that the terminal can currently see. Using such information, it is conceivable that a terminal can allow the user to select the participant whose camera the user wants to control. However, most commonly used terminals today do not implement this part of H.245 as they are not conference aware. For such devices, communicating with an MCU is the same as communicating with another endpoint. Thus there still is a need for methods for participant selection and for participant-to-participant camera control that do not require end devices to be aware that they are communicating with an MCU rather than with another terminal.
Far end camera control is suggested for packet network-based multipoint videoconferences, but not in a standard way. For example, Annex Q of ITU-T Recommendation H.323, titled “Far-end camera control and H.281/H.224,” defines an H.281/H.224-based Far End Camera Control (FECC) protocol applicable to packet-based networks. ITU-T Recommendation H.281 defines FECC using protocols that conform to ITU-T Recommendation H.224. ITU-T Recommendation H.224, titled “A real time control protocol for simplex applications using the H.221 LSD/HSD/MLP channels,” provides a simple yet flexible protocol for simplex, low delay applications. Using these protocols, FECC is available in packet networks for endpoint-to-endpoint communication. However, as discussed above, there is no standard way to address individual endpoints for participant-to-participant camera control (or extended control, e.g., of a microphone) in a packet-based H.323 multipoint videoconference of terminals that are not conference aware.
Thus there is a need in the art for participant selection in a multipoint videoconference on a packet-based network. There also is a need in the art for layout selection in such a videoconference. There further is a need in the art for participant-to-participant camera control, by a local participant, of the camera of a selected remote participant in such a videoconference.
In particular, there is a need to provide one or more of these capabilities on-the-fly during a videoconference, i.e., in a continuous presence mode.
Prior art mechanisms are available that provide some form of participant selection and/or layout selection and/or participant-to-participant camera control of a remote participant in a packet network-based multipoint videoconference.
One prior-art method is to multicast the videoconference. Multicasting differs from a centralized configuration via a Multipoint Control Unit (MCU) in which all participants of the videoconference establish communication with the MCU and communicate with other participants via the MCU. In a centralized configuration, the MCU ensures that multipoint videoconference connections are properly set up and released, that audio and video streams are properly switched and/or mixed, and that the data are properly distributed among the videoconference participants. An alternative to using an MCU is to have a distributed configuration that includes multicasting the videoconference. That is, each and every participant sends messages to all other participants. Some form of participant selection and/or layout selection and/or participant-to-participant camera control of a remote participant is available in such a non-centrally controlled videoconference. However, to use such mechanisms the end device has to be conference-aware.
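The difference between the two configurations can be illustrated by asking which peers a terminal must itself address. The following is a minimal sketch under stated assumptions: the function name, the terminal names, and the label "MCU" are hypothetical, and no actual signaling is modeled.

```python
def peers_addressed_by(terminal, terminals, centralized):
    """Return the set of peers that `terminal` must address directly.

    Centralized: every terminal talks only to the MCU, so it need not
    know that other participants exist.
    Distributed: every terminal sends to all other participants, so it
    must know the full participant set (i.e., be conference-aware).
    """
    if centralized:
        return {"MCU"}
    return set(terminals) - {terminal}


if __name__ == "__main__":
    conference = ["A", "B", "C"]
    print(peers_addressed_by("A", conference, centralized=True))
    print(sorted(peers_addressed_by("A", conference, centralized=False)))
```

In the distributed case the result grows with the number of participants, which is one way to see why the end device has to be conference-aware in that configuration, whereas in the centralized case a terminal's view is the same as in a two-party call.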
Another prior-art mechanism for providing some form of participant selection and/or layout selection and/or participant-to-participant camera control in a packet network-based videoconference uses a Web-based service. RADVISION Ltd., Tel-Aviv, Israel, for example, has an MCU product that includes Web-based monitoring and control for configuration and setup from any location using a web browser. The RADVISION web interface provides real-time videoconference control capabilities and three types of user access (administrator, videoconference manager, and user). Continuous presence mode enables an enhanced and simultaneous view of videoconference participants with a choice of different layouts, e.g., 16, 1+12, 2+8, 3+4, 4 or 1. Using the Web-based interface, the videoconference manager in a RADVISION-based multipoint videoconference can dynamically change the video layout during a videoconference call with dynamic “On-the-Fly” layout control.
However, there is no standard mechanism in the prior art that provides these services with standard videoconference terminals in a centrally controlled, e.g., MCU-based, videoconference over a packet-based network, e.g., an IP-based network.