When more than two people participate in a communications session, the session becomes a conference. Conference applications often have shared resources, such as the right to talk, the right to provide input to a limited-bandwidth video channel, and/or the right to control a pointer or otherwise focus attention in a shared application. In many cases, it is desirable to be able to control who can exercise these rights to the shared resource.
This situation is analogous to the ancient problem of running a town meeting, where certain rules must be developed to control access to the floor; these rules are historically known as rules of parliamentary procedure. Similar floor control rules are needed for telecommunications conferencing.
Floor control enables applications or users to gain safe and mutually exclusive or non-exclusive input access to the shared object or resource, and is generally an optional feature for conferencing applications. Floor control has been studied extensively over the years. See, for example, “Efficacy of Floor Control Protocols in Distributed Multimedia Collaboration” by H. Peter Dommel and J. S. Garcia-Luna-Aceves, in Cluster Computing Journal (1999). Floor control has also been the subject of numerous documents of the Internet Engineering Task Force (IETF).
We define a floor as the temporary permission for a conference participant to access or manipulate a specific shared resource or group of resources. Session Initiation Protocol (SIP) conferencing applications may decide not to support this feature at all. Some applications of floor control, such as write access to a shared document, are useful even for “conferences” with two members, while other resources, such as an audio channel, may only make floor control worthwhile for larger groups.
In general, floor control is closely related to the management of shared resources in operating systems and distributed systems. Synchronization, mutual exclusion and the reader-writer problem have become standard tools in those areas. However, floor control differs in that it generally involves managing access by human participants, with a much stronger emphasis on policies.
A focus is defined as an SIP user agent that is addressed by a conference Uniform Resource Identifier (URI). The focus maintains an SIP signaling relationship with each participant in the conference. The focus is responsible for ensuring, in some way, that each participant receives the media that make up the conference. The focus also implements conference policies. The focus is a logical role.
A floor is defined as a set of shared resources within a conference; a single conference may have multiple floors. A conference member is a participant that has a signaling relationship with the conference focus, and receives one or more of the media streams that are part of the conference. A conference owner is a privileged user who defines rules for running the conference; by default, the conference creator becomes the owner, but the role can be delegated to another entity. Among other roles, the conference owner also establishes rules for floor control, by creating floors, and assigning and removing floor chairs. The conference owner may delegate some of these responsibilities to another party. The conference owner does not have to be a member of the conference.
A chair is normally a person who manages one floor by granting, denying, or revoking access privileges. The chair does not have to be a member of the conference. The chair is sometimes also referred to as the moderator. Different floors within a conference may have different chairs, and chairs may change during a conference. A conference client will therefore be either an ordinary member, or alternatively will be a chair.
As mentioned, floor control is a mechanism that enables applications or users to gain safe and mutually exclusive or non-exclusive access to the shared object or resource. A floor controller is a logical entity that manages floors. It receives requests from conference participants, the conference owner and the floor chair, and it issues protocol requests to affect conference and floor status. Depending on floor policy, the floor controller may ask the chair to approve decisions.
The floor policy is the set of rules and attributes governing operation of the floor controller. The floor policy is defined upon creation of a floor and may be modified by an authorized participant.
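The relationship between the floor controller, the floor policy, and the chair can be illustrated with a minimal sketch. It is not taken from any specification: the class names, the policy attributes `max_holders` and `chair_approval`, and the `ask_chair` callback are all hypothetical, chosen only to show a controller that consults the chair when the policy requires it.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FloorPolicy:
    # Hypothetical attributes; the text only says a policy is a set of
    # "rules and attributes" governing the floor controller.
    max_holders: int = 1         # 1 means mutually exclusive access
    chair_approval: bool = True  # whether the controller consults the chair


@dataclass
class FloorController:
    policy: FloorPolicy
    ask_chair: Callable[[str], bool]  # stand-in for the chair's decision
    holders: list = field(default_factory=list)
    queue: list = field(default_factory=list)

    def request(self, participant: str) -> str:
        """Handle a floor request and report the outcome."""
        if participant in self.holders:
            return "granted"
        if self.policy.chair_approval and not self.ask_chair(participant):
            return "denied"
        if len(self.holders) < self.policy.max_holders:
            self.holders.append(participant)
            return "granted"
        if participant not in self.queue:
            self.queue.append(participant)
        return "queued"

    def release(self, participant: str) -> Optional[str]:
        """Release the floor; return the next holder taken from the queue, if any."""
        if participant in self.holders:
            self.holders.remove(participant)
        if self.queue and len(self.holders) < self.policy.max_holders:
            next_holder = self.queue.pop(0)
            self.holders.append(next_holder)
            return next_holder
        return None
```

Setting `max_holders` above 1 gives the non-exclusive access mentioned above, and setting `chair_approval` to `False` removes the chair from the decision path.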
A floor control protocol is used to convey the floor control messages among the floor chairs (moderators) of the conference, the floor controller, the focus, and the participants of the conference. Floor control can operate at the origin of data, at a redistribution point, or at the destination. At the origin of data, floor control can ask the sender, via signaling, to suppress the generation of data. At the redistribution point, the floor controller can modify the mixing matrix, so that only media streams from certain participants are delivered to other participants. At the destination, floor control can filter incoming media or messages, so that only floor holders can affect the state of the shared resource.
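Floor control at the redistribution point can be pictured as an operation on the mixing matrix. The sketch below assumes, purely for illustration, that the matrix is a nested dictionary of gains and that suppressing a stream means setting its gain to zero; the function and participant names are hypothetical.

```python
def apply_floor(mix, holders):
    """Return a copy of the mixing matrix with non-holders' streams suppressed.

    mix[receiver][sender] is the gain at which `sender` is mixed for `receiver`.
    """
    return {
        receiver: {
            sender: (gain if sender in holders else 0.0)
            for sender, gain in senders.items()
        }
        for receiver, senders in mix.items()
    }


participants = ["alice", "bob", "carol"]
# Without floor control, everyone hears everyone else at unit gain.
mix = {r: {s: 1.0 for s in participants if s != r} for r in participants}
# With alice holding the floor, only her stream is delivered.
controlled = apply_floor(mix, holders={"alice"})
```

The original matrix is left untouched, so removing the floor simply means reverting to `mix`.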
We can distinguish between cooperative and coercive floor control. Cooperative floor control relies on the cooperation of the data source, while coercive floor control does not; it can function even if a participant is malicious or malfunctioning. Among the three locations of floor control, floor control at the redistribution point and at the receiver can be made coercive, while floor control at the sender is by necessity cooperative.
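The distinction can be made concrete with a small sketch of enforcement at the destination. The names below are hypothetical, and the message format (sender, body) is an assumption made for illustration; the point is only that a receiver-side filter works whether or not the sender honours the floor.

```python
from enum import Enum


class Location(Enum):
    ORIGIN = "origin"
    REDISTRIBUTION = "redistribution point"
    DESTINATION = "destination"


def can_be_coercive(location):
    # Per the text, only floor control at the sender depends on the
    # data source cooperating.
    return location is not Location.ORIGIN


def filter_at_destination(messages, holders):
    """Coercive enforcement at the receiver: drop input from non-holders,
    regardless of whether the sender respected the floor."""
    return [(sender, body) for sender, body in messages if sender in holders]
```

A malfunctioning or malicious sender can still emit data in this model, but that data never affects the state of the shared resource.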
A floor is always coupled to one or more media sessions. A participant with appropriate privileges may create a floor by defining that one or more media sessions are now floor-controlled. As part of the creation of a floor, a chair needs to be appointed.
Typically, the ability of users to create floors is governed by the conference policy. In a simple scenario, the chair can delegate his or her responsibility to any other member of the conference. The conference policy, and thus indirectly the conference owner, defines whether or not floor control is in use and for which resources. If floor control is enabled for a particular resource or set of resources, the conference policy also defines for which resources the use of floor control is mandatory and for which it is optional.
Normally, the conference owner creates a floor using a mechanism and appoints the floor chair. The conference owner can remove the floor at any time (so that the resources are no longer floor-controlled), change the chair, or change the floor parameters. The chair controls the access to the floor, according to the conference policy.
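The floor lifecycle described above (creation on one or more media sessions, appointment of a chair, change of chair, removal) can be sketched as follows. This is an illustrative data model only: the class and method names are hypothetical, and the sketch assumes the simplest conference policy, in which only the conference owner may manage floors.

```python
from dataclasses import dataclass, field


@dataclass
class Floor:
    floor_id: str
    media_sessions: set  # a floor is always coupled to at least one media session
    chair: str           # a chair must be appointed when the floor is created


@dataclass
class Conference:
    owner: str
    floors: dict = field(default_factory=dict)

    def _require_owner(self, actor):
        # Assumption of this sketch: only the owner manages floors.
        if actor != self.owner:
            raise PermissionError(f"{actor} is not the conference owner")

    def create_floor(self, actor, floor_id, media_sessions, chair):
        self._require_owner(actor)
        if not media_sessions:
            raise ValueError("a floor needs at least one media session")
        self.floors[floor_id] = Floor(floor_id, set(media_sessions), chair)
        return self.floors[floor_id]

    def change_chair(self, actor, floor_id, new_chair):
        self._require_owner(actor)
        self.floors[floor_id].chair = new_chair

    def remove_floor(self, actor, floor_id):
        # After removal the media sessions are no longer floor-controlled.
        self._require_owner(actor)
        del self.floors[floor_id]

    def is_floor_controlled(self, media_session):
        return any(media_session in f.media_sessions for f in self.floors.values())
```

A richer conference policy could replace `_require_owner` with a lookup of which roles are allowed to perform each operation.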
SIP supports the initiation, modification, and termination of media sessions between user agents. These sessions are managed by SIP “dialogs,” which represent an SIP relationship between a pair of user agents. Because dialogs are between pairs of user agents, SIP's usage for two-party communications (such as a phone call) is relatively obvious. Communications sessions with multiple participants (i.e., conferencing) are more complicated.
SIP can support many models of multi-party communications. One, referred to as “loosely coupled conferences,” makes use of multicast media groups. In the loosely coupled model, there is no signaling relationship between participants in the conference. There is no central point of control or conference server. In another model, referred to as “fully distributed multiparty conferencing,” each participant maintains a signaling relationship with each other participant, using SIP. There is no central point of control; control is completely distributed amongst the participants. SIP does not yet support this model.
In a further model, sometimes referred to as the “tightly coupled conference,” there is a central point of control. Each participant connects to this central point. It provides a variety of conference functions, and may possibly perform media mixing functions as well. Tightly coupled conferences are not directly addressed by the SIP specification, although basic ones are possible without any additional protocol support.
FIG. 1 depicts how floor control integrates into the overall conferencing architecture. As mentioned, the “focus” is an SIP user agent that is addressed by a conference URI. The focus maintains an SIP signaling relationship with each participant in the conference. The focus is responsible for ensuring, in some way, that each participant receives the media that make up the conference. The focus also implements conference policies. The focus is a logical role. Participants or “clients” are user agents, each identified by a URI, which are connected to the focus for a particular conference.
A “conference policy server” is a logical function which can store and manipulate rules associated with participation in a conference. These rules include directives on the lifespan of the conference, who can and cannot join the conference, definitions of roles available in the conference and the responsibilities associated with those roles, and policies on who is allowed to request which roles. The conference policy server is a logical role.
A “media policy server” is a logical function which can store and manipulate rules associated with the media distribution of the conference. These rules can specify which participants receive media from which other participants, and the ways in which that media is combined for each participant. In the case of audio, these rules can include the relative volumes at which each participant is mixed. In the case of video, these rules can indicate whether the video is tiled, whether the video indicates the loudest speaker, and so on.
A “mixer” receives a set of media streams, and combines their media in a type-specific manner, redistributing the result to each participant. A “conference server” is a physical server which contains, at a minimum, the focus, but may also include a media policy server, a conference policy server, and a mixer.
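As an illustration of how a media policy rule might drive the mixer, the sketch below mixes one audio frame per receiver using relative volumes. The function name, the frame representation (equal-length lists of samples), and the choice to leave a receiver's own audio out of its mix are assumptions of this sketch and are not drawn from FIG. 1.

```python
def mix_for(receiver, frames, volumes):
    """Mix one audio frame for `receiver`.

    frames:  {participant: [sample, ...]} with equal-length sample lists
    volumes: {participant: relative gain}, standing in for a media policy rule;
             participants without an entry are mixed at gain 1.0
    The receiver's own audio is excluded from its mix (an assumption here).
    """
    sources = [p for p in frames if p != receiver]
    if not sources:
        return []
    length = len(frames[sources[0]])
    return [
        sum(volumes.get(p, 1.0) * frames[p][i] for p in sources)
        for i in range(length)
    ]
```

Changing an entry in `volumes` changes the mix for every receiver without touching the participants' streams, which is the separation between policy and mixing that the architecture describes.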
A “floor control server” is another term for “floor controller,” and is responsible for determining which participant(s) in a conference are allowed to speak at any given time, based on participant requests as well as access rules and the chair's decisions.
Many existing conference management protocols have already defined floor control functions. Floor control can be used to avoid or resolve conflicts among simultaneous media inputs. For example, at a given time, the moderator of a floor can ensure that only one person is heard by other participants or that only one person types into a shared document. The conference models can be centralized or non-centralized. In a centralized model, such as the tightly coupled conference already discussed, there is typically one conference server acting as the root of a conference. The root conference server receives all floor requests and can control the propagation of media in the conference directly or through sending requests to other conference servers in a tree topology. There is no such root conference server in the non-centralized model. The present application is primarily concerned with the centralized model. In the rest of this document, we simply use the term conference server to refer to the root conference server, as shown in FIG. 1.
The conference server needs to be able to control the shared resources. For example, the mixer in a conference server can selectively choose the media sources for mixing. The moderators and participants of the conference should be able to send “floor control commands” to the conference server to change floor status, and the conference server should notify the moderators and participants of changes.
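The command and notification flow described above can be sketched as follows. The command names `claim` and `release`, the event tuples, and the class name are hypothetical; no particular floor control protocol is implied, and the sketch models a single floor with one holder at a time.

```python
class FloorStatusServer:
    """Toy conference server that tracks one floor and notifies listeners."""

    def __init__(self):
        self.floor_holder = None
        self.subscribers = []  # callables invoked on every floor status change

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def _notify(self, event):
        for callback in self.subscribers:
            callback(event)

    def handle(self, command, sender):
        """Apply one floor control command; return True if floor status changed."""
        if command == "claim" and self.floor_holder is None:
            self.floor_holder = sender
            self._notify(("granted", sender))
            return True
        if command == "release" and self.floor_holder == sender:
            self.floor_holder = None
            self._notify(("released", sender))
            return True
        # No status change, so moderators and participants are not notified.
        return False
```

Moderators and participants would each register a callback with `subscribe`, so that every accepted command is reflected to all of them.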
A floor control protocol is used to convey the floor control messages among the moderator or moderators of the conference, the conference server and the participants of the conference. The floor control protocol does not deal with conference management, such as how to elect the moderator of the conference or how to add users to the conference.
Unfortunately, the current IETF floor control protocol concept enables only simple floor control models, i.e., it assumes that a floor chair decides whether to accept a claim to a floor. Many useful floor policies are impossible with this scheme. For example, the push to talk (PTT) concept does not have chair control but instead has a very different floor policy. Another example that is incompatible with the current IETF concept is a floor policy according to which “senior-members can talk anytime overriding other speakers and users in floor claim queue.” Currently, the IETF floor control mechanism requires a floor chair to specifically accept or deny a user's claim to the floor. Therefore, other policies, such as “senior-members get 60 second talk time” or “user talking first time gets 30 seconds, loses floor after 3 second idle” are not possible. Similarly, as mentioned, a PTT type of floor policy used in audio chats is currently not possible.
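To make the example policies concrete, the sketch below expresses them as rules evaluated without any chair decision. It is only an illustration of what the quoted policies mean: the function names, the `FloorState` fields, and the use of plain numeric timestamps in seconds are all assumptions, and the sketch does not describe any existing protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FloorState:
    holder: Optional[str] = None
    granted_at: float = 0.0     # time at which the current holder got the floor
    last_activity: float = 0.0  # time at which the holder last sent media


def ptt_claim(state, user, now):
    """Push to talk: the floor goes to whoever asks while it is free; no chair."""
    if state.holder is None:
        state.holder, state.granted_at, state.last_activity = user, now, now
        return True
    return False


def senior_claim(state, user, now, seniors):
    """Senior members override the current holder; others behave as in PTT."""
    if user in seniors:
        state.holder, state.granted_at, state.last_activity = user, now, now
        return True
    return ptt_claim(state, user, now)


def expired(state, now, max_talk, max_idle):
    """True once the holder exceeds its talk time or stays idle too long."""
    if state.holder is None:
        return False
    return (now - state.granted_at >= max_talk) or (now - state.last_activity >= max_idle)
```

Under these assumptions, “user talking first time gets 30 seconds, loses floor after 3 second idle” corresponds to calling `expired` with `max_talk=30.0` and `max_idle=3.0`, and a 60 second limit for senior members corresponds to `max_talk=60.0`.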