With the development of broadband networks, mobile communication is no longer limited to traditional voice communication; it also provides multimedia services that combine audio, video, images, and text. By integrating data services such as the presence service, Short Message Service (SMS), web browsing, positioning information, Push service, and file sharing, the operator can meet the diversified requirements of users. Driven by these applications, the 3GPP has launched the IP-based Multimedia Subsystem (IMS) architecture, which supports a wide range of multimedia applications and provides more choices and a richer experience for users.
The multi-party service is a service form based on the IMS architecture. For example, the multi-party service can be implemented on a Push to Talk over Cellular (PoC) system or a conference system. The PoC system is a multi-party multimedia communication system under centralized control. The PoC service adopts the half-duplex communication mode, implements point-to-point or point-to-multipoint voice communication, and allows only one participant to speak at a time to facilitate group communication. By pressing a key, the calling party can originate a conversation with a person or a group, without dialing a number or waiting for the opposite party to go off-hook. The call is put through promptly, and a conversation group is set up quickly. The conference service is a web-based telephone service oriented to different types of web conferences. A user may attend a web conference through a soft terminal, an ordinary telephone set, a Session Initiation Protocol (SIP) hard terminal, or a Mobile Station (MS). The chairman of the conference reserves a conference through web pages and manages the conference in real time. The attendees view the conference information through web pages. An attendee may attend a conference in either a convergent or a diffusive way. A conference member may originate a subconference during a conference. A subconference enables the attendees to hold discussions in groups. A request to originate a subconference is submitted to the conference chairman through web pages. After the request is approved by the chairman, the subconference is put through.
In a multi-party service, the media sending right (“talk burst”) of members should be managed because only one user is allowed to speak at a time. In a communication system based on media streams and media stream control, for example, a PoC system or a web conference system, different types of media streams are distributed and controlled on the multimedia control entity that controls the session, including negotiation and acquisition of the talk burst. For example, after a session is established in a PoC system, a user may apply for the talk burst (also known as the “speaking right”) on the multimedia session terminal (PoC terminal) through the following process.
First, the multimedia session terminal applies for the talk burst from the multimedia control entity (for example, a PoC server) through a “Talk Burst Request” message based on the Talk Burst Control Protocol (TBCP). The PoC server returns a “Talk Burst Granted” message to the applicant, telling the applicant that he/she is allowed to speak; meanwhile, the PoC server sends a “Talk Burst Taken” message to the other users, notifying the other members of the group who the current speaker is. The multimedia session terminal that obtains the talk burst begins to speak (namely, to send media streams). The media streams are forwarded by the PoC server to the other members in the group. Upon completion of speaking, the multimedia session terminal releases the talk burst. When the talk burst of the group is idle, the PoC server broadcasts a “Floor Control Idle” message to the group members. The PoC system under the prior art supports the “Talk Burst Request Queue” function. Namely, when more than one multimedia session terminal applies for the talk burst, the PoC server performs arbitration, grants the talk burst to only one of the applicants, and either refuses the requests from the other applicants or inserts them into a Talk Burst Request Queue. After the current speaker releases the talk burst, the PoC server selects a requester from the queue according to a certain policy (for example, by priority) and grants the talk burst to him/her.
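The request, grant, release, and queuing behavior described above can be illustrated with a minimal sketch. It is not a TBCP implementation: the class name `TalkBurstController`, its methods, the integer priority, the "queued" acknowledgement, and the policy of serving the highest priority first (oldest first among equals) are all assumptions introduced for illustration. Only the message names “Talk Burst Granted”, “Talk Burst Taken”, and “Floor Control Idle” are taken from the description.

```python
from collections import deque


class TalkBurstController:
    """Hypothetical talk burst arbiter for one media type (illustration only)."""

    def __init__(self, members):
        self.members = list(members)
        self.holder = None      # terminal currently holding the talk burst
        self.queue = deque()    # pending (priority, terminal) requests, in arrival order
        self.log = []           # (recipient, message) pairs the server would send

    def _send(self, recipient, message):
        self.log.append((recipient, message))

    def _grant(self, terminal):
        self.holder = terminal
        self._send(terminal, "Talk Burst Granted")
        for member in self.members:
            if member != terminal:
                self._send(member, "Talk Burst Taken")

    def request(self, terminal, priority=0):
        """Handle a Talk Burst Request; queue it if the talk burst is held."""
        if self.holder is None:
            self._grant(terminal)
        elif terminal != self.holder and all(t != terminal for _, t in self.queue):
            self.queue.append((priority, terminal))
            # Placeholder acknowledgement; the message name is an assumption.
            self._send(terminal, "queued")

    def release(self, terminal):
        """Release the talk burst and serve the queue, or announce idle."""
        if terminal != self.holder:
            return
        self.holder = None
        if self.queue:
            # Assumed policy: highest priority wins; max() returns the first
            # maximal entry, so earlier requests win among equal priorities.
            best = max(self.queue, key=lambda entry: entry[0])
            self.queue.remove(best)
            self._grant(best[1])
        else:
            for member in self.members:
                self._send(member, "Floor Control Idle")
```

In this sketch, if terminal A holds the talk burst while B (priority 1) and C (priority 5) request it, releasing A grants the talk burst to C, and B is served only after C releases it.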
FIG. 1 shows how a multimedia processing entity processes a multimedia stream request under the prior art. In the figure, different types of media streams are assigned to processing entities of several media types on a multimedia control entity (for example, a SIP server). Each media type is controlled and processed by the processing entity of that media type. FIG. 1 shows two types of processing entities: type 1 and type 2. In the prior art, the processing entities of multiple media types are logical entities and are not associated with each other.
Because of the talk burst request queue, a talk burst request for a given media type must wait in the talk burst request queue of that media type until it is processed by the entity in charge of processing talk burst requests.
Under the prior art, multiple media types or a combination of multiple media types are used to negotiate and control one or more types of media streams, and the control entities work independently of each other, which tends to cause conflicts between media types in a multimedia environment. For example, a multimedia user (multimedia session terminal) may apply to one media processing entity that handles voice streams and to another media processing entity that handles video streams mixed with voice (“audio and video streams”). In a multi-party service environment, every user in the session may use both a media processing entity of voice streams and a media processing entity of audio and video streams to apply for talk bursts.
From the perspective of the media sender: in a voice session, if an audio and video session also exists, one multimedia session terminal may obtain two talk bursts. In this case, the talk bursts of different media types are independent of each other, and the two voice controls are independent of each other. If a terminal sends multiple voice streams at the same time, the PoC session becomes chaotic and the user experience is poor.
Moreover, while one multimedia session terminal holds the right to send ordinary voice and is in a voice session, another multimedia session terminal may obtain the right to send audio and video. Both multimedia session terminals then hold voice-related talk bursts, and when the two terminals send voice simultaneously, the other users in the session hear the voice from two multimedia session terminals at the same time. This leads to a poor user experience in the session. As for the two multimedia session terminals themselves, each hears the voice from the opposite multimedia session terminal in the session while it is speaking. The concurrence of multiple voices in one session is not allowed in many scenarios, and conflicts with the conventions of the multimedia multi-party service.
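The conflict described above can be reproduced with a small sketch of two control entities that arbitrate independently. The names `MediaTypeController`, `carries_voice`, and `voice_senders`, as well as the terminal identifiers, are hypothetical and exist only for this illustration; the sketch models the prior-art behavior and does not represent any particular server.

```python
class MediaTypeController:
    """Hypothetical per-media-type control entity that arbitrates on its own."""

    def __init__(self, media_type, carries_voice):
        self.media_type = media_type
        self.carries_voice = carries_voice
        self.holder = None  # terminal holding this media type's talk burst

    def request(self, terminal):
        # Each entity checks only its own state, never the other entities.
        if self.holder is None:
            self.holder = terminal
            return True
        return False


def voice_senders(controllers):
    """List the terminals currently allowed to send a voice-bearing stream."""
    return [c.holder for c in controllers
            if c.carries_voice and c.holder is not None]


voice = MediaTypeController("voice", carries_voice=True)
audio_video = MediaTypeController("audio and video", carries_voice=True)

voice.request("terminal_1")        # granted by the voice entity
audio_video.request("terminal_2")  # granted by the audio and video entity

senders = voice_senders([voice, audio_video])
```

Because neither entity consults the other, `senders` ends up containing both terminals, which corresponds to two voices being sent in the same session.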
Therefore, when multiple control entities that handle multimedia streams work independently of each other, conflicts may arise: multiple voices occur in one session, and the control entity that controls voice streams is unable to control the other control entities that are allowed to send voice streams.