In order to achieve access independence and to maintain smooth interoperation with wired terminals across the Internet, an Internet Protocol Multimedia Subsystem (IMS) core network, as specified e.g. in the 3GPP (Third Generation Partnership Project) specification TS 23.228, has been developed to conform to IETF (Internet Engineering Task Force) “Internet Standards”. The IMS enables operators of mobile or cellular networks to offer their subscribers multimedia services based on and built upon Internet applications, services and protocols. The intention is that such services are developed by mobile network operators and by third party suppliers, including those in the Internet space, using the mechanisms provided by the Internet and the IMS. The IMS thus enables convergence of, and access to, voice, video, messaging, data and web-based technologies for wireless users, and combines the growth of the Internet with the growth in mobile communications. In the IMS, the Session Initiation Protocol (SIP) is used as the main session control protocol between end user equipment and the Call State Control Functions (CSCFs) located in the IMS. SIP enables network operators to provide new features for end users, such as dialing by means of SIP Uniform Resource Identifiers (SIP URIs).
For example, the IETF is working on a SIP conferencing service. The goal is to define how conferencing-type services can be established between terminals that can act as SIP user agents. To this end, an XCON working group has been established in the IETF, which is responsible for developing a standardized suite of protocols for tightly coupled multimedia conferences. As part of the work of the XCON working group, protocols for conference control, media control and floor control are to be developed and standardized. Multimedia conferences may include any combination of different media types.
In multiparty conferencing, a media control protocol enables each participant of the conference to choose the media stream it wants to hear or view. This enhances the end users' experience in that they can view or hear a particular participant of the conference. Each participant is allowed to send a request to the conferencing server to view a particular participant, or to view multiple participants in a desired layout format such as mosaic or continuous presence mode. When the conferencing server receives such a request from an end user or end point to view multiple participants at the same time, it constructs a composite video frame of those participants and sends the composite video frame to the requesting participant.
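The composition step described above can be illustrated with a minimal sketch. It is not taken from any XCON protocol or conferencing server implementation: the function name `compose_mosaic`, the `Frame` representation (rows of grayscale pixel values), the participant names and the row-by-row tiling order are all assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical frame representation: rows of grayscale pixel values.
Frame = List[List[int]]


def compose_mosaic(streams: Dict[str, Frame], requested: List[str],
                   cols: int) -> Frame:
    """Tile the requested participants' frames, row by row, into one frame.

    Assumes every participant frame has the same height and width.
    """
    tiles = [streams[name] for name in requested]
    tile_h, tile_w = len(tiles[0]), len(tiles[0][0])
    rows = -(-len(tiles) // cols)  # ceiling division
    # Start from a black canvas so that unused grid cells stay empty.
    canvas = [[0] * (tile_w * cols) for _ in range(tile_h * rows)]
    for index, tile in enumerate(tiles):
        top = (index // cols) * tile_h
        left = (index % cols) * tile_w
        for y in range(tile_h):
            canvas[top + y][left:left + tile_w] = tile[y]
    return canvas


# Four hypothetical participants, each with a 2x2 frame of a single value.
streams = {name: [[value] * 2 for _ in range(2)]
           for name, value in [("alice", 1), ("bob", 2),
                               ("carol", 3), ("dave", 4)]}
frame = compose_mosaic(streams, ["alice", "bob", "carol", "dave"], cols=2)
```

In this sketch the 2×2 result places the first two requested participants in the top row and the remaining two in the bottom row, and the frame that is sent out consists of pixel data only.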
However, the end participant has no knowledge of which participants are shown in the composite video frame. For example, if the conferencing server sends a 2×2 video frame of four different participants to a particular participant of the conference, the endpoint cannot determine where each participant is located in that composite video frame. There is thus no way for an end user to display on its screen titles allocated to the individual images in the composite video frame. Hence, in present multiparty video conferencing systems of circuit switched or packet switched networks, no mechanism allows a participant to request from the conferencing server, or to specify to it, the location of a particular participant's video or of multiple participants' videos in a composite video frame. Currently, the conferencing server sends a composite video frame, for example in a continuous presence mode, only if the conferencing server is configured in this particular manner. Apart from this, there is no way for a participant to request particular video streams in a particular format or order from the conferencing server.