1. Field of the Invention
The present invention relates to a method of controlling an establishment of a multimedia call between communication equipments in different network environments, a corresponding system, a corresponding computer program product, and a corresponding network control element. In particular, the present invention relates to a method, system, computer program product and network control element by means of which a multimedia call, for example a video call, originating from a first network environment, such as an IP multimedia subsystem, and directed to a receiving party located in a second network environment, such as a circuit switched (CS) communication network system, can be successfully established.
For the purpose of the present invention to be described herein below, it should be noted that
    a communication equipment may for example be any device by means of which a user may access a communication network; this implies mobile as well as non-mobile devices and networks, independent of the technology platform on which they are based; only as an example, it is noted that communication equipments operated according to principles standardized by the 3rd Generation Partnership Project 3GPP and known for example as UMTS terminals are particularly suitable for being used in connection with the present invention;
    although reference was made herein before to video call, this exemplifies only a specific example of content; content as used in the present invention is intended to mean multimedia data of at least one of audio data, video data, image data, text data, and meta data descriptive of attributes of the audio, video, image and/or text data, any combination thereof or even, alternatively or additionally, other data such as, as a further example, program code of an application program to be accessed/downloaded;
    method steps likely to be implemented as software code portions and being run using a processor at one of the entities described herein below are software code independent and can be specified using any known or future developed programming language;
    method steps and/or devices likely to be implemented as hardware components at one of the entities are hardware independent and can be implemented using any known or future developed hardware technology or any hybrids of these, such as MOS, CMOS, BiCMOS, ECL, TTL, etc., using for example ASIC components or DSP components, as an example;
    generally, any method step is suitable to be implemented as software or by hardware without changing the idea of the present invention;
    devices or means can be implemented as individual devices or means, but this does not exclude that they are implemented in a distributed fashion throughout the system, as long as the functionality of the device is preserved.
2. Related Prior Art
In recent years, communication networks have been increasingly extended all over the world, e.g. wire based communication networks, such as the Integrated Services Digital Network (ISDN), or wireless communication networks, such as the cdma2000 (code division multiple access) system, cellular 3rd generation (3G) communication networks like the Universal Mobile Telecommunications System (UMTS), cellular 2nd generation (2G) communication networks like the Global System for Mobile communications (GSM), the General Packet Radio Service (GPRS), the Enhanced Data Rates for GSM Evolution (EDGE), or other wireless communication systems, such as the Wireless Local Area Network (WLAN). Various organizations, such as the 3rd Generation Partnership Project (3GPP), the International Telecommunication Union (ITU), the 3rd Generation Partnership Project 2 (3GPP2), the Internet Engineering Task Force (IETF), and the like are working on standards for telecommunication networks and multiple access environments.
In general, the system structure of a communication network is such that one party, e.g. a subscriber's communication equipment, such as a mobile station, a mobile phone, a fixed phone, a personal computer (PC), a laptop, a personal digital assistant (PDA) or the like, is connected via transceivers and interfaces, such as an air interface, a wired interface or the like, to an access network subsystem. The access network subsystem controls the communication connection to and from the communication equipment and is connected via an interface to a corresponding core or backbone network subsystem. The core (or backbone) network subsystem switches the data transmitted via the communication connection to a destination party, such as another communication equipment, a service provider (server/proxy), or another communication network. It is to be noted that the core network subsystem may be connected to a plurality of access network subsystems. Depending on the communication network used, the actual network structure may vary, as known to those skilled in the art and defined in the respective specifications, for example, for UMTS, GSM and the like.
Generally, for properly establishing and handling a communication connection between network elements such as the communication equipment and another communication equipment or terminal, a database, a server, etc., one or more intermediate network elements such as control network elements, support nodes or service nodes are involved.
One application whose importance for current and future communication systems is increasing is multimedia communication services. A multimedia call is a communication where, for example, sound (voice), text and picture are used simultaneously. Multimedia calls generally require the transmission of several different types of data (video, audio, and the like) in parallel, and these data are to be transmitted and received by various different types of communication equipments or network elements, so that plural communication protocols must be negotiated and appropriate parameters for the communication must be adjusted.
In 3G networks, the 3GPP mandates the use of a bandwidth guaranteed circuit switched bearer for such calls. Furthermore, as the standard to be used for such a multimedia communication, a 3G-324M system is to be employed. The 3G-324M system represents a derivative of the ITU-T H.324 protocol, which in turn requires the employment of several further components or protocols. The general procedures for establishing a multimedia communication are known to those skilled in the art so that a detailed description thereof is omitted herein.
A current technology to merge the Internet with the cellular telecommunication world is the Internet Protocol (IP) Multimedia Subsystem IMS. The goal is to make services offered by the Internet available nearly everywhere by means of cellular mobile communication systems. IMS has been part of the 3GPP standards since Release 5. The Session Initiation Protocol SIP is used as a part of the signaling mechanisms between the IMS and a user equipment. Details of the structure and procedures executed in IMS are described in the related standards and are commonly known to a person skilled in the art so that a description thereof is omitted herein for the sake of simplicity.
It is expected that current circuit switched networks will evolve towards the IMS in the coming years. Thus, for a relatively long period of time, both CS networks and IMS will be used side by side. Hence, it is necessary to ensure interworking between both systems so that the end user experience is not jeopardized. For example, when Voice over IP (VoIP) calls are available in IMS, then interworking between IMS and CS networks for voice calls must be possible. This type of interworking has been specified by 3GPP standardization bodies and can thus be implemented according to those standards.
However, in the case of multimedia calls, in particular video calls, the situation is different. Such multimedia or video calls are an important feature of newer 3G networks. Nevertheless, there are currently no agreed standards for video call interworking between IMS and CS core networks. For example, a video call uses the H.324M protocol suite in CS core networks while SIP is used in the IMS. It is difficult to establish a proper interworking between SIP and H.324M, especially for calls originating from the IMS domain, which have not been addressed in any standards.
Many of the interworking cases can be handled by directly mapping the SIP SDP (Session Description Protocol) description to the 3G UE terminal capabilities, but in some cases the IMS originated call arrives at the CS core network without an initial description of the media supported by the calling party, for example in the form of an SDP description. In such a case, the CS core network is not able to determine whether a speech call or a video call is to be established.
In other words, when a SIP terminal does not provide its media capabilities in the initial phase of the call, for example in an INVITE message that it sends towards the CS core network, it is difficult to determine which type of call is to be established. The absence of information on the media capability, i.e. the absence of the SDP description in the initial messaging, can occur, for example, in the case of so-called third party call control as described in RFC 3725. Another possibility is that a flow originated by a 3G-324M terminal is routed back from the IMS domain to the CS network, for example due to call forwarding, which is a basic supplementary service. Furthermore, an SDP description may be missing when SIP is used to bridge between ITU-T based multimedia calls. It is to be noted that other cases are also conceivable in which an initial media capability indication is not received at the CS network from the IMS side.
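The missing-SDP condition described above can be illustrated with a minimal Python sketch. It assumes a raw SIP request with CRLF line endings and ignores header folding and multipart bodies. The helper names has_sdp_offer and offered_media, as well as the example addresses and ports, are hypothetical and are not taken from any standard or from an actual MSS/MGCF implementation.

```python
CRLF = "\r\n"

# Hypothetical INVITE without any SDP body (the problematic case).
INVITE_WITHOUT_SDP = CRLF.join([
    "INVITE sip:bob@example.com SIP/2.0",
    "From: <sip:alice@example.com>;tag=1",
    "To: <sip:bob@example.com>",
    "Call-ID: example-call-1",
    "CSeq: 1 INVITE",
    "Content-Length: 0",
    "", "",
])

# Hypothetical INVITE carrying an SDP offer with audio and video
# (Content-Length omitted for brevity).
INVITE_WITH_SDP = CRLF.join([
    "INVITE sip:bob@example.com SIP/2.0",
    "From: <sip:alice@example.com>;tag=2",
    "To: <sip:bob@example.com>",
    "Call-ID: example-call-2",
    "CSeq: 1 INVITE",
    "Content-Type: application/sdp",
    "",
    "v=0",
    "o=- 0 0 IN IP4 192.0.2.1",
    "s=-",
    "c=IN IP4 192.0.2.1",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "m=video 51372 RTP/AVP 34",
    "",
])


def has_sdp_offer(raw_message: str) -> bool:
    """Return True if the message carries a non-empty application/sdp body."""
    head, _, body = raw_message.partition(CRLF + CRLF)
    content_type = ""
    for line in head.split(CRLF)[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        # "c" is the compact form of the Content-Type header in SIP.
        if sep and name.strip().lower() in ("content-type", "c"):
            content_type = value.strip().lower()
    return content_type.startswith("application/sdp") and bool(body.strip())


def offered_media(raw_message: str) -> list:
    """Return the media types named on the SDP m= lines, or [] without SDP."""
    if not has_sdp_offer(raw_message):
        return []
    _, _, body = raw_message.partition(CRLF + CRLF)
    return [
        line[2:].split()[0]
        for line in body.splitlines()
        if line.startswith("m=") and line[2:].split()
    ]
```

With the first example message, offered_media returns an empty list, so a receiving control element has nothing from which to infer whether speech or video is intended. With the second message, the m= lines name both audio and video.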
When the media capability information is not received, the CS core network (i.e. the respective control element, like a Mobile Switching Center MSC) does not know whether the SIP client, which is the calling party, is willing to make or capable of making a video call. The CS core network control element has no means of knowing what kind of call is to be established towards the 3G UE.
In FIG. 6, a signaling diagram is shown which illustrates a conventional call establishment between a calling party located in the IMS and a receiving party located in a circuit switched 3G network. The call establishment is controlled by a network control element, for example an MSC Server (MSS) which comprises a Media Gateway Control Function (MGCF). The MGCF is a gateway which enables communication between IMS and CS users. It is to be noted that the signaling diagram in FIG. 6 shows the procedure in a simplified manner. As known by those skilled in the art, there are several other network elements and additional signaling messages involved in the call establishment control.
According to FIG. 6, the IMS sends an INVITE message (M30) towards the CS core network, i.e. to the MSS/MGCF, for initializing the call. As mentioned above, the INVITE message does not contain any SDP descriptor defining the media supported by the SIP terminal in the IMS and the desired call mode (for example, speech or video). Thus, the MSS/MGCF is not aware of the type of the call to be established. Then, with message M31, the MSS/MGCF sends a SETUP message comprising a Bearer Capability information indicating speech (BCSP) to the 3G user equipment. The 3G UE responds with message M32 CALL CONFIRMED, in which the bearer capability (speech) supported by the 3G UE is confirmed. By means of the following messages M33 ALERTING, M35 CONNECT, and M36 CONNECT ACK between the 3G UE and the MSS/MGCF, as well as messages M34 180 RINGING, M37 200 OK (SDP), and M38 ACK between the MSS/MGCF and the IMS, a speech call is established by the MSS/MGCF irrespective of whether or not the SIP terminal, i.e. the calling side, wanted such a call to be set up. In other words, by means of the conventional call establishment control, it is not possible to establish a video call even if the calling party intended to do so and the receiving party is able to perform such a call.
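The message sequence of FIG. 6 described above can be written down as a small Python data sketch. The Message type and the functions conventional_flow and established_bearer are hypothetical names introduced only for illustration. The sender and receiver of messages M33 to M38 are not stated explicitly in the description and are assumed here from the usual behaviour of the respective protocols.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    ref: str                      # reference sign used in FIG. 6
    sender: str
    receiver: str
    name: str
    bearer: Optional[str] = None  # bearer capability carried, if any


def conventional_flow() -> List[Message]:
    """Return the simplified conventional flow for an INVITE without SDP."""
    mgcf, ims, ue = "MSS/MGCF", "IMS", "3G UE"
    return [
        Message("M30", ims, mgcf, "INVITE (no SDP)"),
        # Without SDP the control element signals speech (BCSP).
        Message("M31", mgcf, ue, "SETUP", bearer="speech"),
        Message("M32", ue, mgcf, "CALL CONFIRMED", bearer="speech"),
        # Directions from here on are assumptions (see lead-in).
        Message("M33", ue, mgcf, "ALERTING"),
        Message("M34", mgcf, ims, "180 RINGING"),
        Message("M35", ue, mgcf, "CONNECT"),
        Message("M36", mgcf, ue, "CONNECT ACK"),
        Message("M37", mgcf, ims, "200 OK (SDP)"),
        Message("M38", ims, mgcf, "ACK"),
    ]


def established_bearer(flow: List[Message]) -> Optional[str]:
    """Return the bearer capability confirmed by the called 3G UE."""
    for msg in flow:
        if msg.name == "CALL CONFIRMED":
            return msg.bearer
    return None
```

In this sketch the bearer is fixed when M31 is sent, before any media description from the calling side is available, which is why the outcome does not depend on what the SIP terminal intended.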
As mentioned above, conventionally a speech call is always established. Alternatively, the call may be torn down, which is not a desirable solution. This means that, when speech is always selected as the call mode, a video call cannot be established. However, if a video call is established instead and it later becomes evident that the SIP terminal wanted to establish a speech call, or if the SIP terminal only supports speech calls, then the video call in the CS core can only carry the speech component while resources for a video call are also reserved (and charged). This means an expensive call for the receiving party, especially if the called party is roaming. Since such high charging is not a desired option, the only way is to always make a speech call in this kind of call case.
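The trade-off described above can be summarized as a small decision function in Python. This is only an illustrative sketch: the function name conventional_outcome and the result strings are hypothetical, and the function merely restates the four combinations of bearer choice and caller intent discussed in the text.

```python
def conventional_outcome(bearer_choice: str, caller_mode: str) -> str:
    """Describe the result when the bearer is chosen without knowing the caller's mode."""
    valid = ("speech", "video")
    if bearer_choice not in valid or caller_mode not in valid:
        raise ValueError("both arguments must be 'speech' or 'video'")
    if bearer_choice == "speech":
        if caller_mode == "speech":
            return "speech call established"
        # Caller wanted video, but only a speech bearer was set up.
        return "speech call established; video call not possible"
    if caller_mode == "video":
        return "video call established"
    # Video bearer set up although the caller only wants or supports speech.
    return "speech carried on video bearer; video resources reserved and charged"
```

Calling conventional_outcome("video", "speech") yields the costly case for the receiving party, which is why the conventional approach always selects speech.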