Fixed and mobile telephones have so far been used mainly for making voice calls. The service of communicating limited text messages between mobile terminals, such as SMS (Short Message Service) messages, is also available. These are fairly straightforward telecommunication services which use well-established technologies under more or less fixed prerequisites. In the standardized communication protocols used for calls between fixed and/or mobile terminals, predefined sets of communication rules and parameters are typically used, which are known by the terminals and within their capabilities. Hence, it is presumed that both a calling terminal and a called terminal are capable of conducting the call based on such fixed communication parameters valid for each terminal. Therefore, a traditional call, such as a simple voice call, can be established quite quickly since both terminals “know” beforehand what parameters to use, e.g., concerning transmission and encoding schemes. No procedure is thus needed to determine which rules and parameters to use before a call can be established and executed.
A multitude of different telephony services are now being developed, which can be employed in particular as new communication technologies are introduced, providing greater network capacity and higher transmission rates. For example, GPRS (General Packet Radio Service) and WCDMA (Wideband Code Division Multiple Access) technologies are currently emerging for enabling wireless telephony services requiring a wide range of different data rates.
Some new services involve real-time transmission of video information as well as audio information, and may further include the transmission of added data representing text, documents, images, audio files and video files in various different formats and combinations. Such services are generally referred to as “multimedia” services, which term will be used in this description to represent any telephony services that involve the transfer of information in addition to ordinary voice, thereby requiring the determination of session parameters.
A great number of sophisticated new mobile terminals are also becoming available on the market which are equipped with functionality to match the new services. As a result, the terminals will have a multitude of different capabilities with respect to, e.g., codecs (coders/decoders), presentation functionality and transmission rates. The term “terminal” will be used in this description to broadly represent any type of communication station, or a group of terminals in conference using a Multipoint Conference Unit (MCU) which will represent the group of terminals in this context.
A problem that inevitably arises is that the prerequisites for each specific session using multimedia services will no longer be fixed and known beforehand, but will vary depending on the invoked service and the capabilities of the calling and called terminals, respectively, as well as other factors. During a session, certain so-called session parameters must be used by both the calling and called terminals in order to communicate the desired information. Such session parameters define the rules of communication and may be related to available codecs and multiplexing schemes, which will be described in more detail below.
The session parameters may further depend on predefined user preferences and subscription terms, which may be tailor-made for each subscriber or defined for specific groups of subscribers. In order to establish a session between two terminals involving multimedia services, the session parameters must therefore first be selected and determined in a session setup procedure, before the actual session or call can begin and use those session parameters.
FIG. 1 illustrates schematically a typical communication scenario between two terminals A and B. In this case, terminal A is a mobile telephone being wirelessly connected to a mobile access network 100, e.g., a WCDMA network. On the other side, terminal B is a fixed telephone being connected to a fixed access network 102, e.g., a PSTN (Public Switched Telephone Network). The two access networks 100 and 102 are in turn connected to a general “backbone” network 104, which in practice may be any type of communication network, or combination of different networks. It is assumed in this example that the networks 100, 102 and 104 use more or less known transport techniques, and therefore need no further description in this context.
In the present example, terminal A calls terminal B in order to have a video session involving two-way transmission of both video and audio information. Each terminal A, B is equipped with a viewing screen Sa and Sb, respectively, and both are capable of communicating and presenting real-time video and audio. In that respect, the capabilities of the terminals A and B are fairly similar. However, they will most likely have different capabilities regarding codecs and multiplexing, as explained above, and neither terminal has any knowledge of the other's capabilities. Therefore, the terminals A, B must exchange information regarding their specific capabilities and preferences, in order to negotiate and agree on suitable common session parameters that both can use during the forthcoming call session. In particular, the terminals must select coding/decoding schemes (i.e. codec types), and agree on a multiplexing scheme for mixing different data streams for video and audio information on a given physical channel, such that the available bandwidth is utilized in a suitable way.
H.324 is a standard defined by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) for multimedia telephony involving real-time video and audio. H.324 has been designed to handle such communication in a flexible way between terminals having differentiated capabilities, while also allowing the use of a great variety of different services. In particular, a specification called 3G-324M has been defined, based on H.324, to support real-time communication of wireless multimedia services over existing circuit-switched wireless networks. Although the technology disclosed herein is not limited or restricted by any procedures specified in H.324, this standard will be referred to as an example of how a multimedia call can be established according to the technology disclosed herein.
Thus, before a video call between terminals A and B can begin, a communication session must be established and the session parameters to use in the call must be determined. According to H.324, establishing a communication session is divided into two parts: a bearer setup phase and a session setup phase.
In the bearer setup phase, a physical communication channel is reserved throughout the communication path between the terminals A, B in both directions. The physical channel may be similar or different in the two directions, depending on whether the call is symmetric or asymmetric. A physical end-to-end channel typically comprises a series of paths through different intermediate networks, e.g. radio channels and/or fixed circuit switched voice or data channels. The details of the bearer setup phase will not be described here further, however, since they do not concern the present invention.
When the physical channel has been established, the session setup phase can be executed, which is performed only by the two terminals, without involving any intermediate network node. The session setup phase is executed in order to determine the above-mentioned session parameters that both terminals are capable of using during the call session. Hence, it is entirely up to the terminals how to utilize the given physical channel. The session setup phase typically comprises several steps, such as: 1) exchange of terminal capabilities, 2) master-slave determination, 3) selecting a multiplexing scheme, and 4) opening of logical channels. These procedure steps, basically as dictated by the H.324 standard, will now be briefly described with reference to the flow chart in FIG. 2.
In a first step 200, terminal capabilities are exchanged where each terminal sends to the other terminal at least a list comprising the codec types and a set of multiplex parameters that the terminal can handle, thereby advertising its capabilities. In H.324, such information is sent in a “TCS” (Terminal Capability Set) message, and each receiving terminal must acknowledge receipt thereof. This message can be sent again at any time during the session for updating terminal capabilities.
Master-slave determination is a necessary procedure for appointing one terminal as master and the other terminal as slave, in a next step 202, e.g. in order to avoid signalling conflicts in the communication dialogue during the session setup. According to H.324, each terminal generates a 24-bit random number called “SDN” (Status Determination Number) which is transmitted in an “MSD” (Master-Slave Determination) message, which must likewise be acknowledged by the receiving terminal. A comparison of the two SDNs then unambiguously decides the master-slave appointments, according to some predefined rule. The master-slave appointments may also be used during the actual session.
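As a minimal sketch of the comparison in step 202, the following Python fragment draws a 24-bit number and decides the local terminal's role. The rule used here (the larger number becomes master, and equal numbers force a new draw) is an assumption chosen only for illustration; the text above merely states that some predefined rule is applied, and the actual rule of the standard may differ. The function names new_sdn and decide_role are hypothetical.

```python
import secrets

SDN_BITS = 24  # width of the status determination number, as stated in the text


def new_sdn() -> int:
    """Draw a random 24-bit status determination number."""
    return secrets.randbits(SDN_BITS)


def decide_role(local_sdn: int, remote_sdn: int) -> str:
    """Return 'master', 'slave' or 'retry' for the local terminal.

    Assumed rule (illustrative only): the larger number wins,
    and a tie means both terminals must draw new numbers.
    """
    if local_sdn == remote_sdn:
        return "retry"
    return "master" if local_sdn > remote_sdn else "slave"
```

Because both terminals apply the same rule to the same pair of numbers, they reach complementary results without any further message exchange, except in the tie case.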
A plurality of multiplexing schemes have preferably been defined to control how plural information streams can be multiplexed in different ways into a single bitstream to be transmitted over the physical channel established in the bearer setup phase described above. A video call typically requires at least three separate information streams for audio, video, control information and optionally other data, respectively, each requiring at least one logical channel. The ratio between the different streams can be varied dynamically, depending on the needs for transmission in each stream, in order to optimally utilize the available bandwidth, i.e. the given physical channel. For example, H.324 uses a multiplexing standard called H.223 which defines different multiplex tables controlling the allocation of various streams of audio, video, data and control information in predefined data sequences called packets. Any number of logical channels may be used, as specified by the multiplex table.
Each packet may contain a variable pattern of logical channel allocation into bit positions within the packet, and the channel allocation may be different in each successive packet. The packet length can also be varied. The channel allocation scheme for each specific packet is determined by a selected multiplex table entry as indicated by a short index number included in a header of each packet. Then, it is not necessary to transmit any further overhead information regarding the multiplexing. However, the multiplex packet structure must first be defined for each index number during the session setup phase.
Thus, following the master-slave determination step 202, suitable multiplexing schemes are selected in a next step 204, when the terminals negotiate and agree on a multiplex table configuration to use during the forthcoming session. According to H.324, each terminal then sends a so-called “MES” (Multiplex table Entry Send) message, comprising a list of index numbers and the respective packet structure definitions. The receiving terminal must also acknowledge or reject each proposed index and packet structure in a responding MES message. New and updated multiplex tables may also be sent in a further MES message at any time during a session. If a packet is received having an undefined index number, that packet will be discarded by the receiving terminal.
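The role of the index number can be sketched as follows, under simplifying assumptions that are not taken from H.223: the index is carried in a single leading byte, channel slots are allocated in whole bytes rather than bit positions, and each table entry is a fixed list of slots. The name demultiplex and the channel labels are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A table entry lists (logical_channel, byte_count) slots in payload order.
Entry = List[Tuple[str, int]]


def demultiplex(packet: bytes, table: Dict[int, Entry]) -> Optional[Dict[str, bytes]]:
    """Split one packet into per-channel bytes, or return None to discard it."""
    if not packet:
        return None
    # Assumption: the first byte is the multiplex table index.
    index, payload = packet[0], packet[1:]
    entry = table.get(index)
    if entry is None:
        return None  # undefined index number: the packet is discarded
    out: Dict[str, bytes] = {}
    pos = 0
    for channel, size in entry:
        # Append this slot's bytes to the channel it belongs to.
        out[channel] = out.get(channel, b"") + payload[pos:pos + size]
        pos += size
    return out
```

With a table agreed during session setup, for example {1: [("audio", 2), ("video", 3)]}, a packet whose header carries index 1 is split into two audio bytes and three video bytes without any further overhead information in the packet itself.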
Finally, in a step 206, all logical channels needed for the invoked service or services are established or “opened” according to terminal capabilities which are common to both terminals. Preferably, a highest priority codec that both terminals can use for each specific media stream during the session is selected for that stream. According to H.324, one or both terminals send a so-called “OLC” (Open Logical Channel) message to the other terminal containing one or more proposed codecs, preferably with indicated priorities, with respect to the TCS message received from the other terminal in step 200. Each receiving terminal may then accept or reject the proposed codec or codecs, depending on its own capabilities and/or preferences. When the terminals have finally agreed to use a specific codec, or set of codecs, corresponding logical channels are established, and the actual session or video call can begin.
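The selection of a highest priority common codec per media stream can be illustrated with the short sketch below. It assumes that the proposing terminal holds its own codecs as an ordered preference list per stream and that the capabilities received from the other terminal are available as plain lists; the function name select_codecs and the data layout are hypothetical, and the codec names in the usage note are examples only.

```python
from typing import Dict, List, Optional


def select_codecs(
    proposed: Dict[str, List[str]],
    remote_caps: Dict[str, List[str]],
) -> Dict[str, Optional[str]]:
    """For each media stream, pick the first proposed codec the remote side supports.

    'proposed' maps a stream name to codecs in descending priority.
    A value of None means no common codec exists for that stream.
    """
    chosen: Dict[str, Optional[str]] = {}
    for stream, preferences in proposed.items():
        supported = set(remote_caps.get(stream, []))
        chosen[stream] = next((c for c in preferences if c in supported), None)
    return chosen
```

For instance, if terminal A proposes ["MPEG-4", "H.263"] for video and terminal B has advertised only "H.263", the sketch selects "H.263" for the video stream. A stream with no common codec yields None, corresponding to a rejected proposal.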
The above-described example illustrates how certain communication conditions or terms, as defined by session parameters, can be determined before a call session can be executed. It should be noted that the order of steps 202 and 204, as well as the order of steps 204 and 206, respectively, may be reversed depending on the implementation. The term “session parameters” is used here to generally represent any specifics determining how information should be communicated and interpreted. The example described above was focused on session parameters related to codecs and multiplexing schemes. However, other important session parameters may be required, such as a parameter relating to error correction/protection which is typically included in the OLC message according to the H.245 standard, which is a part of the H.324 standard.
However, it takes some time to execute the above-described bearer setup and session setup procedures. The bearer setup phase duration has been measured to be in the range of 7 to 14 seconds for establishing a call between two mobile terminals, but can probably be reduced to approximately 5 seconds if the presently available methods are made more efficient. The session setup phase duration has been measured to be in the range of 4 to 7 seconds for existing products. Since the session setup phase takes place after the bearer setup phase, the two durations add up: the total delay before the call can begin is 11 to 21 seconds with the measured figures, and would still be at least about 9 seconds with the reduced bearer setup time. These long delays are a considerable drawback, since they reduce the attraction of multimedia services. The delays become even more tiresome if the service mode is changed during an ongoing session, such as when repeatedly switching between video mode and voice-only mode. The setup procedure must then be repeated at each switching of service modes.
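The addition of the two phase durations can be written out explicitly. The sketch below uses only the figures quoted above (7 to 14 seconds and approximately 5 seconds for bearer setup, 4 to 7 seconds for session setup); the function name total_setup_delay is hypothetical.

```python
from typing import Tuple

Range = Tuple[float, float]  # (minimum, maximum) duration in seconds


def total_setup_delay(bearer: Range, session: Range) -> Range:
    """Add the two phases, since session setup starts only after bearer setup."""
    return (bearer[0] + session[0], bearer[1] + session[1])


# Figures quoted in the text above.
measured = total_setup_delay((7, 14), (4, 7))   # measured bearer setup
improved = total_setup_delay((5, 5), (4, 7))    # bearer setup reduced to about 5 s
```

With these inputs, measured evaluates to (11, 21) and improved to (9, 12), so even an optimized bearer setup leaves a delay of roughly 9 seconds or more before the call can begin.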
Hence, it is generally desirable to minimise delays imposed by session establishment. It is difficult to reduce the duration of the session setup phase without substantially revising the standard, since it includes many different steps that must be executed consecutively, such as the steps illustrated in FIG. 2, involving several round-trip delays, among other things. This phase can be further delayed if the quality of the established and currently used physical channel is bad, resulting in bit errors in the transmitted data and the need for retransmissions. In particular, messages containing terminal capabilities, such as the TCS message in H.324, are typically quite long and will cause considerable delay if retransmitted. Such long messages can be divided into several segments that may be retransmitted separately.
In general, similar problems may exist for any type of session setup where the channel carrying the signalling messages is subject to long round-trip delays, or has a narrow bandwidth compared to the amount of information transferred, or both, in combination with requiring plural round-trips to establish or re-establish the session. One example of another specification for session setup where these problems may also occur is SIP, “Session Initiation Protocol” (IETF RFC 3261 et al.). SIP is an application-layer control (signalling) protocol for creating, modifying and terminating sessions with one or more participants. These sessions include Internet multimedia conferences, Internet telephone calls and multimedia distribution.
Hence, a solution is needed for reducing the current long delays involved with the establishment of sessions requiring the determination of parameters, e.g. in multimedia calls. In particular, it is desirable to still use presently defined routines and standards, not requiring any new standard specifications and preferably using existing sets of signalling messages.