The Push to Talk Over Cellular (PoC) service allows mobile users to create group sessions where participants can have voice and data communications on a one-to-one or one-to-many basis [1]. The voice communications are similar to walkie-talkie services where the terminals have dedicated ‘talk’ buttons. Only one person can talk at a given time and each talk burst is relatively short, typically lasting a few seconds. Users can also exchange instant messages. Talk bursts are expected to evolve into bursts of voice and video streams, and instant messages are expected to carry rich media content such as audio, video, text and animation.
The PoC service specifications are defined by the Open Mobile Alliance (OMA). The service is based on the Session Initiation Protocol (SIP) within the Internet Protocol Multimedia Subsystem (IMS) architecture of the Third Generation Partnership Project (3GPP) or 3GPP2. More specifically, the PoC service is built on top of a SIP/IP core that can meet the specifications of the 3GPP IMS [4, 5] or the 3GPP2 IMS [6, 7].
The overall PoC architecture for the generic case comprises a plurality of PoC clients, each connected to its own Participating PoC Function (over its own network) and all participating in a common session controlled by a central Controlling PoC Function. All the Participating PoC Functions are connected to the central Controlling PoC Function.
It is important to note that the Controlling PoC Function is responsible for managing who has permission to talk (i.e. who may send audiovisual media or multimedia packets) at any given time, and for copying media packets from one source to multiple destinations. The Participating PoC Function cannot perform those operations.
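The two responsibilities described above, granting permission to talk and copying media packets to the other participants, can be illustrated with a minimal sketch. The class and method names below (`ControllingPoCFunction`, `request_floor`, `release_floor`, `forward_media`) are hypothetical and chosen for illustration only; they are not taken from the OMA PoC specifications, and a real implementation would involve SIP signalling and a media transport that are omitted here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ControllingPoCFunction:
    """Toy model (assumption): floor arbitration plus media fan-out."""

    participants: List[str] = field(default_factory=list)
    floor_holder: Optional[str] = None  # participant currently allowed to talk

    def request_floor(self, user: str) -> bool:
        """Grant the floor only if the user is in the session and the floor is free."""
        if user not in self.participants:
            return False
        if self.floor_holder is None or self.floor_holder == user:
            self.floor_holder = user
            return True
        return False  # someone else is talking; the request is denied

    def release_floor(self, user: str) -> None:
        """Free the floor, but only when the current holder asks."""
        if self.floor_holder == user:
            self.floor_holder = None

    def forward_media(self, sender: str, packet: bytes) -> Dict[str, bytes]:
        """Copy one packet from the floor holder to every other participant."""
        if sender != self.floor_holder:
            return {}  # packets from users without the floor are dropped
        return {p: packet for p in self.participants if p != sender}


cpf = ControllingPoCFunction(participants=["alice", "bob", "carol"])
cpf.request_floor("alice")                      # alice obtains the floor
copies = cpf.forward_media("alice", b"burst")   # one copy each for bob and carol
```

In this sketch a request made while another participant holds the floor is simply refused; queuing or priority policies are left out.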
Because of the diversity of terminals and networks, interoperability issues arise. For instance, 3GPP mandates the AMR (Adaptive Multi-Rate) narrowband speech codec as the default speech codec in the PoC service [2]. 3GPP also mandates support of the AMR wideband speech codec if the User Equipment on which the PoC Client is implemented uses a 16 kHz sampling frequency for speech. On the other hand, 3GPP2 mandates the EVRC (Enhanced Variable Rate Codec) speech codec as the default speech codec [3]. Therefore, 3GPP and 3GPP2 PoC terminals supporting only the AMR and EVRC audio codecs, respectively, would not be able to establish a PoC session together, because they share no common codec. The same incompatibilities are expected to arise for instant messages containing video and other media. To solve this problem, transcoding is required. Transcoding converts media, in a network element, from one format to another to meet each participant's terminal capabilities.
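The codec mismatch described above can be made concrete with a short sketch that decides, for each receiver, whether a sender's stream can be forwarded as is or must be transcoded. The function name `plan_transcoding`, the terminal identifiers and the rule of picking the receiver's first listed codec as the target are assumptions made for illustration; codecs are represented only by their names, and no actual media conversion is performed.

```python
from typing import Dict, List, Optional


def plan_transcoding(
    sender_codec: str,
    receiver_codecs: Dict[str, List[str]],
) -> Dict[str, Optional[str]]:
    """Map each receiver to None (no transcoding) or to the target codec.

    receiver_codecs lists, per receiver, the codecs it supports in order
    of preference (an assumption of this sketch).
    """
    plan: Dict[str, Optional[str]] = {}
    for receiver, supported in receiver_codecs.items():
        if sender_codec in supported:
            plan[receiver] = None           # stream can be forwarded unchanged
        elif supported:
            plan[receiver] = supported[0]   # transcode to the preferred codec
        else:
            raise ValueError(f"{receiver} advertises no codec")
    return plan


# A 3GPP terminal talks with AMR; one receiver supports AMR, the other only EVRC.
plan = plan_transcoding(
    "AMR",
    {"ue_3gpp": ["AMR", "AMR-WB"], "ue_3gpp2": ["EVRC"]},
)
# plan == {"ue_3gpp": None, "ue_3gpp2": "EVRC"}
```

Only the 3GPP2 terminal requires a transcoded copy in this example; the network element performing the conversion is deliberately left unspecified.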
Since the PoC service is built on top of a 3GPP/3GPP2 IMS SIP/IP core, the media is controlled and processed by the MRFC/MRFP (Media Resource Function Controller/Media Resource Function Processor) [4, 8], which uses H.248/MGCP (Media Gateway Control Protocol) [9-11] for communication purposes. However, these specifications are quite complex, and developing a solution that conforms to those protocols requires considerable effort. H.248/MGCP has also been criticized and challenged because it is complex and costly, and because it is the only key IMS system component that is not SIP-based. Furthermore, although the MRFC/MRFP functionalities and interfaces are well-defined, their usage in a PoC context is not. For those reasons, there is a need to address the problem of transcoding in the PoC application with a more generic framework that is not limited to MRFC/MRFP and H.248/MGCP.
In the PoC standard, the need for transcoding is recognized but no detailed solutions are provided. It is stated in [1], without further details, that transcoding may be performed by the Controlling PoC Function (CPF), the Participating PoC Function (PPF), or both. It is therefore important to develop a transcoding architecture that supports various configurations and use cases. In some cases, it is also highly desirable that transcoding be added in a transparent fashion, so that it can work with already deployed PoC equipment.
In summary, there is a need for a generic solution supporting transcoding in the PoC context. The solution should be compatible with the existing PoC architecture and protocols so that it can be accepted and integrated into standard schemes such as 3GPP, 3GPP2 and OMA. The solution also needs to be flexible enough to adapt to different equipment deployment scenarios and constraints.