3GPP specifies AMR and AMR-WB as mandatory speech codecs for voice services in 3G networks. These codecs are also mandatory for the 3GPP VoIP service that is specified within the 3GPP multimedia telephony service via IMS (MTSI). The governing specification for media handling and interaction is 3GPP TS 26.114. Despite the mandatory status of these codecs, there is presently a desire within 3GPP to specify new voice codecs that will enable even higher service quality than is possible with AMR-WB.
However, introducing a new speech codec into a speech communications system may be problematic in some respects. One problem is that there is always an installed base of legacy equipment (both terminals and network infrastructure) that supports only the existing 3GPP codecs, or just one of them, for instance AMR-WB, rather than the new codec. This may lead to interoperability problems in which communication between new and legacy equipment is not possible unless proper mechanisms are implemented in the system. Traditional ways to address this problem are the provisioning of transcoders, e.g. in media gateways, that translate between the new and the old coding formats, or the provisioning of the legacy codecs alongside the new codec in new terminals, which allows the legacy coding format to be chosen when a connection to a legacy terminal is established. The latter method requires a capability exchange between the terminals prior to the actual speech connection that identifies a common codec that both terminals support. Within the IMS, the session description protocol (SDP), IETF RFC 4566, is used to carry out this capability exchange.
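The capability exchange described above can be illustrated with a minimal sketch, given here under stated assumptions rather than as the procedure mandated by any specification. The codec name NEWCODEC, the payload type numbers, and the helper names codecs_in_sdp and first_common_codec are hypothetical and chosen only for illustration. For simplicity, the sketch takes the order of the a=rtpmap lines as the offerer's preference order.

```python
import re

# Matches SDP attribute lines such as "a=rtpmap:97 AMR-WB/16000".
RTPMAP = re.compile(r"^a=rtpmap:(\d+)\s+([^/\s]+)/(\d+)")


def codecs_in_sdp(sdp):
    """Return (name, clock_rate) pairs in the order they appear in the SDP."""
    codecs = []
    for line in sdp.splitlines():
        match = RTPMAP.match(line.strip())
        if match:
            codecs.append((match.group(2).upper(), int(match.group(3))))
    return codecs


def first_common_codec(offer_sdp, answerer_supported):
    """Pick the first offered codec that the answering terminal supports."""
    for codec in codecs_in_sdp(offer_sdp):
        if codec in answerer_supported:
            return codec
    return None  # no common codec: the session cannot be established


# Offer from a new terminal. NEWCODEC is a hypothetical placeholder name.
OFFER = "\n".join([
    "m=audio 49152 RTP/AVP 96 97 98",
    "a=rtpmap:96 NEWCODEC/16000",
    "a=rtpmap:97 AMR-WB/16000",
    "a=rtpmap:98 AMR/8000",
])

# Codecs supported by a legacy terminal that is unaware of the new codec.
LEGACY_SUPPORTED = {("AMR-WB", 16000), ("AMR", 8000)}

if __name__ == "__main__":
    print(first_common_codec(OFFER, LEGACY_SUPPORTED))
```

In this sketch the legacy terminal skips the codec it does not recognize and the session falls back to AMR-WB, which works only because the new terminal also implements the legacy codecs.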
The above-described ways of ensuring interoperability when introducing a new codec into a communication system are, however, not the only possibilities, and they have various disadvantages. The provisioning of transcoders means additional equipment, which raises network investment and maintenance costs. Transcoding is also associated with undesirable speech quality degradation. Using a capability exchange between the terminals prior to the call is an elegant approach, but it may not always be possible. Examples where it may not be possible are multi-party conferencing, hand-over scenarios with mobile users roaming to cells without MTSI support, and voice messaging. Also, from a terminal implementation point of view, it may be undesirable to provide support for the complete set of new and legacy codecs, as this may increase implementation and technology licensing costs.
Hence, in order to avoid the aforementioned problems, a preferable solution is for the new codec to be embedded-bitstream interoperable with at least one of the legacy codecs. While this kind of bitstream “embeddedness” at the codec level is a necessary condition for interoperability, further aspects need to be fulfilled in order to achieve interoperability at the system level. Two further essential aspects are SDP signaling compatibility and compatibility of the bitstream transport formats. With respect to the SDP capability negotiation, it is desirable that the negotiation can be carried out between new and legacy devices in a transparent way, meaning that a legacy device that is unaware of the new codec can still establish a speech service session with the new device.
The transport format to be used for the speech bitstream data in the case of 3GPP MTSI follows the IETF specification of the transport protocol for real-time applications (RTP), IETF RFC 3550, and the codec-specific speech payload format specification, which in the case of AMR and AMR-WB is IETF RFC 4867. Obviously, a legacy terminal relies on that specific speech payload format and would not be able to create or properly receive a speech bitstream according to another (new) format.
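To make the dependency on the transport format concrete, the following minimal sketch parses the fixed RTP header of RFC 3550 and reads the codec mode request (CMR), which occupies the first four bits of an RFC 4867 payload. It is an illustration only, not a complete depacketizer: the table-of-contents entries and the speech frames are not decoded, and the function name parse_rtp_packet, the payload type 97, and the example packet values are assumptions made for this example.

```python
import struct


def parse_rtp_packet(packet):
    """Parse the RTP fixed header (RFC 3550) and the CMR of an RFC 4867 payload."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP fixed header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])

    csrc_count = byte0 & 0x0F
    header_len = 12 + 4 * csrc_count  # fixed header plus CSRC identifiers

    if byte0 & 0x10:  # X bit set: a header extension follows the CSRC list
        if len(packet) < header_len + 4:
            raise ValueError("truncated RTP header extension")
        # Extension length is counted in 32-bit words, excluding its own header.
        (ext_words,) = struct.unpack("!H", packet[header_len + 2:header_len + 4])
        header_len += 4 + 4 * ext_words

    payload = packet[header_len:]
    if not payload:
        raise ValueError("RTP packet carries no payload")

    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
        # RFC 4867: the payload starts with a 4-bit codec mode request;
        # the value 15 signals that no mode request is present.
        "cmr": payload[0] >> 4,
        "payload": payload,
    }


# Illustrative packet: version 2, dynamic payload type 97 (an assumption),
# followed by two payload bytes whose first four bits carry CMR = 15.
EXAMPLE_PACKET = struct.pack("!BBHII", 0x80, 97, 1, 320, 0x12345678) + bytes([0xF0, 0x00])

if __name__ == "__main__":
    print(parse_rtp_packet(EXAMPLE_PACKET))
```

A legacy receiver interprets every payload in this way, so a new payload layout that places other information in these leading bits would be misread rather than rejected.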
In view of the problems and requirements discussed above, there is a need for enabling session negotiation between new and legacy devices in a transparent manner.