Mobile telecommunications networks can choose between a large number of encoding and decoding schemes (codecs) for speech transmission. However, when two networks (or different parts of the same network) select different codecs, communication between them requires tandemming.
For example, a coding sequence between a CDMA (code division multiple access) mobile phone and a GSM (global system for mobile communication) mobile phone may be as follows:
i. A CDMA mobile phone on a first network encodes speech with CDMA codec 1.
ii. Codec 1 encoded speech is transmitted to a CDMA base station.
iii. The CDMA base station decodes the codec 1 speech and encodes the result using PCM (pulse code modulation).
iv. The PCM encoded speech is transmitted via a wireline to a second, GSM, network.
v. A GSM base station of the second network decodes the received PCM speech and encodes the result using GSM codec 2.
vi. Codec 2 encoded speech is transmitted to a GSM mobile phone on the second network.
Thus in the above tandemming arrangement, the low bandwidth, high compression codecs used for wireless transmission are linked by a common high bandwidth, low compression PCM encoding scheme for the wireline part of the communication.
However, the resulting end-user received speech tends to be of poor quality. The primary reason is that speech reconstructed from one high compression codec is generally not ideal as input to another high compression codec. Such codecs typically generate high-level parameterisations of the speech with minimal redundancy, with the result that the reconstructed speech used by the PCM contains regularities and approximations not found in the original. A second codec seeking to generate a slightly different set of high-level parameterisations will find that the salient characterising information it assumes to be present has been removed or just interpolated by the first codec. The result is a poor representation of the speech by the second codec.
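The effect can be illustrated with a deliberately simplified numerical sketch. It is not a model of any real speech codec: two uniform quantizers with assumed step sizes (STEP_1 and STEP_2) stand in for codec 1 and codec 2, and a sine wave stands in for speech. All names and values are hypothetical.

```python
import math


def quantize(x, step):
    """Round x to the nearest multiple of step (a toy stand-in for a lossy codec)."""
    return step * round(x / step)


def rms_error(reference, decoded):
    """Root-mean-square difference between two equal-length sample sequences."""
    total = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    return math.sqrt(total / len(reference))


# Hypothetical test signal: one cycle of a sine wave sampled at 200 points.
signal = [math.sin(2 * math.pi * n / 200) for n in range(200)]

STEP_1 = 0.30  # assumed step size standing in for "codec 1"
STEP_2 = 0.25  # assumed step size standing in for "codec 2"

# Codec 2 applied directly to the original signal.
direct = [quantize(x, STEP_2) for x in signal]
# Tandem: codec 2 applied to the signal already reconstructed from codec 1.
tandem = [quantize(quantize(x, STEP_1), STEP_2) for x in signal]

direct_rms = rms_error(signal, direct)
tandem_rms = rms_error(signal, tandem)
print(f"direct RMS error: {direct_rms:.4f}")
print(f"tandem RMS error: {tandem_rms:.4f}")
```

Because the second quantizer sees only the output of the first, its result in this toy setting can be no closer to the original than quantizing the original directly, and tandem_rms comes out larger than direct_rms.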
Currently, the concept of tandem-free operation (TFO) addresses this problem (see ETSI, “Technical Specification Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); Inband Tandem Free Operation (TFO) of speech codecs; Service description; Stage 3 (3GPP TS 28.062 version 5.3.0 Release 5)” ETSI TS 128 062 V5.3.0 (2002-12)).
However, it does so only if the two networks have the same codec available, that is, if they use the same or a compatible access technology (e.g. between AMR (adaptive multi-rate) capable GSM networks and 3GPP (third generation partnership project) networks), and additionally only if end-to-end negotiation at call set-up is possible.
Thus it is not applicable when dissimilar codecs are used or when end-to-end negotiation is not possible or not implemented.
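The two conditions for TFO described above can be summarised as a simple predicate. This is only a sketch of the stated conditions, not of the negotiation procedure defined in the TFO specification; the function name, its parameters and the codec labels are hypothetical, and "compatible" is simplified here to "at least one codec in common".

```python
def tfo_applicable(codecs_a, codecs_b, negotiation_possible):
    """Return True if both stated TFO conditions hold.

    codecs_a, codecs_b: iterables of codec labels available in each network.
    negotiation_possible: whether end-to-end negotiation at call set-up
    is possible and implemented.
    """
    shared = set(codecs_a) & set(codecs_b)
    return negotiation_possible and bool(shared)


# Two AMR-capable networks with end-to-end negotiation: TFO can apply.
print(tfo_applicable({"AMR", "Full-Rate"}, {"AMR"}, True))
# Dissimilar codecs (e.g. a 3GPP2 network and a GSM network): tandemming remains.
print(tfo_applicable({"IS-733", "SMV"}, {"AMR", "Full-Rate"}, True))
# A shared codec but no end-to-end negotiation: tandemming remains.
print(tfo_applicable({"AMR"}, {"AMR"}, False))
```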
Dilithium Networks also provide a solution to the problems raised by tandemming, known as Unicoding™ (http://www.dilithiumnetworks.com/technology/voice.htm).
This solution requires that one of three alternatives be pursued: either the first codec's data is conveyed to the second network prior to translation to its codec format, or the data is translated in the first network to the second codec's format before being sent to the second network, or the data from the first codec is routed to a proxy server, which performs the translation, and is then routed from the proxy server to the second network.
Referring to FIG. 1, Unicoding employs CELP (code excited linear predictive) codec parameter translation from one codec data format 110 to another 130 and requires dedicated translation modules 120, 130 to be available for all possible codec to codec permutations.
This is not a simple solution, however. For example, just for 3GPP2 to GSM networks, Unicoding translation modules would need to be available to and from each of the four 3GPP2 codecs (IS-733, IS-96A, EVR (enhanced variable rate) and SMV (selectable mode vocoder)) and to and from each of the three GSM codecs (Full-Rate, Half-Rate and AMR including EFR (enhanced full rate)). These twelve permutations are then further compounded by the multiple available modes for SMV (2 or 3 likely deployment modes) and the 10 modes of AMR, increasing the permutations to 60 or 72. Whilst there would be significant commonality between many of these, the problems of developing and deploying a large number of Unicoding translation modules over a number of networks, and of redeploying them whenever a new codec is introduced, make the solution appear unwieldy.
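The figures of 60 and 72 can be reproduced with a short calculation, assuming that each SMV mode and each AMR mode is counted as a separate codec variant while the remaining codecs are counted once each. The function and parameter names below are illustrative only.

```python
def codec_pairings(smv_modes, amr_modes=10):
    """Count 3GPP2-to-GSM codec pairings when each mode is a distinct variant."""
    # 3GPP2 side: IS-733, IS-96A and EVR count once each; SMV counts once per mode.
    variants_3gpp2 = 3 + smv_modes
    # GSM side: Full-Rate and Half-Rate count once each; AMR counts once per mode.
    variants_gsm = 2 + amr_modes
    return variants_3gpp2 * variants_gsm


# Ignoring modes: four 3GPP2 codecs paired with three GSM codecs.
print(4 * 3)              # 12
# With 2 or 3 SMV deployment modes and 10 AMR modes.
print(codec_pairings(2))  # 60
print(codec_pairings(3))  # 72
```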
Many of the principles applied in the Dilithium Networks solution can also be found in H-G. Kang, H. K. Kim & R. V. Cox, “Improving Transcoding Capability of Speech Coders in Clean and Frame Erasured Channel Environments,” Proceedings of the 2000 IEEE Workshop on Speech Coding, 2000.
There still appears to be a need for an alternative method of tandem communication that provides both improved voice quality and a simple means of operation across one or more networks.
The purpose of the present invention is to address the above problems.