In typical telecommunications systems, voice calls and data are transmitted by carriers from one network to another. Networks for transmitting voice calls include packet-switched networks that carry calls using voice over Internet Protocol (VoIP), circuit-switched networks such as the public switched telephone network (PSTN), asynchronous transfer mode (ATM) networks, etc. Recently, voice over packet (VOP) networks have become more widely deployed. Many incumbent local exchange and long-distance service providers use VoIP technology in the backhaul of their networks without the end user being aware that VoIP is involved.
An example of networks and components for a VoIP call is illustrated in FIG. 1. The diagram shows a communication network that could be any managed network accessing the Internet, such as a packet network with IP protocols, an Asynchronous Transfer Mode (ATM) network, or an Ethernet network. The communications network comprises a router 14 connected to various customer premise equipment and to media gateway 12. Media gateway 12 must be capable of detecting changing resource or network conditions. The ability to detect and monitor changing resource and network conditions can result in significant cost reductions and/or improved quality. Router 14 is connected to Internet Access Device (IAD) 16, wireless access point (AP) 22, and/or IP PBX (private branch exchange) 23. A voice call may be placed between any of the customer equipment phones 18 connected to IAD 16, wireless IP phone 24 connected to AP 22, or IP PBX phone 30 and POTS (plain old telephone service) phone 32. Using special software, calls could also be placed through computer 20 connected to IAD 16 or portable computer 26 connected to AP 22.
Customer equipment is connected through the access broadband network 28 to the Internet 34 by media gateway 12. On the far end is the PSTN 48 connected to a POTS phone 52 through a Central Office 50. The PSTN 48 is also connected to the Internet 34 through a trunk gateway, composed of a signal gateway 46, a media gateway controller/proxy (MGC) 32, and a trunk media gateway (MG) 42. The IP and packet data (e.g., Real-time Transport Protocol (RTP) packet data) associated with the call are routed between the IAD 16 and the trunk MG 42. The trunk gateway system provides real-time two-way communications interfaces between the IP network (e.g., the Internet) and the PSTN 48. As another example, a VoIP call could be initiated between a wireless IP phone (WIPP) 24 and another WIPP 40 connected to AP 38. In this call, voice signals and associated packet data are sent between MG 12 and MG 36 through Internet 34, thereby bypassing the PSTN 48 altogether.
Factors that affect voice quality in a VoIP network are fairly well understood. The level of control over these factors varies from network to network. This is highlighted by the differences between a well-managed small enterprise network and an unmanaged network such as the Internet. Network operational issues affect network performance and create conditions that affect voice quality. These issues include outages or failures of network switches, routers, and bridges; outages or failures of VoIP elements such as call servers and gateways; and traffic management during peak periods and virus or denial-of-service attacks.
Interoperability between VoIP systems is a critical ingredient of high-quality VoIP systems. Many software and hardware components in a VoIP system must be implemented in order to reach the quality of carrier-class systems. The most important software features include echo cancellation, voice compression, packet play-out software, tone processing, fax and modem support, packetization, signaling support, and network management. New networking technologies and deployment models are also creating additional challenges that affect the ability of VoIP service providers to guarantee the highest levels of service quality (e.g., toll quality) in their deployments. Two such examples are deployments in which the VoIP service provider does not control the underlying packet transport network, and the use of packet networks with potentially high delay and loss, such as 802.11 WLAN (Wireless Local Area Network) technology.
One problem affecting the interoperability of VoIP systems, and hence the quality of voice systems, is the existence of two widely used but incompatible packing formats for Real-time Transport Protocol (RTP) payloads when using ADPCM. Adaptive Differential Pulse-Code Modulation (ADPCM) is a widely used coding technique for digital communications over a computer network that uses predictive coding to achieve data reduction. An advantage of ADPCM is a bit rate reduction achieved by the use of an adaptive scale factor and quantization according to a fixed quantization curve. The result of the incompatible packing formats is garbled audio when a caller implements one of the formats and a receiver implements the opposing format.
One standard, ITU-T Recommendation G.726, titled “40, 32, 24, 16 kbit/s ADAPTIVE DIFFERENTIAL PULSE CODE MODULATION (ADPCM),” describes an algorithm for conversion of a single 64 kbit/s A-law or mu-law PCM channel encoded at 8,000 samples/s to and from a 40, 32, 24, or 16 kbit/s channel. The conversion is applied to the data stream using ADPCM transcoding methods. The G.726 data rates of 40, 32, 24, and 16 kbit/s have code words of 5, 4, 3, and 2 bits, respectively, and are described as G726-40, G726-32, G726-24, and G726-16. Samples for G.726 encoding must be packed into octets using “little endian” ordering. Big endian and little endian packing methods differ in the order in which the most significant and least significant parts of a sequence are stored. In a big endian format, the most significant bit in a sequence is stored at the lowest, or first, storage address, whereas in a little endian format the least significant bit in the sequence is stored first.
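The relationship between code word width and channel rate stated above can be checked with a short calculation. The following is a minimal sketch; the names `SAMPLE_RATE_HZ`, `g726_bit_rate`, and `RATES` are illustrative and do not come from G.726 itself.

```python
SAMPLE_RATE_HZ = 8000  # PCM sampling rate stated in the text


def g726_bit_rate(bits_per_codeword: int) -> int:
    """Return the channel bit rate in bit/s for a given code word width."""
    return SAMPLE_RATE_HZ * bits_per_codeword


# Map each format name to its code word width (illustrative table).
RATES = {f"G726-{g726_bit_rate(bits) // 1000}": bits for bits in (5, 4, 3, 2)}
```

One code word is produced per input sample, so the channel rate is simply the sample rate multiplied by the code word width.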
For G.726 the 4-bit code words must be packed into octets wherein the first code word is packed in the four least significant bits (LSBs) of the first octet and with the LSB of the code word in the LSB of the octet. The second code word is placed in the four most significant bits (MSBs) of the first octet, with the MSB of the second code word packed into the MSB of the octet. The packing of code words continues in this manner with the first code word of each pair of words placed in the least significant four bits of the octet, and so forth.
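The packing rule just described for 4-bit code words can be sketched as follows. This is an illustration only, under the assumption that code words arrive as a list of integers in the range 0 to 15; the function name `pack_g726_32_rtp` is hypothetical.

```python
def pack_g726_32_rtp(codewords):
    """Pack 4-bit code words into octets, first code word in the low nibble."""
    if len(codewords) % 2:
        raise ValueError("expected an even number of 4-bit code words")
    out = bytearray()
    # Walk the code words in pairs: (first, second) share one octet.
    for first, second in zip(codewords[0::2], codewords[1::2]):
        if not (0 <= first <= 0xF and 0 <= second <= 0xF):
            raise ValueError("code words must fit in 4 bits")
        # First code word -> four LSBs, second code word -> four MSBs.
        out.append((second << 4) | first)
    return bytes(out)
```

For example, the code words 0x1, 0x2, 0x3, 0x4 would be packed into the two octets 0x21 and 0x43.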
The “little endian” method for packing samples into octets in the G726-16, -24, -32, and -40 formats for RTP payloads is the same packing method that is specified in ITU-T Recommendation X.420 for packing ADPCM samples into octets. The Internet Engineering Task Force (IETF) adopted this format for G726-40, -32, -24, and -16 RTP payloads.
The opposing packing format is that of ITU-T Recommendation I.366.2 Annex E for ATM AAL2 (ATM adaptation layer 2) transport, which specifies big-endian packing for the same code words. This has resulted in interoperability problems in the VoIP industry, as many vendors have adopted the AAL2 format for RTP payloads as well.
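The effect of a format mismatch can be illustrated for the 4-bit case. The sketch below assumes that big-endian packing places the first code word of each pair in the four most significant bits of the octet; the helper names `pack_nibbles` and `unpack_nibbles` are hypothetical.

```python
def pack_nibbles(codewords, first_in_low_nibble):
    """Pack pairs of 4-bit code words into octets in the chosen order."""
    out = bytearray()
    for first, second in zip(codewords[0::2], codewords[1::2]):
        if first_in_low_nibble:   # little-endian (IETF G726-32 style)
            out.append(((second & 0xF) << 4) | (first & 0xF))
        else:                     # big-endian (AAL2 style, assumed layout)
            out.append(((first & 0xF) << 4) | (second & 0xF))
    return bytes(out)


def unpack_nibbles(octets, first_in_low_nibble):
    """Recover 4-bit code words from octets, assuming the given order."""
    codewords = []
    for octet in octets:
        low, high = octet & 0xF, octet >> 4
        codewords.extend((low, high) if first_in_low_nibble else (high, low))
    return codewords


sent = [0x1, 0x2, 0x3, 0x4]
payload = pack_nibbles(sent, first_in_low_nibble=False)      # AAL2-style sender
decoded = unpack_nibbles(payload, first_in_low_nibble=True)  # IETF-style receiver
```

Here `decoded` comes out as 2, 1, 4, 3: every pair of code words is swapped. Because ADPCM decoding is predictive, feeding code words to the decoder in the wrong order would be expected to corrupt the reconstructed signal rather than merely reorder it.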
The revised AVT-RTP-Profile (RFC 3551) has attempted to resolve this issue by discontinuing the use of payload type “2” for G726-32 and recommending the use of a dynamic RTP payload type. Also, for the I.366.2 (Annex E) format, new MIME (multipurpose Internet mail extension) subtypes of AAL2-G726-16, -24, -32, and -40 are specified, and MIME registration of the same is expected to happen soon. This can probably solve the problem in some implementations going forward; however, interoperability with the installed base of VoIP devices is not ensured.
G726-32 with a dynamic payload type is likely to indicate that the payload conforms to the IETF specification; however, nothing prevented the use of a dynamic payload type for G726-32 in older implementations. Thus, in many older implementations the payload format cannot be determined remotely. Moreover, G726-16, -24, and -40 have always used dynamic payload types, so relying on the payload type alone can result in garbled audio.
A gateway compliant with RFC 3551 and implementing G.726 can probably support G726-XX as well as AAL2-G726-XX payload formats. However, when the gateway's session description protocol (SDP) contains G726-XX alone, there is no way for the gateway to determine the payload format conclusively. For some signaling protocols, it may be possible to indicate support for both payload formats. However, there is no method for an existing gateway to determine the payload format of a remote gateway that negotiates using only G726 as described above.
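The limitation can be made concrete with a small SDP inspection routine. This is a hedged sketch: the function name `classify_g726_packing`, the labels "aal2" and "ambiguous", and the sample SDP are all hypothetical, and only the `a=rtpmap` attribute and the G726-XX and AAL2-G726-XX encoding names are taken from the text and RFC 3551.

```python
import re

# Matches lines such as "a=rtpmap:96 G726-32/8000".
RTPMAP = re.compile(r"^a=rtpmap:(\d+)\s+([^/\s]+)/(\d+)", re.IGNORECASE)


def classify_g726_packing(sdp_text):
    """Map each G.726 payload type to 'aal2' or 'ambiguous'."""
    result = {}
    for line in sdp_text.splitlines():
        match = RTPMAP.match(line.strip())
        if not match:
            continue
        payload_type = int(match.group(1))
        name = match.group(2).upper()
        if name.startswith("AAL2-G726-"):
            result[payload_type] = "aal2"       # big-endian packing announced
        elif name.startswith("G726-"):
            # Could be IETF packing, or an older device using AAL2 packing.
            result[payload_type] = "ambiguous"
    return result


SAMPLE_SDP = """m=audio 49170 RTP/AVP 96 97
a=rtpmap:96 G726-32/8000
a=rtpmap:97 AAL2-G726-32/8000
"""
packing = classify_g726_packing(SAMPLE_SDP)
```

Only the explicitly named AAL2 entry can be classified with confidence; a bare G726-XX entry remains ambiguous because the signaling alone does not reveal which packing an older remote device actually uses.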
One solution for ADPCM interoperability is proposed in the IETF's RFC 3551 standard, “RTP Profile for Audio and Video Conferences with Minimal Control,” by Schulzrinne, H. and Casner, S. (July 2003). RFC 3551 has solved the interoperability issue only among future systems. As far as currently existing systems in the field are concerned, the gateways cannot determine the payload format conclusively. Clearly, there is a need for a gateway to determine the G.726 payload format conclusively to prevent garbled audio output when it encounters compliant systems.