The use of the Internet to support voice traffic is an emerging technology that offers several advantages over the traditional dedicated circuit-switched connections of the public switched telephone network (PSTN). The delivery of voice data over the Internet using the packet-switched connections of the Internet Protocol (IP) is referred to as voice over IP (VoIP). One advantage of VoIP is that it bypasses PSTN toll services by using the Internet backbone for long distance transport. In addition, Internet service providers (ISPs) are exempt from the access fees charged for using local telephone company facilities to complete the call. Since PSTN tolls and access fees are a large part of the cost of all long distance calls, the ability to avoid them is a tremendous advantage.
VoIP offers other advantages over PSTN as well, including bandwidth consolidation and speech compression, both of which contribute to overall network efficiency. However, before these advantages are fully realized, certain technical challenges must be met.
In VoIP, voice data travels as packets of digitized data on shared lines. Timely delivery matters more for voice packets than for other types of data, because it is needed to achieve voice quality comparable to that of the PSTN. Timely delivery can be particularly difficult on a public network, such as the Internet, where the level of quality of service (QoS) cannot be assured. A number of competing proprietary and non-proprietary standards have been developed to support the transmission of voice packets. Some of the protocols work better for hardware than for software, and vice versa, but none of the protocols has yet solved all of the problems inherent in sending large volumes of voice packets over the Internet.
As an example, the Real-time Transport Protocol (RTP), documented in Request for Comments (RFC) 1889 entitled “RTP: A Transport Protocol for Real-Time Applications,” and published in January 1996, provides end-to-end delivery services for data with real-time characteristics, such as interactive audio and video. FIG. 1 illustrates an example of an RTP packet 100. As illustrated, the RTP packet 100 comprises a payload 110 of 40 bytes and four different headers totaling 54 bytes, including a Media Access Control (MAC) header 102, an Internet Protocol (IP) header 104, a User Datagram Protocol (UDP) header 106, and an RTP header 108. The RTP payload 110 is designed to hold voice packets from 5 to 30 milliseconds (ms) in length. Shorter voice packets are considered more desirable, since they result in lower latency and improved voice quality.
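The overhead implied by these figures can be checked with simple arithmetic. The following is a minimal sketch, not part of the RTP specification. The per-header sizes are assumptions (typical values for an Ethernet MAC header, an IPv4 header without options, a UDP header, and the fixed RTP header); the text above states only the 54-byte total and the 40-byte payload.

```python
# Per-header sizes are ASSUMED typical values; the text gives only the 54-byte total.
HEADER_BYTES = {"MAC": 14, "IP": 20, "UDP": 8, "RTP": 12}
PAYLOAD_BYTES = 40  # payload size stated for RTP packet 100


def overhead_fraction(header_bytes, payload_bytes):
    """Return the fraction of the whole packet that is taken up by headers."""
    total_headers = sum(header_bytes.values())
    return total_headers / (total_headers + payload_bytes)


print(sum(HEADER_BYTES.values()))                              # 54
print(f"{overhead_fraction(HEADER_BYTES, PAYLOAD_BYTES):.1%}")  # 57.4%
```

Under these assumptions, more than half of each single-channel packet is header rather than voice data.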
Probably the most significant drawback of RTP is its lack of scalability. Because RTP is optimized for sending only a single channel of voice data (i.e., one voice call) per packet long-haul over the Internet, packets must be sent at a fairly high rate, e.g., 200 packets per second (pps) when sending small voice packets of 5 milliseconds. Supporting a larger number of voice calls, say 1000, raises the packet rate to 200,000 pps. Because VoIP applications must process an interrupt every time a packet arrives, sending this many packets using RTP quickly degrades their performance and makes poor use of bandwidth.
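The packet rates quoted above follow directly from the voice packet length. As a minimal sketch, assuming each packet carries exactly one voice packet for one call in one direction (the function name is illustrative, not taken from any protocol):

```python
def packets_per_second(packet_ms, calls=1):
    """Packet rate when every packet carries one voice packet of one call.

    packet_ms: duration of voice carried in each packet, in milliseconds.
    calls:     number of simultaneous single-channel voice calls.
    """
    return calls * 1000 / packet_ms


print(packets_per_second(5))              # 200.0
print(packets_per_second(5, calls=1000))  # 200000.0
```

The rate grows linearly with the number of calls, so the per-packet processing cost is paid once for every call and every voice packet.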
In order to consolidate bandwidth, some protocols aggregate multiple voice channels into a single packet. For example, an aggregated, or multi-channel, version of RTP was developed by the Internet Engineering Task Force (IETF) and documented in an Internet Draft entitled “An RTP Payload Format for User Multiplexing,” by J. Rosenberg and H. Schulzrinne, published on May 6, 1998. It multiplexes data from multiple users into a single RTP packet in an attempt to reduce packet overhead and improve scalability, so that packets get delivered in a timely way. But the aggregated RTP protocol introduces other problems. For example, while the terminating computers have more than enough power to process one voice call, they can quickly become overloaded when simultaneously processing hundreds of voice calls in a single packet, which can again adversely affect the performance of the VoIP applications.
Another drawback to multi-channel RTP and other aggregated channel protocols for VoIP is the lack of an explicit voice channel ID, which adds processing overhead and makes it difficult, if not impossible, to consolidate packet flows. In voice over multi-protocol label switching (VoMPLS), one of the prior art aggregated channel protocols for voice data, the channel identification data is only 8 bits in length and must be combined with the packet identification in order to fully identify the voice channel to which the data belongs. Thus, for example, the channel 5 data on packet flow A is not the same voice channel as the channel 5 data on packet flow B. Consequently, it is not possible to move channels between packets without additional signaling.
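The ambiguity of a short channel ID can be illustrated with a lookup table. This is a minimal sketch, not the VoMPLS data structures: the flow names, call labels, and the `channel_key` helper are hypothetical and exist only to show that the 8-bit channel ID identifies a voice channel only in combination with the packet flow that carries it.

```python
# Hypothetical flow names and call labels, for illustration only.
MAX_CHANNEL_ID = 0xFF  # an 8-bit channel ID can take the values 0..255


def channel_key(flow_id, channel_id):
    """Build the composite key needed to fully identify a voice channel."""
    if not 0 <= channel_id <= MAX_CHANNEL_ID:
        raise ValueError("channel ID must fit in 8 bits")
    return (flow_id, channel_id)


calls = {
    channel_key("A", 5): "call-1",
    channel_key("B", 5): "call-2",
}

print(calls[channel_key("A", 5)])                  # call-1
print(calls[channel_key("B", 5)])                  # call-2
print(channel_key("A", 5) == channel_key("B", 5))  # False
```

Because the flow is part of the key, moving channel 5 from flow A to flow B changes its identity, and the receiver must be told about the change by some other means.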
Another drawback to current VoIP protocols is that they are not designed to support explicit 8-byte boundary alignment, which is necessary for efficient processing by 64-bit processors.
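As a minimal sketch of what 8-byte boundary alignment involves, the helper below computes how many padding bytes would bring a given length up to the next multiple of 8. The function name is illustrative, and applying it to the 54-byte header total of the RTP packet described earlier is only an example; the text does not say which fields of existing protocols are misaligned.

```python
def padding_to_boundary(length, boundary=8):
    """Return the number of pad bytes needed to reach the next multiple of boundary."""
    return (-length) % boundary


print(padding_to_boundary(54))  # 2
print(padding_to_boundary(56))  # 0
```

With 54 bytes of headers, the payload would begin at an offset that is not a multiple of 8 unless two bytes of padding were added.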