In Voice-over-IP (VoIP) communications systems, voice signal data is transmitted across a telecommunications network to a receiver as a series of discrete packets. Each packet contains a sample of speech material, typically comprising one speech “frame,” and the speech material of the transmitted packets is then combined, in sequence, at the network receiver. (Speech signals are typically divided into a contiguous sequence of “frames,” where each such speech “frame” is a speech segment that represents a predetermined time interval, such as, for example, 20 milliseconds.) Thus, the receiver is able to reconstruct the transmitted speech signal for appropriate playback to a listener.
More specifically, transmission of information over Internet Protocol (IP) networks is accomplished with a series of stacked protocols responsible for sending and receiving packets. Each protocol handles the communication between one specific component of the network on the sending side and its peer on the receiving side. For example, the Internet Protocol (IP) itself contains information relevant to moving packets between routers. The Hypertext Transfer Protocol (HTTP) contains information relevant to moving hypertext (i.e., specially formatted text) between a web server and a web browser. The hypertext is referred to as the “payload” of HTTP, and the additional information is referred to as a “header” (or, less commonly, a “footer” when it is transmitted after, as opposed to before, the payload). An entire HTTP packet (payload and header combined) is delivered over an IP network as the payload of an IP packet. This stacking of headers can include many layers, depending upon the application and network design.
In an application such as VoIP, for example, a payload of voice data may have Real-Time Transport Protocol (RTP), User Datagram Protocol (UDP), and IP headers added, plus additional headers for the physical layer. (Note that each of the above-identified protocols is fully familiar to those of ordinary skill in the art.) The net effect is that the combined size of the headers can far exceed the size of the payload. Thus, most of the bandwidth consumed by a voice stream is dedicated to overhead unrelated to the voice data itself.
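As a rough illustration of this overhead, the following sketch computes the fraction of a single-frame voice packet that is occupied by headers. The header sizes used are the minimum sizes of the IPv4 (20 bytes), UDP (8 bytes), and RTP (12 bytes) headers, with no options or extensions. The 20-byte payload is an assumed figure (roughly what an 8 kbit/s codec produces per 20 millisecond frame), the function name is hypothetical, and physical-layer headers are left out.

```python
# Minimum header sizes in bytes (no IP options, no RTP extensions).
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12


def header_fraction(payload_bytes,
                    header_bytes=IPV4_HEADER + UDP_HEADER + RTP_HEADER):
    """Return the fraction of a packet's bytes taken up by headers."""
    return header_bytes / (header_bytes + payload_bytes)


# Assumed payload: 20 bytes of coded speech per 20 ms frame.
fraction = header_fraction(20)
print(f"{fraction:.0%} of each packet is header overhead")
```

Under these assumptions the headers are twice the size of the payload, before any physical-layer headers are counted.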
One technique for decreasing the VoIP overhead, known as “bundling,” is to place multiple voice frames in a single packet. Consistently placing two frames in every packet, for example, will reduce the per-frame header overhead by 50%, since a single set of headers is then shared by two frames. Bundling techniques are familiar to those of ordinary skill in the art, and such techniques are commonly employed when a packet scheduler becomes overloaded. (A packet scheduler is an algorithm responsible for the delivery of packets over a network, which may, for example, comprise an air interface.) If the packet scheduler, because of limited bandwidth, cannot service a user within one frame interval, that user will end up having two packets waiting to be transmitted when that user is successfully scheduled.
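The effect of bundling on overhead can be sketched numerically. In the following example, the 40-byte header total is an assumption (minimum IPv4, UDP, and RTP headers, with no physical-layer headers), and both function names are hypothetical.

```python
def header_bytes_per_frame(frames_per_packet, header_bytes=40):
    """Header bytes attributable to each frame when frames share one packet."""
    return header_bytes / frames_per_packet


def overhead_reduction(frames_per_packet):
    """Fractional cut in per-frame header overhead versus one frame per packet."""
    return 1 - 1 / frames_per_packet


# Compare one frame per packet with bundles of two, three, and four frames.
for n in (1, 2, 3, 4):
    print(n, header_bytes_per_frame(n), f"{overhead_reduction(n):.0%}")
```

With two frames per packet, each frame carries half of the header cost, which corresponds to the 50% reduction noted above. Larger bundles yield progressively smaller additional savings.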
Unfortunately, packet bundling requires buffering, and thus delaying, one or more frames until another is ready. In a typical VoIP application, this will likely add, for example, 20 milliseconds of delay (i.e., the frame duration of the codec, which, as pointed out above, is commonly 20 milliseconds) for each buffered frame. This added delay often detrimentally affects the natural back-and-forth flow of a typical voice conversation. (This form of delay is known as “conversational delay.”) Therefore, for typical telephony networks, the bandwidth savings do not usually justify the added delay associated with preemptively delaying packets.
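The buffering delay introduced by bundling can be sketched as follows. The example assumes that frames become available at a fixed 20 millisecond spacing and that a bundled packet is sent as soon as its last frame is ready; the function name is hypothetical, and scheduling and network delays are ignored.

```python
FRAME_MS = 20  # frame duration in milliseconds, as in the example above


def bundling_delays(frames_per_packet, frame_ms=FRAME_MS):
    """Added wait, in ms, for each frame of a bundle before the packet is sent.

    Frame k (0-based) becomes ready k * frame_ms after the first frame,
    and the packet can only leave once the last frame is ready.
    """
    last_ready = (frames_per_packet - 1) * frame_ms
    return [last_ready - k * frame_ms for k in range(frames_per_packet)]


for n in (1, 2, 3):
    print(n, bundling_delays(n))
```

The earliest frame in a bundle waits longest: one frame duration for each frame that is buffered behind it, so a two-frame bundle delays its first frame by 20 milliseconds.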
In U.S. patent application Ser. No. 11/062,966, “Method And Apparatus For Handling Network Jitter In A Voice-Over IP Communications Network Using A Virtual Jitter Buffer And Time Scale Modification,” filed by M. Lee et al. on Feb. 22, 2005, and commonly assigned to the assignee of the present invention, a method for handling network jitter in a communications network using a virtual jitter buffer and time scale modification is provided wherein the time scale of individual voice packets is modified based on the location of a voice packet within a talk spurt. (A talk spurt is a continuous stream of a user's speech between periods of his or her silence.) In accordance with the method provided therein, a “virtual” jitter buffer is thereby effectuated, providing network jitter protection in the “middle” of a talk spurt, while allowing the virtual jitter buffer length to become essentially zero at each talk spurt beginning and end. In this manner, the vast majority of voice packets (i.e., those in the “middle” of the talk spurt) are protected from network jitter, while the fact that there is a zero length jitter buffer at the beginning and end of the talk spurt results in there being no perceived added conversational delay. (Methods such as the one provided by U.S. patent application Ser. No. 11/062,966 will in general be referred to herein as “talk spurt management” techniques.) U.S. patent application Ser. No. 11/062,966 is hereby incorporated by reference as if fully set forth herein.
In addition, in U.S. patent application Ser. No. 11/078,012, “Method And Apparatus For Routing Voice Packets In A Voice-Over IP Communications Network Based On A Relative Packet Location Within A Sequence,” filed by M. Lee et al. on Mar. 11, 2005, and commonly assigned to the assignee of the present invention, a method for routing voice packets in a VoIP communications network is provided wherein the routing priority is based, for example, on the location of a packet within a talk spurt. In particular, the method of U.S. patent application Ser. No. 11/078,012, in one embodiment thereof, takes advantage of the recognition that when the above-described method of U.S. patent application Ser. No. 11/062,966 is employed to perform talk spurt management, then it may be advantageous to give packets at the beginning and the end of the talk spurt a higher routing priority than those in the middle of the talk spurt. U.S. patent application Ser. No. 11/078,012 is also hereby incorporated by reference as if fully set forth herein.