With the proliferation of data networks such as the Internet, there is a growing demand to transmit real-time voice and audio-visual signals over such networks. However, transmission of real-time voice and audio-visual signals is not a simple task, since most data networks were not designed to handle this type of traffic.
Perhaps the biggest impediment to the efficient transmission of high-quality real-time voice data is voice data's strict latency requirements. It has been found, for example, that if voice packets are delayed even by as little as 200 ms, the quality of the voice signal is significantly degraded. If a large temporal gap appears in the middle of a word or phrase, the listener may not be able to understand what is being said, and, in any event, will probably soon become annoyed or fatigued. Thus, to meet the latency requirements of voice data, Internet Protocol (IP) networks typically employ a connectionless protocol such as the User Datagram Protocol (UDP) to send voice signals, rather than the Transmission Control Protocol (TCP) commonly used to transmit other types of data signals. UDP provides higher throughput and lower latency than TCP, but offers these benefits at the expense of data integrity.
While data networks can, through the use of protocols such as UDP, improve the quality of voice transmissions, problems still arise when excessive traffic on the network causes network congestion, since data networks do not naturally handle congestion in a manner conducive to the effective transmission of real-time data. Network links will often be called upon to handle multiple flows of data (a flow of data includes packets traveling from one source to one destination) simultaneously, and thus will typically queue the packets they receive before sending them on to the appropriate destination. The queuing mechanisms commonly employed in such networks are typically not sensitive to the latency requirements of real-time data, and thus are prone to producing unacceptable levels of delay or jitter in the real-time signal.
For example, a typical network queue is called upon to handle data packets of varying sizes, and some of the data packets are, for efficiency reasons, relatively large. However, these large data packets can cause degradation of voice signals being transmitted through the same queue, since the voice packets are slowed if they must wait for the link to transfer the large data packets. This problem cannot be solved by simply giving voice packets priority over large data packets in the queue because such a scheme could effectively trap the large data packets in the queue, thus unacceptably interfering with their transmission. Moreover, even if voice packets were given the highest priority in the queue, they could still experience unacceptable delays if they were to arrive in the queue just as a large data packet was beginning to be transmitted, since they would have to wait for the transmission of the large data packet to finish before they could be transmitted.
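The wait described above can be estimated with a short calculation. The following is a minimal sketch in Python, not part of the original disclosure; the 128 kbit/s link rate and the 1500-byte packet size are assumed figures chosen only for illustration.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: int) -> float:
    """Time needed to place a packet of packet_bytes onto a link of link_bps, in ms."""
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")
    return packet_bytes * 8 * 1000 / link_bps


# Assumed, illustrative figures (not taken from the text above).
LINK_BPS = 128_000          # hypothetical slow access link
LARGE_PACKET_BYTES = 1500   # hypothetical large data packet

# A voice packet that arrives just as the large packet starts transmitting
# must wait for the whole packet to be sent, even with top queue priority.
worst_case_wait_ms = serialization_delay_ms(LARGE_PACKET_BYTES, LINK_BPS)
print(f"worst-case wait behind one large packet: {worst_case_wait_ms:.2f} ms")
```

Under these assumptions a single large packet already consumes a substantial part of a 200 ms delay budget, and the wait grows in proportion to packet size and inversely with link rate.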
One way to reduce these problems is to fragment large data packets into smaller, more manageable packets. Fragmentation is undesirable, however, as it reduces network efficiency by increasing the number of headers that must be transmitted, thus increasing network bandwidth requirements and slowing transmission of data. Packets typically consist of a fixed-length header containing protocol and routing information and a variable-length payload containing the actual data that is to be communicated. Fragmentation breaks up the payloads of large packets, creating two or more smaller packets, each having its own header. As a result, fragmentation decreases the efficiency of transmitting the information contained in the original, large payloads by reducing the size of the payload relative to the size of the header.
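The efficiency loss can be made concrete with a small calculation. The sketch below is illustrative only: the 20-byte header (the minimum IPv4 header length), the 1480-byte payload, and the function names are assumptions, not values given in the text.

```python
import math


def fragment_count(payload_bytes: int, max_packet_bytes: int, header_bytes: int) -> int:
    """Number of packets needed when each packet carries its own header."""
    room = max_packet_bytes - header_bytes  # payload bytes that fit in one packet
    if room <= 0:
        raise ValueError("max_packet_bytes must exceed header_bytes")
    return math.ceil(payload_bytes / room)


def payload_efficiency(payload_bytes: int, max_packet_bytes: int, header_bytes: int) -> float:
    """Fraction of transmitted bytes that are payload rather than header."""
    n = fragment_count(payload_bytes, max_packet_bytes, header_bytes)
    return payload_bytes / (payload_bytes + n * header_bytes)


HEADER_BYTES = 20     # assumed fixed header size
PAYLOAD_BYTES = 1480  # assumed payload of one large packet

for limit in (1500, 256):
    n = fragment_count(PAYLOAD_BYTES, limit, HEADER_BYTES)
    eff = payload_efficiency(PAYLOAD_BYTES, limit, HEADER_BYTES)
    print(f"limit {limit:>4} bytes: {n} packet(s), efficiency {eff:.3f}")
```

With these assumed numbers, the same payload needs one packet at a 1500-byte limit but seven at a 256-byte limit, so the header bytes sent for that payload rise from 20 to 140.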
Moreover, the strict latency requirements of real-time signals such as voice often dictate a relatively high degree of fragmentation. For example, while a data network may be able to support a maximum transmission unit (MTU) of 1500 bytes, a voice signal will often require a much smaller maximum allowed transferable unit (MATU), such as a MATU of no greater than 256 bytes, so that latency is reduced. The distinction between the MTU and the MATU is that the MTU is set for a network and does not change depending on the traffic on the network. When the network carries a type of traffic with a strict latency requirement, the network may be further constrained to transfer units that are smaller than the MTU. The MATU is thus smaller than the MTU and changes depending on the type of traffic carried by the network.
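The text does not say how a particular MATU is chosen. One plausible reading, sketched below purely as an assumption, is that the MATU is the largest unit whose transmission time fits within a per-packet delay budget on a given link, capped at the network's MTU. The 128 kbit/s link rate, the 16 ms budget, and the function name are hypothetical.

```python
import math


def max_unit_for_budget(budget_ms: float, link_bps: int, mtu_bytes: int) -> int:
    """Largest transfer unit (bytes) whose serialization time fits budget_ms,
    never exceeding the fixed network MTU."""
    if budget_ms <= 0 or link_bps <= 0:
        raise ValueError("budget_ms and link_bps must be positive")
    fits = math.floor(budget_ms * link_bps / 8000)  # bits/s * ms -> bytes
    return min(mtu_bytes, fits)


MTU_BYTES = 1500    # fixed for the network, per the example in the text
LINK_BPS = 128_000  # assumed link rate

# Assumed 16 ms per-packet budget for latency-sensitive traffic.
print(max_unit_for_budget(16, LINK_BPS, MTU_BYTES))
# A loose budget leaves the limit at the MTU.
print(max_unit_for_budget(200, LINK_BPS, MTU_BYTES))
```

The point of the sketch is only that the MTU stays constant while the tighter limit moves with the latency demands of the traffic being carried.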
In addition, since many routers are unable to detect the presence or absence of voice data, if a data network is used to transmit voice data, the routers in the network typically need to be set to fragment every large packet they receive, regardless of whether any voice signals are active.
In sum, while it is possible to send latency-sensitive signals over a data network, doing so using prior-art fragmentation techniques can compromise the overall efficiency of the network. What is needed is a way to control the fragmentation of packets so that the latency requirements of real-time data, such as voice, are met without unnecessarily compromising network efficiency.