It is a problem in the field of Internet Protocol networks that some data packets may fail to arrive at their intended destination. Transmission protocols such as TCP/IP permit receiving devices to request that missing packets be retransmitted; unfortunately, this retransmission process often results in long pauses in the data stream, as well as transmission latencies of several hundred milliseconds or more, thereby rendering schemes such as TCP/IP inappropriate for most telephony applications.
For these reasons, Voice over Internet Protocol (VoIP) systems commonly use a transmission scheme called User Datagram Protocol, or UDP. This mechanism does not suffer from the pauses or transmission latencies that would be seen if TCP/IP were used for VoIP, chiefly because UDP, unlike TCP/IP, does not retransmit missing packets. Instead, IP networks often try to reduce VoIP packet loss by assigning a higher priority (commonly referred to as Quality of Service or QoS) to UDP packets. Concurrently, many VoIP telephones incorporate packet loss concealment algorithms that try to trick the human ear by replacing the missing packet with data that is extrapolated from the data received or with data that is commonly referred to as “comfort noise.”
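By way of illustration only, the kind of packet loss concealment described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm of any particular telephone or codec: the function name `conceal`, the frame size of 160 samples, and the `fade` factor are hypothetical, and plain silence stands in for comfort noise when no earlier audio is available to extrapolate from.

```python
from typing import List, Optional

FRAME_SAMPLES = 160  # assumption: 20 ms of 8 kHz audio per packet


def conceal(frames: List[Optional[List[int]]], fade: float = 0.5) -> List[List[int]]:
    """Fill each missing frame (None) with an attenuated copy of the last good frame.

    If no good frame has been received yet, a frame of silence is substituted
    (a stand-in for comfort noise in this sketch).
    """
    out: List[List[int]] = []
    last_good: Optional[List[int]] = None
    gain = 1.0
    for frame in frames:
        if frame is not None:
            # Packet arrived: pass it through and reset the attenuation.
            last_good = frame
            gain = 1.0
            out.append(list(frame))
        elif last_good is not None:
            # Packet lost: repeat the last good frame, fading on each consecutive loss.
            gain *= fade
            out.append([int(sample * gain) for sample in last_good])
        else:
            # Packet lost before any audio was received: emit silence.
            out.append([0] * FRAME_SAMPLES)
    return out
```

The output contains audio that was never transmitted, which may satisfy a human listener but offers nothing reliable to a receiver that must decode the signal exactly.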
Unless the level of packet loss becomes extreme (on the order of 5% or greater, depending on the audio encoding algorithm being used), the use of high quality packet loss concealment algorithms allows UDP to be an acceptable transmission protocol for person-to-person voice conversations. This is because it is relatively easy to trick the human ear into hearing something that isn't there. Unfortunately, the packet loss concealment algorithms of the present art do not mitigate the deleterious effects of packet loss on many accuracy-sensitive applications for which voice channels (and therefore UDP) are commonly used; examples include automatic speech recognition systems, automatic speaker identification systems, and the TTY/TDD communication commonly employed by people with hearing deficits.
It is of interest to note that applications such as these, which tend to be very sensitive to the effects of packet loss, tend not to be especially sensitive to the effects of latency. Illustratively, point-to-point transmission delays on the order of half a second would be unacceptable in a voice conversation between two people, but would probably not be noticeable in a TTY/TDD conversation, or when an individual is speaking commands to a typical automatic speech recognition system. In other words, these are applications for which it would make sense to accept a greater degree of latency in exchange for reduced packet loss.
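The trade-off between latency and packet loss can be illustrated with a short calculation. The following is a sketch only; the function name `effective_loss` and the delay figures in the usage note are invented for illustration and are not measurements. A packet is treated as lost if it never arrives or if it arrives after the receiver's deadline for using it.

```python
from typing import List, Optional


def effective_loss(arrival_delays_ms: List[Optional[float]], deadline_ms: float) -> float:
    """Return the fraction of packets that are unusable at the receiver.

    A delay of None means the packet never arrived; a delay greater than
    deadline_ms means it arrived too late to be used.
    """
    if not arrival_delays_ms:
        return 0.0
    unusable = sum(
        1 for delay in arrival_delays_ms if delay is None or delay > deadline_ms
    )
    return unusable / len(arrival_delays_ms)
```

For the hypothetical delays `[40, 45, 300, 50, None, 42, 350, 48, 44, 41]`, a 100 ms deadline leaves three of ten packets unusable, whereas a 500 ms deadline leaves only the one packet that never arrived. An application that can tolerate the longer deadline therefore sees less effective packet loss.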
A superficial analysis of this problem might cause one to conclude that the use of TCP/IP for these applications, rather than UDP, might be a reasonable solution. Although the use of TCP/IP would provide for the retransmission of missing packets, there are other considerations that render this approach impractical. Reasons include:
(1) Transitioning back and forth between TCP/IP and UDP on the same call would be difficult to support from an engineering standpoint, and is not even permitted within existing Internet standards. An example of where this might be needed would be a call in which one of the parties is hearing-impaired, but not deaf; these individuals often prefer to intermix voice and TTY/TDD on the same call.
(2) The adding of a resource that requires TCP/IP to a pre-existing UDP connection would be difficult to support from an engineering standpoint, and is not even permitted within existing Internet standards. An example of this type of situation would be a telephone conversation between two people, in which an automatic speech recognition resource is added to the call.
(3) There is no mechanism within TCP/IP to ensure that the transmission pauses, while waiting for retransmitted packets to arrive, occur in places where they will do no harm to an audio stream (e.g., between spoken words, rather than within a word, or between TTY characters, rather than within a character).
(4) If audio packets are tagged as TCP/IP, rather than UDP, VoIP QoS mechanisms within the Internet may fail to classify these as high priority packets, thereby exacerbating the packet loss problem even further.
These and other problems are addressed by the disclosures contained herein.