The ITU-T SG12 (International Telecommunication Union Standardization Sector's Study Group 12) is currently standardizing systems for measuring Quality-of-Service (QoS) for Voice-over-IP (VoIP) from an end-user perspective. Central to all such systems is the effect of packet loss on the voice signal. Packet loss typically results in small (usually 10-40 millisecond) speech segments being removed from playback of voice, and can result in audible distortions. Therefore, most QoS measures are based on packet loss information relating to the quantity and/or distribution of lost packets, such as, for example, the packet loss rate (representing lost packets as a percentage of total packets) or a “burst ratio” of lost packets (representing the length of successive strings of lost packets). (See, e.g., U.S. Patent Application Publication No. US 2002/0154641 A1, “Burst Ratio: A Measure Of Bursty Loss On Packet-Based Networks,” published on Oct. 24, 2002, filed on Feb. 1, 2001 by James W. McGowan as U.S. patent application Ser. No. 09/773,799 and commonly assigned to the assignee of the present invention. U.S. Patent Application Publication No. US 2002/0154641 A1 is hereby incorporated by reference as if fully set forth herein.) Other QoS measures based on lost packet information include the “packet loss distortion rate” and the “media distortion rate,” each of which is based on lost packet data as well as data comprised in packets which are not lost but whose proper interpretation is based on data from packets which are lost. (See, e.g., co-pending U.S. patent application Ser. No. 10/936,990, “Method And Apparatus For Performing Quality-Of-Service Calculations On Packet-Based Networks,” filed on Sep. 9, 2004 by M. Lee and J. McGowan and commonly assigned to the assignee of the present invention. U.S. patent application Ser. No. 10/936,990 is also hereby incorporated by reference as if fully set forth herein.)
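As an illustration of the loss statistics mentioned above, the following minimal sketch computes a packet loss rate and a burst ratio from a per-packet receive trace. The function names and the trace format (a list of booleans) are hypothetical and chosen only for illustration. The normalization used for the burst ratio (mean observed burst length divided by the mean burst length expected under independent random loss at the same rate, 1/(1-p)) is an assumption of this sketch and is not taken from the text above; the referenced publication should be consulted for the authoritative definition.

```python
from typing import List


def loss_rate(received: List[bool]) -> float:
    """Fraction of packets lost; received[i] is False when packet i was lost."""
    if not received:
        return 0.0
    return received.count(False) / len(received)


def burst_lengths(received: List[bool]) -> List[int]:
    """Lengths of each run of consecutive lost packets, in order of occurrence."""
    bursts = []
    run = 0
    for ok in received:
        if ok:
            if run:
                bursts.append(run)
            run = 0
        else:
            run += 1
    if run:  # trace ended in the middle of a burst
        bursts.append(run)
    return bursts


def burst_ratio(received: List[bool]) -> float:
    """Mean observed burst length relative to that expected for random loss.

    ASSUMPTION: for independent loss with probability p, the expected
    burst length is 1 / (1 - p). Values above 1 then suggest losses that
    are more clustered than random loss at the same rate.
    """
    p = loss_rate(received)
    bursts = burst_lengths(received)
    if not bursts:
        return 0.0  # no losses at all; placeholder value for this sketch
    if p >= 1.0:
        return float("inf")  # every packet lost; ratio is undefined
    mean_observed = sum(bursts) / len(bursts)
    mean_random = 1.0 / (1.0 - p)
    return mean_observed / mean_random


# Example trace: 10 packets, 3 lost, in one burst of 2 and one burst of 1.
trace = [True, False, False, True, True, False, True, True, True, True]
print(loss_rate(trace), burst_lengths(trace), burst_ratio(trace))
```

Note that two traces with the same loss rate can yield different burst ratios, which is why distribution-based measures are considered alongside the raw rate.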
In the presence of packet loss, packet loss concealment (PLC) schemes are typically used in an attempt to mitigate the effect of distortions on the listener by replacing missing speech data from lost packets with substitute speech data. For example, packet repetition is one simple PLC scheme that repeats the last correctly received packet (or a scaled version of that packet) when a packet loss occurs.
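The packet repetition scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production concealment algorithm: packets are modeled as lists of integer PCM samples, a lost packet is represented by `None`, the function name `conceal_by_repetition` and its `scale` parameter are hypothetical, and silence is substituted when a loss occurs before any packet has been received.

```python
from typing import List, Optional, Sequence


def conceal_by_repetition(
    packets: Sequence[Optional[List[int]]],
    frame_len: int,
    scale: float = 1.0,
) -> List[List[int]]:
    """Replace each lost packet (None) with the last correctly received one.

    scale < 1.0 attenuates the repeated samples; scale == 1.0 repeats
    them unchanged. ASSUMPTION: every packet holds frame_len samples.
    """
    output = []
    last_good = [0] * frame_len  # silence until the first good packet arrives
    for pkt in packets:
        if pkt is not None:
            last_good = list(pkt)
            output.append(list(pkt))
        else:
            # Repeat (a scaled copy of) the last correctly received packet.
            output.append([int(round(s * scale)) for s in last_good])
    return output


# Example: the second, fourth, and fifth packets are lost.
stream = [[100, -200], None, [10, 20], None, None]
print(conceal_by_repetition(stream, frame_len=2, scale=0.5))
```

In this sketch, consecutive losses all reuse the same last good packet; a practical scheme might instead attenuate progressively over a long burst.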
Part of the motivation behind using a scheme such as packet repetition for packet loss concealment is the observation that speech tends to follow “quasi-stationary” (QS) behavior; that is, both the pitch and the spectral envelope usually vary slowly relative to the packet size. The best estimate of the speech in a lost packet, therefore, is often the speech in the previous and/or the following packets. (To minimize delay, typically only the previous packet is used, although backward prediction from subsequent packets has also been used and does tend to improve prediction.)
The use of PLC techniques such as packet repetition, however, can result in sharp discontinuities at the boundaries between packets, although there are known methods for minimizing the effects of these discontinuities. More sophisticated algorithms, such as the well-known PLC algorithm standardized for the G.711 speech codec, attempt to increase voice quality by varying the temporal extent of the repeated portion as well as making some adjustments for distortions introduced at the boundaries. Nonetheless, they also rely on the assumption that the lost speech and the previous speech have essentially identical pitch and spectral envelope.
However, it is known that the QS assumption fails (i.e., a “QS failure” occurs) whenever a talker begins or ends a phoneme, the smallest unit of sound in a language. In addition, certain sounds (such as diphthongs) show dynamic spectral characteristics within a phoneme. Such QS failures occur up to several times per second in normal speech, depending upon the language, the talker, and the individual words being spoken.
In addition, the perceptual effect of packet loss on the end user depends, in part, on how often this QS property is violated, since each violation of the QS property marks an occasion on which the PLC scheme is based on a faulty assumption and is therefore likely to fail to adequately conceal the packet loss. Although there are a number of known speech quality (i.e., QoS) measures, currently none of these measures accounts for this critical effect.