Processing and delivering real-time media, such as conversational media, is a time-critical task. The primary goal of conversational system designs and solutions is to keep end-to-end (E2E) delay (sometimes also called “mouth-to-ear” delay) as low as possible.
Various media processing and transport algorithms and solutions are used to improve media and service quality. Examples include jitter buffers, bit and transport block error correction algorithms, and retransmission algorithms. However, almost all processing algorithms aimed at improving service and media quality introduce additional latency and therefore increase E2E delay.
Current approaches in system design typically try to find ways to trade off E2E delay against perceived quality of service (QoS). However, it is becoming increasingly difficult to maintain a balance between E2E delay and QoS measures because the transport of real-time voice traffic involves latency constraints, bit errors, packet data jitter and data loss.
Bit errors and packet data losses are handled primarily through a variety of forward error correction (FEC) algorithms and local repair at the receiver. The basic principle of FEC is that audio frames or packets contain, in addition to their own speech samples, information related to earlier or later speech samples. This additional information can be used to reconstruct missing or erroneous speech samples. However, this solution adds delay, because the receiver must wait for the arrival of the additional information needed to repair or reconstruct the speech data.
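The FEC principle described above can be illustrated with a minimal sketch. It assumes a simple redundancy scheme in which each packet carries a copy of the previous frame; the packet layout and the function names `packetize` and `recover` are hypothetical and chosen only for illustration, and practical FEC schemes are generally more elaborate.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical packet layout: (sequence number, primary frame,
# redundant copy of the previous frame, or None for the first packet).
Packet = Tuple[int, bytes, Optional[bytes]]


def packetize(frames: List[bytes]) -> List[Packet]:
    """Attach a redundant copy of frame n-1 to the packet carrying frame n."""
    packets: List[Packet] = []
    for seq, frame in enumerate(frames):
        redundant = frames[seq - 1] if seq > 0 else None
        packets.append((seq, frame, redundant))
    return packets


def recover(received: List[Packet], total: int) -> List[Optional[bytes]]:
    """Rebuild the frame sequence from the packets that actually arrived.

    A frame whose own packet was lost is filled in from the redundant
    copy carried by the following packet, if that packet arrived.
    Frames that cannot be recovered are returned as None.
    """
    frames: Dict[int, bytes] = {}
    for seq, primary, redundant in received:
        frames[seq] = primary
        if redundant is not None and (seq - 1) not in frames:
            frames[seq - 1] = redundant
    return [frames.get(i) for i in range(total)]
```

In this sketch a lost frame n can only be repaired once packet n+1 has arrived, so the repair costs at least one additional packet interval of waiting, which is the added delay noted above.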
Packet data jitter occurs when audio and/or video packets arrive at the receiver at times that vary from an expected or “ideal” position in time. Upon playback, the jitter results in jerky playback of video frames or a noticeable decrease in voice quality. Jitter may be compensated for by means of a jitter buffer, which accumulates incoming packets at the decoder to smooth out the incoming jitter. However, buffering incoming packets necessarily adds delay to the link.
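The jitter buffer tradeoff can be sketched as follows. The example assumes a fixed-depth buffer, a constant frame duration, and playout anchored to the arrival of the lowest-numbered packet; the function name `playout_schedule` and the parameters `frame_ms` and `buffer_ms` are hypothetical, and the default values are placeholders rather than recommended settings.

```python
from typing import List, Tuple


def playout_schedule(
    arrivals_ms: List[Tuple[int, float]],
    frame_ms: float = 20.0,
    buffer_ms: float = 40.0,
) -> List[Tuple[int, float, bool]]:
    """Compute playout times for a fixed-depth jitter buffer.

    arrivals_ms holds (sequence number, arrival time in ms) pairs.
    Playout of the lowest-numbered packet is delayed by buffer_ms after
    its arrival; later packets are played at regular frame_ms intervals.
    Returns (sequence number, playout time, arrived_in_time) tuples.
    """
    if not arrivals_ms:
        return []
    ordered = sorted(arrivals_ms)
    first_seq, first_arrival = ordered[0]
    base = first_arrival + buffer_ms
    schedule = []
    for seq, arrival in ordered:
        playout = base + (seq - first_seq) * frame_ms
        # A packet that arrives after its playout time is too late to use.
        schedule.append((seq, playout, arrival <= playout))
    return schedule
```

With arrivals of `[(0, 0.0), (1, 35.0), (2, 95.0), (3, 62.0)]`, a 40 ms buffer leaves packet 2 too late for playout, whereas a 60 ms buffer absorbs it at the cost of 20 ms of extra delay on every packet.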
The tradeoff between the use of these algorithms and schemes and the desire to keep E2E delay as low as possible is typically addressed by using delay thresholds (DT) for algorithm design and latency limits. It is assumed that when E2E delay is kept below a predetermined DT, the latency introduced by QoS algorithms and schemes, and the resulting E2E delay, would not be detected or perceived by human subjects (i.e., perceived as a service quality impairment).
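A design check against a fixed DT can be sketched as a simple delay budget. The component names, the latency figures, and the threshold value used in the usage note are hypothetical placeholders chosen for illustration only; they are not taken from any measurement or recommendation.

```python
from typing import Dict


def e2e_delay_ms(components: Dict[str, float]) -> float:
    """Sum the per-component latency contributions (in ms)."""
    return sum(components.values())


def within_threshold(components: Dict[str, float], dt_ms: float) -> bool:
    """Return True if the total E2E delay stays below the delay threshold."""
    return e2e_delay_ms(components) < dt_ms
```

For example, a hypothetical budget of `{"encoding": 25.0, "network": 60.0, "jitter_buffer": 40.0, "fec_wait": 20.0}` totals 145 ms and passes a placeholder DT of 150 ms, while deepening the jitter buffer to 60 ms raises the total to 165 ms and fails it.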
A DT is commonly determined from data collected in extensive and costly subjective tests, which are usually carried out according to ITU recommendations. Once a DT has been determined, systems and processing circuits are designed using the subjectively defined and hard (fixed) DT.