Packet-switched networks are increasingly used to transport multimedia traffic such as telephony, Internet TV, and video services. However, it is well known that these services are highly sensitive to delay. In voice calls, an excessive round-trip delay is disconcerting to the user and degrades conversational quality. In this document, the round-trip delay is defined as the total time for data, speech or other multimedia traffic to be sent from a first user over a transmission medium to a second user and for the response to be sent from the second user back to the first user. The extent to which a user perceives a delay as disturbing depends on a number of factors, including the language used, the mood of the parties and the type of conversation. In an attempt to find a common standard, the International Telecommunication Union (ITU) has proposed, in ITU-T G.114, a one-way (mouth-to-ear) delay threshold of 150 ms, above which delay is considered to impinge on quality.
For packet-switched networks and services, and especially for IP networks, the delay depends on a large number of factors, which cause it to vary to a lesser or greater extent at different times and locations. Among these factors are the network topology and the components used, which may vary greatly from one end-to-end IP voice call to another, usually without the end points having knowledge of the networks concerned. Network load is also a significant factor affecting both the delay and the delay jitter. A high network load results in long queues in routers and hence in increased delay. IP traffic carried over wireless links is also sensitive to radio conditions, which affect the transmission time and result in a longer overall delay when conditions are bad. A further factor is the time required for packetization. If longer speech frames are used, or if the number of speech frames included per packet increases, the delay also increases.
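The packetization contribution described above can be illustrated with a minimal sketch. It assumes that the sender must collect all the speech frames of a packet before sending it, so that the waiting time is simply the frame duration multiplied by the number of frames per packet. The function name and the 20 ms frame length are illustrative assumptions and are not taken from this document.

```python
def packetization_delay_ms(frame_ms: float, frames_per_packet: int) -> float:
    """Return the time needed to collect the speech that fills one packet.

    Assumption: the packet is sent only once its last frame is complete,
    so the delay grows linearly with frame length and frame count.
    """
    if frame_ms <= 0 or frames_per_packet < 1:
        raise ValueError("frame_ms must be > 0 and frames_per_packet >= 1")
    return frame_ms * frames_per_packet


if __name__ == "__main__":
    # Hypothetical 20 ms speech frames, bundled 1, 2 or 3 per packet.
    for n in (1, 2, 3):
        print(n, "frame(s) per packet:", packetization_delay_ms(20.0, n), "ms")
```

Under these assumptions, bundling three frames instead of one adds 40 ms of delay before the packet even enters the network.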
While the transmission delay in packet-switched networks differs from path to path, it is nevertheless an advantage to know the delay for a specific link, particularly for VoIP services. This advantage lies in the possibility of adapting the behaviour of the client node to the expected delay. For example, a jitter buffer in a receiving node can be configured to accept more late losses when the delay is likely to be long, in order to minimize additional delay. Late loss is the term for packets that are discarded at a receiver because they arrive after a certain delay. Conversely, when the network delay is short, the jitter buffer can buffer frames for longer in order to reduce late losses.
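The trade-off between buffering time and late loss can be sketched as follows. This is only an illustration under stated assumptions, not a description of any particular jitter buffer: the buffer depth is chosen from whatever remains of a one-way delay budget (here the 150 ms threshold cited earlier) after the expected network delay, and a packet counts as a late loss if its delay exceeds the expected delay plus the buffer depth. All function names, parameters and the minimum and maximum depths are hypothetical.

```python
from typing import Sequence


def choose_buffer_ms(expected_delay_ms: float,
                     budget_ms: float = 150.0,
                     min_buffer_ms: float = 20.0,
                     max_buffer_ms: float = 100.0) -> float:
    """Pick a jitter-buffer depth from the spare time in the delay budget.

    Long expected delay -> little spare time -> shallow buffer (more late loss).
    Short expected delay -> more spare time -> deeper buffer (less late loss).
    """
    spare_ms = budget_ms - expected_delay_ms
    return max(min_buffer_ms, min(max_buffer_ms, spare_ms))


def late_loss_count(delays_ms: Sequence[float],
                    expected_delay_ms: float,
                    buffer_ms: float) -> int:
    """Count packets that arrive after the playout deadline and are discarded."""
    deadline_ms = expected_delay_ms + buffer_ms
    return sum(1 for d in delays_ms if d > deadline_ms)


if __name__ == "__main__":
    delays = [100.0, 110.0, 125.0, 160.0]  # hypothetical per-packet delays
    for depth in (20.0, 50.0):
        print("buffer", depth, "ms ->",
              late_loss_count(delays, 100.0, depth), "late losses")
```

In this sketch, a deeper buffer discards fewer packets at the cost of extra playout delay, which is why knowing the expected network delay helps in choosing the depth.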
A difficulty in monitoring delays over a network, particularly in radio access networks supporting packet-switched multimedia traffic, is that traffic is typically classed in queues according to priority, with each queue being shared by several users or even by different traffic types, e.g. voice, data and video, for the same user. Such networks are ideally unaware of the services they are carrying. The consequence is that monitoring the performance of individual streams at the network level is very problematic.
In a circuit-switched system, delay is a system design parameter. Moreover, the delay does not vary, because the network is set up for voice calls. In the Global System for Mobile Communication (GSM), the mouth-to-ear delay is designed to be around 200 ms. For Wideband Code Division Multiple Access (WCDMA), the mouth-to-ear delay is designed to be around 225 ms. However, depending on the number of networks included in any specific link between two end nodes, there are still occasions when the real delay is very different from these design values. In such cases, knowledge of the real delay can be useful.