As is known, setting up a telephone call between users via terminals that are interconnected over a packet transmission network requires the packets carrying the speech signals of the call, which are produced in real time, to be transmitted in at least approximately periodic manner so that the sound can be played back with relatively good fidelity and, in particular, so that speech is reproduced in a manner that is sufficiently intelligible. Unfortunately, as is known, transmitting packets between two terminals over a transmission network, even one that is only lightly loaded, does not guarantee that the packets will all be received at their destination at a regular rate corresponding to their encoding times, nor even that they will be received in the order in which they were sent. It is quite normal for packets sent from one terminal to another to be delayed relative to other packets in a manner that cannot usefully be forecast at the destination terminal. In addition, transmitted packets can be lost or even duplicated. The packets received by a terminal are therefore stored temporarily as they arrive so as to build up a buffer of packets on which action can be taken to put the packets back into their initial order, in particular after waiting for packets that have been delayed, provided the delay does not exceed some predetermined threshold value, and after eliminating any duplicate packets.
It is normally possible to transmit speech signals in digitized form by means of packets over an asynchronous packet exchange network. However, when these signals are speech signals relating to a call set up in real time between two users, timing constraints must be complied with in terms of delay and periodicity so that the signals can be played back as sound at a determined rate, preferably corresponding to the rate at which they were picked up.
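By way of illustration only, the temporary storage described above (reordering, waiting for delayed packets up to a threshold, and eliminating duplicates) can be sketched as follows. The sketch assumes that each packet carries a sequence number starting at 0; the names `ReorderBuffer`, `max_wait`, `push` and `pop_ready` are hypothetical and do not come from the text.

```python
import heapq


class ReorderBuffer:
    """Hold received packets briefly so they can be released in sending order."""

    def __init__(self, max_wait):
        self.max_wait = max_wait   # assumed threshold: longest wait for a late packet
        self.heap = []             # min-heap of (seq, arrival_time, payload)
        self.pending = set()       # sequence numbers currently stored
        self.next_seq = 0          # next sequence number due for playback

    def push(self, seq, payload, now):
        """Store a packet; return False if it is a duplicate or arrives too late."""
        if seq < self.next_seq or seq in self.pending:
            return False
        heapq.heappush(self.heap, (seq, now, payload))
        self.pending.add(seq)
        return True

    def pop_ready(self, now):
        """Return the payloads that can be played back, in their initial order."""
        out = []
        while self.heap:
            seq, arrived, payload = self.heap[0]
            gap = seq != self.next_seq
            if gap and now - arrived < self.max_wait:
                break              # keep waiting for the missing packet
            heapq.heappop(self.heap)
            self.pending.discard(seq)
            self.next_seq = seq + 1  # packets skipped here are given up as lost
            out.append(payload)
        return out
```

In this sketch a packet that arrives after its position has been given up is simply rejected, which corresponds to a delay exceeding the predetermined threshold.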
Determining the size of the buffer in which packets are temporarily stored as they arrive at a terminal requires a good compromise to be found. If the buffer is too small, it holds so few successively received packets at any given instant that late packets may still not have been received, and thus stored and put back into their initial position, by the time at which they need to be taken from the buffer for reproduction in the form of sound. Under such conditions, the sound signals that are played back do not faithfully reproduce the signals that were initially picked up and from which they are derived. The quality of service obtained can become unacceptable, and when the sound signals are speech signals they can become difficult to understand. However, if the buffer is made large so as to avoid the above-described drawback, then a long time can elapse before the received digitized sound signals are put back into their initial order, and when the signals are speech signals relating to a telephone call established in real time, this delay becomes perceptible to users. The quality of service can become highly degraded, and a telephone call set up under such conditions runs the risk of being difficult for the users in conversation.
It is possible to modify the size of a buffer in an active terminal as a function of the delays suffered by the packets it receives. Buffer size can be increased when packets arrive too late to be taken into consideration, thereby making it possible subsequently to accept packets that arrive with an equivalent degree of lateness; it can also be increased merely when the delays to which received packets are subject increase on average, or indeed when the mean variation between successive delays increases. Such modification can be based, for example, on statistical processing of the delays that have applied to the most recently received packets. It is also possible to reduce the size of a buffer in an active terminal when the arriving packets are received with delays that are smaller than anticipated and/or when the delays measured on the arriving packets lie within a smaller range of delays than the range currently being accommodated.
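By way of illustration only, one possible form of such statistical processing is sketched below. It assumes a sliding window over the most recent delays and a target size equal to the mean delay plus a multiple of the mean variation between successive delays; the window length, the multiplier `margin`, the bounds and all names are hypothetical choices and are not specified in the text.

```python
from collections import deque


class AdaptiveBufferSizer:
    """Derive a target buffer size (in ms) from recently observed packet delays."""

    def __init__(self, window=50, margin=4.0, min_ms=20.0, max_ms=500.0):
        self.delays = deque(maxlen=window)  # only the most recent delays are kept
        self.margin = margin                # assumed weight given to delay variation
        self.min_ms = min_ms                # assumed lower bound on buffer size
        self.max_ms = max_ms                # assumed upper bound on buffer size

    def observe(self, delay_ms):
        """Record the delay measured on a newly received packet."""
        self.delays.append(delay_ms)

    def target_ms(self):
        """Grow with the mean delay or its variation; shrink when both fall."""
        if len(self.delays) < 2:
            return self.min_ms
        d = list(self.delays)
        mean = sum(d) / len(d)
        # Mean absolute variation between successive delays.
        variation = sum(abs(b - a) for a, b in zip(d, d[1:])) / (len(d) - 1)
        target = mean + self.margin * variation
        return max(self.min_ms, min(self.max_ms, target))
```

With steady delays the target stays close to the mean delay; as soon as successive delays start to differ, the variation term raises the target, and it falls again once the irregular delays have left the window.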
Such adaptations of buffer size are preferably performed in the destination terminal during a period of silence on the part of the speaker using the sending terminal, so as to avoid interfering with the processing of received signal packets that correspond to genuine speech signals, since such packets need to be reproduced with the best possible fidelity. As mentioned above, these adaptations can be performed by taking account of the delays observed on the packets most recently received by the terminal. By way of example, the delay of each packet is determined from the time at which the packet was sent, as specified in the header of the message containing it, and from its arrival time, which is observed using the clock of the terminal where it is received. This makes it possible in particular to take account of variations in loading that occur specifically in the transmission network, and these variations can be particularly large for a terminal which is communicating over a network in which the number of calls that are set up simultaneously can vary very quickly, as is the case for the Internet.
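By way of illustration only, the delay measurement and the deferral of resizing to a period of silence can be sketched as follows. The sketch assumes timestamps expressed in milliseconds and an externally supplied silence indication; all names are hypothetical. Since the sending and receiving clocks are not necessarily synchronized, the measured value may include an unknown constant offset, so that it is chiefly the variations between measurements that are meaningful.

```python
def packet_delay_ms(sent_ms, arrival_ms):
    """Delay of one packet: local arrival time minus the send time from its header."""
    return arrival_ms - sent_ms


class DeferredResize:
    """Apply a requested buffer size only while the remote speaker is silent."""

    def __init__(self, size_ms):
        self.size_ms = size_ms     # buffer size currently in use
        self.pending_ms = None     # requested size not yet applied

    def request(self, new_size_ms):
        """Record a new target size without disturbing ongoing speech playback."""
        self.pending_ms = new_size_ms

    def on_frame(self, is_silence):
        """Call once per playback interval; returns the size currently in use."""
        if is_silence and self.pending_ms is not None:
            self.size_ms = self.pending_ms
            self.pending_ms = None
        return self.size_ms
```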
A search for a satisfactory compromise using the above-mentioned method is possible only after a sufficient number of messages containing speech signal packets have been received, which means that a certain amount of time must elapse before there is any genuine possibility of matching the size of a receive buffer to a given call. This is made worse by the fact that it is common practice in an established telephone call to transmit sound signal packets from each of the terminals involved only when speech signals are contained in the sound signals available for transmission, and consequently the only signal packets that are actually transmitted are packets that include speech signals. Such a disposition makes it possible significantly to reduce the load on a network since, as a general rule, only one user is speaking at any given instant on a telephone call set up between two users. Furthermore, this makes it possible to avoid transmitting interfering noise and in particular background noise when, temporarily, no speech signals are being picked up at a terminal that is being used by a user who is silent.
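By way of illustration only, the practice of transmitting packets only when speech is present can be sketched with a simple energy threshold. This is an assumed and deliberately crude criterion: the text does not specify how speech is detected, and the names `frame_energy`, `frames_to_send` and `threshold` are hypothetical.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in samples) / len(samples)


def frames_to_send(frames, threshold):
    """Keep only frames whose energy suggests speech, with their original index."""
    return [(i, f) for i, f in enumerate(frames) if frame_energy(f) >= threshold]
```

The original index is kept with each retained frame so that the destination can distinguish a deliberate silence from lost packets; during such gaps no delay measurements are available at the destination.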
When such a disposition is used, there is no way to preselect accurately an appropriate size for the receive buffer at the destination terminal while a call is being set up, in particular when the possible range of delays to which packets can be subject is large for the network over which the calls are being set up, as is indeed the case with the Internet. The quality of service at the time a call is set up thus runs the risk of being poor and the initial speech runs the risk of being unintelligible, e.g. if it is truncated.