This invention relates to voice over packet (VOP) applications and more specifically to voice over DSL (VODSL) applications. VOP uses data networks to provide cost-effective telephone services as an alternative to Public Switched Telephone Networks (PSTNs). VODSL uses DSL links and the Asynchronous Transfer Mode (ATM) protocol to enable the delivery of multiple telephone calls over a single pair of wires, in addition to providing a data link. An example of a DSL-based system that provides a multiple-call telephone link as well as a data link is illustrated in FIG. 1.
The system 10 shown in FIG. 1 is representative of a system with which the present invention may be used. The system 10 includes an ATM Gateway 12, 14 at each end connected to a DSL Gateway 16, 18 via a copper twisted pair 20, 22, a plurality of telephones 24, 26, 28, 30, and a personal computer 32 connected to the Internet. The system 10 shown in FIG. 1 also includes an “ATM cloud” 31 which represents other structures of the network. While the personal computer 32 is connected to the DSL Gateway 16 via a digital connection 34, the connections 36 between each of the telephones 24, 26, 28, 30 and the respective DSL Gateway 16, 18 are continuous analog connections. For example, if someone on telephone 26 is having a conversation with someone on telephone 30, and the person on telephone 26 speaks, the analog voice signal which DSL Gateway 16 receives from telephone 26 is sampled, packetized and sent over the network. Packets that arrive from the network at DSL Gateway 18 are processed and converted back to an analog voice signal, which is then sent to telephone 30.
A well-known problem with using a packet network to deliver real-time voice in a telephony application is the network delay, and more specifically the variance in the delay. The time it takes for a specific packet to travel from the source location to the destination is not constant and is a function of the instantaneous load of the switches between the two end points of the link. When the delay is short, a packet will arrive before it is supposed to be played, so there is no voice degradation. In contrast, when the delay is long, a packet may arrive after the time the packet was supposed to be played. In that case, the packet is discarded, and the voice quality is degraded.
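By way of illustration only, the late-packet condition described above can be sketched as follows. The function name, the millisecond timestamps, and the fixed playout delay are assumptions introduced for this example and are not taken from any particular system.

```python
def late_packets(send_times_ms, network_delays_ms, playout_delay_ms):
    """Return the indices of packets that arrive after their scheduled play time.

    Assumption: each packet is scheduled to be played a fixed
    playout_delay_ms after it was sent.
    """
    late = []
    for index, (sent, delay) in enumerate(zip(send_times_ms, network_delays_ms)):
        arrival_time = sent + delay
        play_time = sent + playout_delay_ms
        if arrival_time > play_time:
            # The packet missed its play time and would be discarded.
            late.append(index)
    return late


# Hypothetical figures: five packets sent 10 ms apart, with varying network delay.
send_times = [0, 10, 20, 30, 40]
delays = [20, 25, 70, 30, 55]
print(late_packets(send_times, delays, playout_delay_ms=50))  # prints [2, 4]
```

In this hypothetical run, only the two packets whose network delay exceeds the 50 millisecond playout delay are lost; the packets with shorter delays are unaffected.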
The variance in the arrival delay of packets through a network is called jitter. To solve the jitter problem, systems use a de-jitter buffer. FIG. 2 illustrates the input and output scheme of a typical de-jitter buffer 40. With regard to the input, voice packet data arrives at the de-jitter buffer 40 in an unsynchronized fashion. That is, every time a packet arrives, it is received by the de-jitter buffer 40 and stored therein. Before any packet is played, the de-jitter buffer 40 is initialized. During initialization, the de-jitter buffer 40 is centered (or primed) to a nominal delay. The nominal delay is equal to the amount of delay jitter that the system can handle. For example, if each voice packet represents 10 milliseconds of voice, and the nominal delay is set at 50 milliseconds, the de-jitter buffer 40 will not send a packet out before there are at least five packets stored in the buffer 40. With regard to the output, after the de-jitter buffer 40 has been initialized (and a pre-determined initial number of packets have been stored in the de-jitter buffer 40), packets are read from the de-jitter buffer 40 at constant time intervals, such as one packet every 10 milliseconds, wherein exactly every 10 milliseconds, the local receive procedure pulls a packet from the de-jitter buffer 40. If one or more packets have been delayed in the network, the fact that five 10 millisecond packets have been stored in the de-jitter buffer 40 means that the receive procedure can pull packets for the next 50 milliseconds from the de-jitter buffer 40 without degradation of the voice quality. When packets arrive from the network faster than they are pulled from the de-jitter buffer 40, they accumulate in the de-jitter buffer 40, and are not discarded.
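The priming and constant-rate read-out behavior described above can be sketched, under stated assumptions, as follows. The class name `DejitterBuffer`, its method names, and the choice to return `None` when no packet is available are hypothetical and serve only to illustrate the scheme; the unbounded queue reflects the statement that accumulated packets are not discarded.

```python
from collections import deque


class DejitterBuffer:
    """Minimal sketch of a fixed-nominal-delay de-jitter buffer."""

    def __init__(self, packet_ms=10, nominal_delay_ms=50):
        # Number of packets that must be stored before play-out starts,
        # e.g. 50 ms / 10 ms = 5 packets.
        self.prime_count = nominal_delay_ms // packet_ms
        self.queue = deque()
        self.primed = False

    def push(self, packet):
        """Store a packet whenever it arrives (unsynchronized input)."""
        self.queue.append(packet)
        if len(self.queue) >= self.prime_count:
            self.primed = True

    def pull(self):
        """Called once per packet interval by the receive procedure.

        Returns the oldest stored packet, or None if the buffer has not
        yet been primed or is empty (an assumption of this sketch).
        """
        if not self.primed or not self.queue:
            return None
        return self.queue.popleft()


buf = DejitterBuffer(packet_ms=10, nominal_delay_ms=50)
for seq in range(4):
    buf.push(seq)
print(buf.pull())  # prints None: only four packets stored, not yet primed
buf.push(4)
print(buf.pull())  # prints 0: five packets stored, play-out begins
```

Once primed, the sketch can supply packets for up to five further intervals with no new arrivals, which corresponds to the 50 milliseconds of protection described above.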
A special procedure associated with de-jitter buffers allows packets that arrive out of order to be sorted so that these packets will be played (i.e. through the telephone) in the correct order. The bigger the de-jitter buffer (and the nominal delay), the better the de-jitter buffer can handle delay jitter. However, the bigger the buffer, the more delay the de-jitter buffer introduces to the system. Generally, a total delay exceeding 150–250 milliseconds degrades the quality of the conversation over the network. Thus, the characteristics of a de-jitter buffer must be tuned to the characteristics of the network delay. The characteristics of the network delay might change constantly, especially in a packet switching network. Many adaptive algorithms have been suggested to achieve the “best” nominal delay for a given state of the network. Most of the algorithms which have been formulated have been based on a complex statistical analysis of the characteristics of the network. These techniques suffer from several drawbacks. If such a technique is used, it is important to properly analyze the characteristics of the network delay because an error in the adaptive algorithm can result in additional degradation of the voice quality. Analyzing the characteristics of a network delay often requires many complex computations, which makes it very expensive in terms of computing power for multi-line systems. In addition, fixed point processors are commonly used in association with voice processing applications, and the computations associated with analyzing the delay characteristics of a network involve floating point calculations. Floating point calculations are difficult to implement in a fixed point processor.
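The sorting procedure for out-of-order packets mentioned above can be illustrated with the following minimal sketch. It assumes that each packet carries an integer sequence number, which the foregoing description does not specify, and it ignores sequence-number wraparound; the class and method names are hypothetical.

```python
import heapq


class ReorderingBuffer:
    """Sketch of a buffer that releases packets in sequence-number order."""

    def __init__(self):
        # Min-heap keyed on the (assumed) sequence number.
        self._heap = []

    def push(self, seq, payload):
        """Store a packet in whatever order it arrives."""
        heapq.heappush(self._heap, (seq, payload))

    def pull(self):
        """Return the stored packet with the lowest sequence number, or None."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)


buf = ReorderingBuffer()
for seq, payload in [(2, b"c"), (0, b"a"), (1, b"b")]:
    buf.push(seq, payload)
print([buf.pull()[0] for _ in range(3)])  # prints [0, 1, 2]
```

The sketch uses only integer comparisons, so it involves none of the floating point calculations noted above as difficult for fixed point processors.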