The present invention relates to packet switching in data communications networks and, more particularly, to a method and apparatus for selectively processing and forwarding received packets in a media processor to reduce latency.
Packet-based media processors are deployed to operate on packets of sampled, and sometimes compressed, media data being transmitted across packet-switched networks. The types of media contained in the packets can include samples of voice, music, telephony signaling tones and modem signals. In many applications, the packets originate and terminate at gateway and endpoint devices that interface the packet-switched networks to the synchronous circuit-switched network. The gateway and endpoint devices are generally designed to perform certain functions required in packet-based media communications, such as de-jittering, line echo cancellation, decoding and sample clock regeneration.
Due to the nature of packet-switched networks, it is not possible to guarantee the arrival rate, the arrival sequence, or even the arrival itself of all packets in a particular media stream. The possibilities of uneven arrival rate (referred to as “jitter”), out-of-sequence arrival, and non-arrival or loss of packets create problems in re-creating the synchronous sampled media at a receiver such as a gateway or endpoint. Uncompensated jitter can result in significant distortion of the re-created media stream. Additionally, in most cases it is not practical to decode packets out of order, or to ignore lost or late packets, without degrading perceived audio quality, for example, or introducing bit errors in facsimile or data signals. Therefore, endpoint devices typically must compensate for these undesirable characteristics of the stream.
One common compensation technique is to employ buffering to smooth out the timing variations and sequence gaps of a received packet media stream. Received packets are placed into a buffer, termed a “jitter buffer”, in an asynchronous manner as they are received, and removed from the buffer at a constant rate to achieve the desired fidelity in re-creating the original analog signal. The buffer must be large enough to provide sufficient timing elasticity to enable fixed-rate removal of packets for specified worst-case values of jitter and mis-ordering. The buffer is generally filled to a specified depth before the fixed-rate removal of packets is initiated, to minimize the potential for buffer underflow during times of greater-than-nominal packet spacing. This initial filling represents a fixed delay experienced by the media that is never recovered.
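The buffering behavior described above can be illustrated with a minimal sketch. It is offered only as an illustration under stated assumptions, not as the implementation of any particular device: the class name `JitterBuffer`, the parameter `prefill_depth`, and the use of an integer sequence number per packet are hypothetical, and sequence-number wraparound is ignored.

```python
class JitterBuffer:
    """Toy jitter buffer (hypothetical names; ignores sequence wraparound)."""

    def __init__(self, prefill_depth):
        self.prefill_depth = prefill_depth  # packets to hold before play-out starts
        self._slots = {}                    # sequence number -> payload
        self._next_seq = None               # next sequence number due for play-out

    def push(self, seq, payload):
        """Called asynchronously, whenever a packet arrives from the network."""
        too_late = self._next_seq is not None and seq < self._next_seq
        if too_late or seq in self._slots:
            return False                    # play-out slot already passed, or duplicate
        self._slots[seq] = payload
        return True

    def pop(self):
        """Called at the fixed play-out rate; returns a payload, or None for a gap."""
        if self._next_seq is None:
            if len(self._slots) < self.prefill_depth:
                return None                 # still filling to the initial depth
            self._next_seq = min(self._slots)
        seq = self._next_seq
        self._next_seq += 1
        return self._slots.pop(seq, None)   # None marks a lost or late packet


buf = JitterBuffer(prefill_depth=3)
for seq, payload in [(2, "b"), (1, "a"), (4, "d")]:  # out-of-order arrival, 3 missing
    buf.push(seq, payload)
played = [buf.pop() for _ in range(4)]
print(played)  # ['a', 'b', None, 'd']
```

In this sketch the initial fill appears as the `None` results returned until `prefill_depth` packets are queued, which corresponds to the fixed delay noted above; a packet that arrives after its slot has been played is rejected by `push`.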
For voice data traveling through packet-switched networks, users begin to perceive delay as it approaches 200 milliseconds, a figure typically associated with, for example, very long distance circuit-switched calls. A delay of more than 200 milliseconds can begin to affect the natural feel and dynamics of the conversation. Because of the duplex and bursty nature of voice, it may be acceptable to occasionally drop late packets without affecting the overall perceived quality of the call. In some cases, then, voice packets may be judiciously dropped to reduce the delay experienced by a media stream. However, this approach is limited in its effectiveness. Additionally, for other media data such as modem signals, the end-to-end delay is less important than smoothing the reconstructed signal and preventing the discontinuities that can result from late or out-of-order packet arrival.
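As a rough illustration of dropping voice packets to bound delay, the following sketch discards the oldest queued packets whenever the backlog exceeds a delay budget. The function name `trim_backlog`, the 20 millisecond packet duration, and the oldest-first drop policy are assumptions made for the example; only the 200 millisecond figure comes from the discussion above.

```python
from collections import deque


def trim_backlog(queue, packet_ms, max_delay_ms):
    """Drop the oldest queued packets until the backlog fits the delay budget.

    Hypothetical helper: assumes every packet carries packet_ms of audio.
    Returns the number of packets dropped.
    """
    max_packets = max_delay_ms // packet_ms
    dropped = 0
    while len(queue) > max_packets:
        queue.popleft()  # oldest packet is discarded, skipping play-out forward
        dropped += 1
    return dropped


queue = deque(range(12))  # 12 packets of 20 ms each: 240 ms of queued audio
dropped = trim_backlog(queue, packet_ms=20, max_delay_ms=200)
print(dropped, len(queue) * 20)  # 2 200
```

Each dropped packet is audible as a small gap in voice, and in a modem or facsimile stream it would corrupt data, which is consistent with the limited effectiveness noted above.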
Therefore, it is important both to minimize any delay introduced by intermediate processing devices in an end-to-end routing path of a packet media stream and to provide the best-quality reconstructed signal by re-ordering packets where necessary.