Telephony devices, such as telephones, analog fax machines, and data modems, have traditionally utilized circuit-switched networks to communicate. With the current state of technology, it is desirable for telephony devices to communicate over the Internet, or other packet-based networks. Heretofore, realization of an integrated system for interfacing various telephony devices over packet-based networks has been difficult due to the different modulation schemes of the telephony devices.
Network traffic for voice-over-Internet-protocol (VoIP) service consists of a stream of speech data packets, each providing a limited amount of speech playback time. In order to provide continuous speech playback, packets must arrive at regular intervals. The time that a packet takes to traverse the network varies, however, and is a function of a number of factors including, but not limited to, the number of nodes, the speed of the communications links, and the queuing delay that occurs at each node in the path. Variations in network delay, normally referred to as ‘delay jitter,’ occur as a part of normal packet network operation. Estimating network delay jitter is a challenging problem, since delay jitter can change quickly, while delay jitter estimators typically adapt more slowly.
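The lag between actual delay jitter and its estimate can be illustrated with a minimal sketch. The smoothing rule used here is the interarrival jitter calculation from RFC 3550 (RTP), chosen only as a familiar example of an internally generated estimate; the present text does not name a specific estimator. The function name, the millisecond units, and the sample timestamps are assumptions made for illustration.

```python
def interarrival_jitter(send_times, recv_times, gain=1.0 / 16.0):
    """Return the running jitter estimate after each packet.

    Illustrative sketch only. send_times and recv_times are per-packet
    timestamps in the same units (assumed here to be milliseconds).
    The update rule follows RFC 3550: J += (|D| - J) / 16, where D is
    the change in transit time between consecutive packets.
    """
    estimates = []
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            # First-order smoothing: only a fraction of each new
            # observation is absorbed, so the estimate adapts slowly.
            jitter += gain * (d - jitter)
        estimates.append(jitter)
        prev_transit = transit
    return estimates


if __name__ == "__main__":
    # Hypothetical stream: one packet every 20 ms whose network delay
    # alternates between 20 ms and 40 ms, so |D| is 20 ms throughout.
    send = [20 * i for i in range(11)]
    recv = [s + (20 if i % 2 == 0 else 40) for i, s in enumerate(send)]
    for i, j in enumerate(interarrival_jitter(send, recv)):
        print(f"packet {i}: jitter estimate = {j:.2f} ms")
```

In this hypothetical stream the transit time changes by 20 ms on every packet, yet the estimate is still well below 20 ms after ten packets, which is the slow-adaptation behavior described above.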
VoIP equipment compensates for variations in the network delay by queuing sufficient speech packets in a “jitter buffer.” The proper sizing of the jitter buffer and the management of speech playback depend upon having an accurate estimate of network delay jitter. Existing VoIP equipment depends upon internally generated estimates of network delay jitter, and operates without the benefit of external feedback. Internally generated network delay jitter estimates may cause the algorithms that manage the jitter buffer to underestimate or overestimate the amount of speech data that is required, resulting in repeated or dropped speech frames. The repeating or dropping of speech frames typically generates audio impairments that become increasingly evident and unacceptable as the number of repeated and dropped frames rises.
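The effect of jitter buffer sizing can be sketched with a simple fixed-delay playout model. This is an assumed, simplified model rather than the behavior of any particular VoIP product: each packet is scheduled for playback a fixed `playout_delay` after it was sent, and a packet that arrives after its deadline is counted as late, which in practice would force the player to repeat or otherwise substitute a frame. The function name and the sample timings are hypothetical.

```python
def simulate_playout(send_times, recv_times, playout_delay):
    """Count late packets for a fixed playout delay.

    Illustrative sketch only. Packet i is due for playback at
    send_times[i] + playout_delay. Returns (late, max_wait), where
    late is the number of packets that missed their deadline and
    max_wait is the longest time any on-time packet waited in the
    buffer before playback.
    """
    late = 0
    max_wait = 0
    for sent, received in zip(send_times, recv_times):
        deadline = sent + playout_delay
        if received > deadline:
            late += 1  # frame unavailable when the player needs it
        else:
            max_wait = max(max_wait, deadline - received)
    return late, max_wait


if __name__ == "__main__":
    # Hypothetical 20 ms packet stream with one delay spike (70 ms).
    send = [0, 20, 40, 60, 80]
    delays = [30, 35, 70, 32, 31]
    recv = [s + d for s, d in zip(send, delays)]
    for playout_delay in (40, 80):
        late, max_wait = simulate_playout(send, recv, playout_delay)
        print(f"playout delay {playout_delay} ms: "
              f"{late} late packet(s), max buffer wait {max_wait} ms")
```

Under these assumptions, a buffer sized from an underestimate of jitter misses the delayed packet, while a buffer sized from an overestimate avoids the miss only by holding every packet longer than necessary.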
An additional problem with existing VoIP operation relates to echo cancellation and suppression. There are several sources of echo that degrade the quality of an Internet protocol (IP) telephony connection, including the electronic hybrid circuit that converts the four-wire path used within the transmission network to the two-wire path used in the public switched telephone network loop, and the acoustic echo caused by the coupling of audio from the receiver to the transmitter of the voice terminal. The impact of these echo sources on call quality is primarily a function of the round-trip delay of the path between the parties of interest. If the round-trip delay is short, echo is indistinguishable from sidetone. In systems with far-end echo cancellers, the echo cancellers and suppressors are typically initialized with a predetermined bulk or round-trip delay value, and the round-trip delay estimate is then allowed to converge to the actual network round-trip delay during operation. During the convergence period, or when network round-trip delay changes, the echo canceller and suppressor perform sub-optimally, resulting in echo that is audible to the call participants.
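The cost of starting from a predetermined round-trip delay value can be illustrated with a deliberately simplified model. The sketch below assumes, purely for illustration, that the delay estimate moves a fixed fraction of the remaining error toward the actual round-trip delay on each update; actual echo cancellers and suppressors may track delay by entirely different means. The function name, the `step` and `tolerance_ms` parameters, and the sample values are hypothetical.

```python
def updates_to_converge(initial_ms, actual_ms, step=0.1, tolerance_ms=5.0):
    """Count updates until the delay estimate is within tolerance.

    Illustrative sketch only. Assumes first-order smoothing: each
    update moves the estimate by `step` times the remaining error.
    While the estimate is outside tolerance, the echo canceller is
    assumed to be operating sub-optimally.
    """
    if not 0.0 < step <= 1.0:
        raise ValueError("step must be in the interval (0, 1]")
    estimate = float(initial_ms)
    updates = 0
    while abs(actual_ms - estimate) > tolerance_ms:
        estimate += step * (actual_ms - estimate)
        updates += 1
    return updates


if __name__ == "__main__":
    # Hypothetical values: preset delay of 50 ms, actual delay of 180 ms.
    print(updates_to_converge(50, 180))
    # An initial value that already matches the network needs no convergence.
    print(updates_to_converge(180, 180))
```

The further the predetermined value lies from the actual network round-trip delay, the more updates elapse before the estimate is usable, and a change in network delay during a call restarts that period.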
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.