Traditional circuit-switched communication networks have provided a variety of voice services to end users for many years. A more recent trend is to deliver these voice services, along with other services such as video and data, over networks that communicate information in packets. These packet-switched networks allocate bandwidth dynamically and can be either connectionless networks with no dedicated path or connection-oriented networks with virtual circuits having dedicated bandwidth along a predetermined path. Because packet-switched networks allow traffic from multiple users to share communication links, these networks use available bandwidth more efficiently than circuit-switched networks.
An Internet Protocol (“IP”) network is an example of a connectionless packet-switched network that breaks up data streams, such as voice, video, or data, into addressable packets. Each IP packet includes source and destination addresses and traverses any available route between the source and destination. The IP packets are transmitted independently and then reassembled in the proper sequence at the destination.
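The packetize-and-reassemble behavior described above can be illustrated with a minimal sketch. The `Packet` type, its `seq` field, and the helper names are hypothetical and chosen only for illustration; in practice the ordering information is typically carried by a higher-layer protocol such as TCP or RTP rather than by IP itself.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    """Hypothetical addressable packet; 'seq' is an illustrative ordering field."""
    src: str
    dst: str
    seq: int
    payload: bytes


def packetize(data: bytes, src: str, dst: str, size: int = 4) -> List[Packet]:
    """Split a byte stream into fixed-size packets carrying source and destination addresses."""
    return [
        Packet(src, dst, seq, data[offset:offset + size])
        for seq, offset in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Restore the original stream by sorting packets on their sequence field."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


if __name__ == "__main__":
    stream = b"hello, packet world"
    sent = packetize(stream, "10.0.0.1", "10.0.0.2")
    random.shuffle(sent)  # packets may arrive in any order over independent routes
    print(reassemble(sent) == stream)
```

Shuffling the list stands in for packets taking different routes; the destination recovers the stream regardless of arrival order.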
For voice traffic, packets are formatted and transmitted using the voice over IP (“VoIP”) protocol. Unlike synchronous strata clock schemes in traditional circuit-switched networks, VoIP schemes use independent, free-running clocks for analog-to-digital and digital-to-analog conversions at the source and destination of a voice call. Over the course of a voice call, this clock independence eventually causes either a build-up of packets or a starvation of packets at the receiving end. Either condition severely degrades quality of service (“QoS”) of VoIP data streams.
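To give a sense of scale for this drift, the following sketch estimates how long two free-running clocks take to drift apart by one packet's worth of samples. The 8 kHz sampling rate, 20 ms packet size, and 100 ppm clock offset are illustrative assumptions, not values taken from the text, and the function names are hypothetical.

```python
def seconds_until_slip(nominal_hz: float, ppm_offset: float, samples_per_packet: int) -> float:
    """Time for the sender/receiver sample counts to differ by one full packet.

    ppm_offset is the sender clock's error relative to the receiver clock,
    in parts per million (an assumed way of expressing the mismatch).
    """
    drift_samples_per_second = nominal_hz * abs(ppm_offset) * 1e-6
    if drift_samples_per_second == 0:
        return float("inf")  # perfectly matched clocks never slip
    return samples_per_packet / drift_samples_per_second


def drift_effect(ppm_offset: float) -> str:
    """Classify the receiver-side symptom of the clock mismatch."""
    if ppm_offset > 0:
        return "build-up"      # sender produces samples faster than the receiver consumes them
    if ppm_offset < 0:
        return "starvation"    # receiver consumes samples faster than the sender produces them
    return "none"


if __name__ == "__main__":
    # Assumed example: 8 kHz audio, 160 samples (20 ms) per packet, +100 ppm offset.
    t = seconds_until_slip(8000, 100, 160)
    print(f"{drift_effect(100)} of one packet after about {t:.0f} s")
```

Under these assumed numbers the receiver gains or loses one packet roughly every 200 seconds, so even a modest clock mismatch becomes audible within a call of ordinary length unless it is corrected.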
To enhance QoS for a VoIP connection, voice activity detection (“VAD”) and comfort noise generation (“CNG”) schemes have traditionally measured speech energy at the transmitting side and decided whether or not to send packets to the receiving end based on a speech/no-speech decision. The receiving end has traditionally used the null time periods between speech utterances to adjust for time base discrepancies between the sending and receiving ends. In addition, the receiving side has provided some form of CNG during silent periods to keep the user from thinking the line has dropped. These schemes, however, are prone to level and spectral mismatches that are created by user adjustments and that lower the quality of the call.
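The transmit-side speech/no-speech decision described above can be sketched as a simple energy threshold test. This is only a minimal illustration: the mean-square energy measure, the threshold value, and the function names are assumptions, and practical VAD schemes are considerably more elaborate.

```python
from typing import Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean-square energy of one frame of samples normalized to [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def should_send(frame: Sequence[float], threshold: float = 1e-3) -> bool:
    """Speech/no-speech decision: transmit the frame only if its energy reaches the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    return frame_energy(frame) >= threshold


if __name__ == "__main__":
    speech_like = [0.5, -0.5, 0.4, -0.4]
    near_silence = [0.001, -0.001, 0.001, -0.001]
    print(should_send(speech_like), should_send(near_silence))
```

Frames that fall below the threshold are withheld, which is what creates the null periods the receiving end uses for time base adjustment and fills with comfort noise.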