Teleconferences are generally conducted over packet networks. Voice transmission over packet networks is subject to packet loss and delay variation, the latter of which is commonly known as “jitter.” In Internet Protocol (IP)-based networks, the fixed component of the delay can be attributed to algorithmic, processing and propagation delays, the last of which depend on the transmission medium and distance, whereas the variable component may be caused by fluctuations in IP network traffic, different transmission paths over the Internet, etc. VoIP (voice over Internet Protocol) receivers generally rely on a “jitter buffer” to counter the negative impact of jitter. By introducing an additional delay between the time a packet of audio data is received and the time that the packet is reproduced, a jitter buffer aims to transform the uneven flow of arriving packets into a regular flow of packets, such that delay variations will not cause perceptual sound quality degradation for the end users.
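The jitter buffer behavior described above can be illustrated with a minimal sketch. It assumes a fixed (non-adaptive) buffer delay, a 20 ms packet duration and a 60 ms buffer delay; these values, the function name `playout_schedule` and the `(seq, arrival_ms)` packet representation are hypothetical choices made for illustration and are not taken from the text.

```python
def playout_schedule(arrivals, frame_ms=20.0, buffer_delay_ms=60.0):
    """Map each packet to a regular playout time, or None if it arrives too late.

    arrivals: list of (seq, arrival_ms) tuples, in any order.
    frame_ms: assumed audio duration carried by one packet (illustrative value).
    buffer_delay_ms: assumed fixed delay added by the jitter buffer (illustrative value).
    """
    if not arrivals:
        return {}

    # Anchor the playout clock on the earliest-arriving packet.
    first_seq, first_arrival = min(arrivals, key=lambda p: p[1])

    schedule = {}
    for seq, arrival_ms in arrivals:
        # Packets are reproduced at evenly spaced instants, one frame apart,
        # regardless of how unevenly they arrived.
        playout_ms = first_arrival + buffer_delay_ms + (seq - first_seq) * frame_ms
        # A packet that arrives after its playout instant cannot be reproduced
        # on time, so this sketch treats it as lost.
        schedule[seq] = playout_ms if arrival_ms <= playout_ms else None
    return schedule


if __name__ == "__main__":
    # Packets sent every 20 ms but arriving with uneven delays.
    arrivals = [(0, 0.0), (1, 35.0), (2, 41.0), (3, 130.0)]
    for seq, t in playout_schedule(arrivals).items():
        print(seq, "late" if t is None else f"play at {t} ms")
```

In this sketch, a larger `buffer_delay_ms` absorbs more delay variation at the cost of added end-to-end latency, which is the trade-off a jitter buffer has to balance.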
Maintaining adequate acoustic quality during teleconferences can also be challenging. Teleconference participants may be in a variety of environments and may use fixed endpoint terminals or mobile endpoint terminals, such as cellular telephones, smartphones, etc. Some environments may have a substantial amount of background noise, the intensity of which may vary over time. The environments may also cause acoustic echoes. It would be desirable to have improved methods and devices for monitoring voice quality during teleconferences.