The present invention relates to packet voice communication systems, and more particularly, to a method and apparatus for enhancing the quality of service of packetized voice conversations over a Local Area Network (LAN) when the Ethernet standard is used as an access mechanism.
Originally, local area networks were designed for interconnecting data terminals, such as work stations and servers, for the transmission of data. Increasingly, however, local area networks are utilized to transmit voice signals between voice stations that are connected to the LAN using packet phone adapters (PPAs). In the packet-based environment of a LAN, audio information is transmitted in packets. The voice station associated with the calling party samples the voice of the speaker, converts the sampled voice signal from an analog to digital format, organizes the sampled digital signal into packets, implements compression techniques, if desired, and then transmits the signal over the Ethernet medium.
Local area networks based on the IEEE 802.3 standard use a Carrier Sense Multiple Access/Collision Detection (CSMA/CD) mechanism to enable multiple stations to share the same infrastructure, such as a shared segment or a hub. A station connected to the LAN and having a packet to send first listens to the shared medium. If the channel is clear (no other station is transmitting), the station starts transmitting while continuing to listen. If the station detects a collision while transmitting, it jams the medium, aborts the transmission, and reschedules the transmission for a later time.
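The rescheduling step can be illustrated with a minimal Python sketch of truncated binary exponential backoff, the retry rule that IEEE 802.3 defines for CSMA/CD. The backoff details are not stated in this description; the function name `backoff_slots` and the constant names are chosen here for illustration only.

```python
import random

# Values from IEEE 802.3 CSMA/CD; not stated in the text above.
MAX_ATTEMPTS = 16    # the frame is discarded after 16 failed attempts
BACKOFF_LIMIT = 10   # the random range stops doubling after 10 collisions


def backoff_slots(collisions, rng=random):
    """Return how many slot times to wait after the given number of
    consecutive collisions on the same frame (hypothetical helper)."""
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("excessive collisions: frame is discarded")
    k = min(collisions, BACKOFF_LIMIT)
    # Uniform integer in [0, 2**k - 1].
    return rng.randrange(2 ** k)
```

Because the wait is random and its range doubles with each successive collision, the time a packet needs to get onto a busy segment varies widely from one packet to the next.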
As a result of this collision detection access mechanism, packets can incur random and bursty transmission delays through the LAN, which can adversely affect the quality of the voice conversation. At the voice station associated with the called party, the voice packets are passed to a decoder. Generally, the decoder expects to receive packets from the network interface and plays them out as soon as they are received. However, due to the potential random delays, the decoder might not receive a packet at the appropriate time. In this situation, the packet is considered lost and the decoder might play out silence or interpolate a sample from two previous packets. Frequent packet losses result in poor voice quality.
A number of techniques have been proposed or suggested to compensate for such transmission delays and improve the quality of service in packet-based telephony. One such technique is the build-out delay technique where the receiver waits for an initial fixed period (the “build-out” delay) after receiving the first packet of a call before reconstructing and replaying the audio signal from the received packets. During reconstruction, the receiver can use sequence numbers (when available) from the received packets to synchronously schedule the received packets for play-out.
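The fixed build-out technique can be sketched in Python as follows. This is an illustrative sketch, not an implementation taken from this description: it assumes packets are generated at a constant interval, that every packet carries a sequence number, and that times are in milliseconds. The name `schedule_playout` and its parameters are hypothetical.

```python
def schedule_playout(packets, build_out, packet_interval):
    """Assign a play-out time to each packet using a fixed build-out delay.

    packets: list of (sequence_number, arrival_time) pairs; the first
             entry is the first packet of the call.
    Returns a dict mapping sequence number to its play-out time, or to
    None when the packet arrived after its play-out time (treated as lost).
    """
    first_seq, first_arrival = packets[0]
    schedule = {}
    for seq, arrival in packets:
        # Play-out instants are spaced exactly one packet interval apart,
        # starting build_out after the first packet arrived.
        playout = first_arrival + build_out + (seq - first_seq) * packet_interval
        schedule[seq] = playout if arrival <= playout else None
    return schedule


# Example: 20 ms packets, 30 ms build-out; the third packet is badly delayed.
result = schedule_playout([(0, 0), (1, 25), (2, 95)], build_out=30, packet_interval=20)
```

In this example the third packet is due at 70 ms but arrives at 95 ms, so it is marked as lost.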
Unfortunately, the use of sequence numbers, by themselves, and a fixed build-out delay for the entire call does not eliminate the distortion in the packet voice system due to packet delays and excessive packet losses. Thus, it has been proposed to dynamically change the build-out delay throughout the duration of the call. For example, the build-out delay change can take place at the beginning of a talk-spurt (a talk-spurt is a sequence of audio packets between two silence durations).
It is important to determine the build-out delay that results in the desired packet loss, and there is a trade-off between delay and packet loss. At one extreme, the build-out delay can be set to a large value, allowing the receiver to accumulate all the voice packets and resulting in zero packet loss. However, such a large build-out delay eliminates the interactive nature of the voice conversation. At the other extreme, the build-out delay can be set to zero, resulting in a larger than desired packet loss. Accordingly, it is important to find the minimum build-out delay that keeps the packet loss at or below the maximum acceptable level. This is not an easy task, especially in a shared LAN (using the Ethernet standard as an access mechanism), where the random delays depend on many factors, such as the number of stations connected to the LAN, the intensity and mix of the traffic generated by these stations, the size of the packets, and the distance between the stations trying to communicate.
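The trade-off can be made concrete with a small Python sketch that works directly on a set of measured delays. It assumes a packet is lost whenever its variable delay exceeds the build-out delay; the helper names `loss_fraction` and `min_build_out` and the sample values are hypothetical, and the search is a plain empirical one rather than a technique taken from this description.

```python
def loss_fraction(delays, build_out):
    """Fraction of packets whose delay exceeds the build-out delay."""
    late = sum(1 for d in delays if d > build_out)
    return late / len(delays)


def min_build_out(delays, max_loss):
    """Smallest observed delay that, used as the build-out delay,
    keeps the loss fraction at or below max_loss."""
    for candidate in sorted(delays):
        if loss_fraction(delays, candidate) <= max_loss:
            return candidate
    return max(delays)


# Made-up delay samples in milliseconds, for illustration only.
samples = [1, 2, 3, 4, 10]
zero_delay_loss = loss_fraction(samples, 0)   # every packet is late
no_loss_delay = min_build_out(samples, 0.0)   # must cover the worst case
relaxed_delay = min_build_out(samples, 0.2)   # tolerate one late packet in five
```

With these samples, tolerating one late packet in five lets the build-out delay drop from 10 ms to 4 ms, which shows how a single outlier dominates the zero-loss setting.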
The techniques that have been proposed to compensate for the variable delays have been generic in nature and have not taken into consideration the nature and the distribution of the variable delays. A need therefore exists for a technique that compensates for such transmission delays based on the delay distribution. A further need exists for a method and apparatus that dynamically improves the voice quality under varying traffic conditions in a LAN environment. Yet another need exists for a method and apparatus for dynamically adjusting a jitter buffer depending on network delays in a LAN environment.
Generally, a method and apparatus are disclosed for dynamically adapting the play-out delay for voice packets as a function of the estimated packet delays in a Local Area Network using the Ethernet standard as an access mechanism, to improve the quality of packetized voice conversations. The present invention minimizes both the end-to-end voice delay and loss of packets due to late arrivals relative to their play-out time.
The present invention recognizes that packet delays in a LAN follow a log-normal distribution, and it also recognizes that the delays do not exhibit correlation, and therefore the marginal delay distribution is adequate in representing the actual delays. The present invention provides an adaptive play-out process to estimate the distribution of the packet delays using a log-normal distribution and applies a dynamic mechanism to adapt the play-out delay to the varying traffic conditions on the network.
The log-normal distribution is characterized by its mean and variance, which are obtained by evaluating packet arrival delays. According to another aspect of the invention, the size of the play-out buffer and the resulting packet loss rate can be determined to provide a desired quality of service. In an illustrative implementation, the size, B, of the play-out buffer is established to ensure that the packet loss does not exceed one percent (1%). This illustrative packet loss rate of 1% is deemed acceptable for the majority of codecs. The distribution parameters, such as average delay and standard deviation, are continuously updated, and the play-out buffer size, B, is modified dynamically according to the illustrative 99th percentile of the delay distribution.
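One way to compute such a percentile is sketched below in Python. For a log-normal delay, the natural logarithm of the delay is normally distributed with some mean mu and standard deviation sigma, so the p-th quantile is exp(mu + z * sigma), where z is the standard normal quantile for p (about 2.326 for p = 0.99). The description does not specify how the parameters are updated; the exponentially weighted running estimates, the smoothing factor `alpha`, and the class name `LogNormalPlayout` are assumptions made for this sketch.

```python
import math
from statistics import NormalDist


class LogNormalPlayout:
    """Tracks log-delay statistics and derives a play-out buffer size B."""

    def __init__(self, loss_target=0.01, alpha=0.05):
        # Standard normal quantile for the (1 - loss_target) percentile.
        self.z = NormalDist().inv_cdf(1.0 - loss_target)
        self.alpha = alpha   # assumed smoothing factor, not from the text
        self.mu = None       # running mean of ln(delay)
        self.var = 0.0       # running variance of ln(delay)

    def update(self, delay):
        """Fold one measured packet delay (must be > 0) into the estimates."""
        x = math.log(delay)
        if self.mu is None:
            self.mu = x
            return
        diff = x - self.mu
        # Exponentially weighted mean and variance updates.
        self.mu += self.alpha * diff
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * diff * diff)

    def buffer_size(self):
        """Delay value that the fitted distribution exceeds with
        probability loss_target."""
        return math.exp(self.mu + self.z * math.sqrt(self.var))
```

A receiver following this sketch would call `update` for each measured delay and read `buffer_size()` whenever it adjusts B, for example at the start of a talk-spurt.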