1. Field of the Invention
A system and method are disclosed for the non-intrusive determination of the subjective quality of a packetized real-time media data stream in a packet-switching network without interrupting the flow of the media stream or adding a known test signal in the packet-switching network.
2. Description of the Related Art
In recent years, packet-switching networks have been used increasingly to transport real-time media such as voice and video signals, which may be delivered either in real time or with a delay. The signals originate as either analog or digital signals. A specific example is the increasing use of the Internet for carrying Voice over Internet Protocol (VoIP) calls. In a VoIP call, a digitally encoded voice signal is packetized into an Internet Protocol (IP) packet stream that is transmitted over the Internet to a destination device. At the destination device, the digitally encoded voice signal is extracted from the VoIP packet payloads in the packet stream and then decoded into a signal that is played out in real time to the user at the destination.
When a packetized real-time media stream is transmitted across a packet-switching network, the packet stream may be corrupted by a number of network impairments. Examples of network impairments include packet discarding at routers due to packet bit errors, packet dropping at interface buffers due to traffic congestion, packet duplication, packet delay beyond the packet's hard or soft real-time deadline at its destination, packet misrouting, and loss of packet sequence. These impairments generally degrade the quality of the media signal that is eventually received at the destination.
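As an illustration of how several of these impairments can be recognized at a destination, the following minimal sketch counts lost, duplicated, late, and out-of-sequence packets from per-packet arrival records. It assumes each packet carries a sender-assigned sequence number and that a one-way delay is available for each arrival; the names `Arrival`, `classify_impairments`, and `deadline_ms` are hypothetical and do not come from this disclosure. Bit-level corruption and misrouting are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Arrival:
    seq: int         # sender-assigned sequence number (assumed to exist)
    delay_ms: float  # one-way delay seen at the destination (assumed known)


def classify_impairments(arrivals, first_seq, last_seq, deadline_ms):
    """Count lost, duplicated, late, and out-of-sequence packets.

    arrivals    -- Arrival records in the order they reached the destination
    first_seq   -- first sequence number the sender transmitted
    last_seq    -- last sequence number the sender transmitted
    deadline_ms -- play-out deadline; later packets are useless to the decoder
    """
    seen = set()
    duplicated = late = out_of_order = 0
    highest = None  # highest sequence number accepted so far

    for a in arrivals:
        if a.seq in seen:
            duplicated += 1      # a second copy of a packet already received
            continue
        seen.add(a.seq)
        if a.delay_ms > deadline_ms:
            late += 1            # arrived, but after its real-time deadline
        if highest is not None and a.seq < highest:
            out_of_order += 1    # overtaken by a packet that was sent later
        else:
            highest = a.seq

    expected = set(range(first_seq, last_seq + 1))
    lost = len(expected - seen)  # sent but never seen at the destination
    return {
        "lost": lost,
        "duplicated": duplicated,
        "late": late,
        "out_of_order": out_of_order,
    }
```

For example, if packets 1 through 6 are sent and the destination sees the sequence 1, 2, 2, 4, 3, 6, with packet 3 arriving after the deadline, the sketch reports one lost packet (5), one duplicate, one late packet, and one out-of-sequence packet.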
Due to network impairments that may be encountered in the transmission of real-time media over a packet-switching network, it is important to be able to measure and monitor the quality-of-service (QoS) that is being provided by the network. Typical network QoS measures include, for example, end-to-end packet delay, end-to-end packet delay jitter, packet corruption, and packet loss. To monitor such network QoS measures, one can deploy commercially available monitoring systems.
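The network QoS measures named above can be sketched as simple statistics over per-packet timestamps. The example below is illustrative only: it assumes the sender and receiver clocks are synchronized, defines jitter as the mean absolute difference between consecutive one-way delays (one of several definitions in use), and uses the hypothetical name `qos_summary`. It is not the measurement procedure of any particular monitoring system.

```python
from statistics import mean


def qos_summary(packets_sent, records):
    """Summarize delay, jitter, and loss for one packet stream.

    packets_sent -- number of packets the source transmitted
    records      -- list of (send_time_ms, recv_time_ms) pairs, one per packet
                    that actually arrived, in arrival order
    """
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")

    # One-way delay of each received packet (assumes synchronized clocks).
    delays = [recv - send for send, recv in records]

    # Jitter: mean absolute change in delay between consecutive packets.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]

    return {
        "mean_delay_ms": mean(delays) if delays else None,
        "jitter_ms": mean(diffs) if diffs else 0.0,
        "loss_ratio": 1 - len(records) / packets_sent,
    }
```

With five packets sent and four received with delays of 20, 30, 20, and 30 ms, this gives a mean delay of 25 ms, a jitter of 10 ms, and a loss ratio of 0.2. Note that none of these figures says how the call sounded to the listener.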
Although measurement and monitoring of network QoS provides valuable information regarding the ability of a network to properly support transmission of real-time media signals, such measures do not directly reflect the subjective quality of the media signal as actually perceived by an end-user. This is because the subjective quality of a real-time media signal, as perceived by the end-user, is difficult to quantify in terms of the network QoS measures. To deal with this general problem, objective methods have been developed for estimating the subjective quality of media signals. For example, perceptual speech quality measurement (PSQM) is a means for objectively assessing the quality of speech that has been degraded by a telephony network. It has a high correlation to subjective quality across a range of distortion types, and is used to test networks that are subject to different coding types and transmission errors. PSQM is used primarily to test networks that employ speech compression, digital speech interpolation, and packetization. PSQM has been recommended by the International Telecommunication Union (ITU) as an objective method for voice signals and is specified in ITU-T Recommendation P.861.
FIG. 1 depicts the basic PSQM approach that has been adopted to estimate the subjective quality of a VoIP call that traverses a packet-switching network. In this objective method, a known artificial voice signal 10 is transmitted across the network 12. An artificial voice signal for use in PSQM may be stored in a commonly used file format such as, for example, a wave (.WAV) file. The process of transmitting a known signal to evaluate the degradation in quality after it has traversed a network may be termed an active method, since a known test signal is actually injected into and transmitted across the network.
PSQM uses a psychoacoustic model 14 that aims to mimic the perception of sound in real life, and was originally developed to test compressor/decompressors (codecs). A codec is a component that translates video or audio between its uncompressed form and the compressed form in which it is stored or transmitted. The algorithm functions by comparing the original signal 10 with the version of that signal that has passed through a coder 16 and decoder 18.
PSQM provides an output 20 in the range of 0 to 6.5, where values close to 0 indicate very good speech quality, and values close to 6.5 indicate poor speech quality. At the destination, a quality measure or score (e.g., a mean opinion score (MOS)) is computed 22 from the received signal and the known artificial voice signal that was transmitted, and the score is output 24. Although PSQM values do not correspond directly to MOS values, the subjective quality is nevertheless inferred from the objective quality. That is, a person who listens to a speech sample that has a PSQM value of 2 would judge its quality to be worse than that of a speech sample having a PSQM value of 1. PSQM values can be roughly translated into MOS values.
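To make the direction of the two scales concrete, the following sketch maps a PSQM value onto the conventional five-point MOS scale by straight-line interpolation. This is a placeholder only: the text gives no translation formula, the true relationship is not linear, and the function name `psqm_to_mos_linear` and the choice of endpoints are assumptions made for illustration. The only properties it is meant to show are that the PSQM range is 0 to 6.5 and that a lower PSQM value corresponds to a higher MOS.

```python
PSQM_BEST, PSQM_WORST = 0.0, 6.5  # PSQM output range stated in the text
MOS_BEST, MOS_WORST = 5.0, 1.0    # conventional five-point MOS scale


def psqm_to_mos_linear(psqm):
    """Illustrative, NOT standardized: linear PSQM-to-MOS placeholder.

    Maps PSQM 0 (very good) to MOS 5 and PSQM 6.5 (poor) to MOS 1.
    A real translation would be fitted to subjective test data.
    """
    if not PSQM_BEST <= psqm <= PSQM_WORST:
        raise ValueError("PSQM value must lie between 0 and 6.5")
    # Fraction of the way from the best to the worst PSQM value.
    fraction = (psqm - PSQM_BEST) / (PSQM_WORST - PSQM_BEST)
    return MOS_BEST - fraction * (MOS_BEST - MOS_WORST)
```

Under this placeholder, a sample with a PSQM value of 1 receives a higher MOS estimate than a sample with a PSQM value of 2, which matches the ordering described above.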
Accordingly, there is a need for an accurate subjective measurement algorithm for determining the mean opinion score and quality of VoIP packets.