a. Field of the Invention
The present invention relates to the problem of estimating the effect of packet transmission impairments, including packet loss, on the subjective quality of a video transmission where frames of data relating to the same video frame or field are permitted to span more than one packet.
The invention has particular application in a class of multimedia quality prediction models that predict the effect of packet transmission impairments on the perceived quality of a media stream.
In “VoIP Quality Assessment: Taking Account of the Edge-Device”, S R Broom, IEEE Trans on Audio, Speech and Language Processing, Vol 14, No. 6, November 2006, pp 1977-1983, Broom describes a model that predicts the effect of packet transmission impairments on the perceived, or subjective, quality of a voice over internet protocol (VoIP) call. The prediction is based on passive analysis of the packet stream carrying the voice data and can be performed at multiple locations in the network and without interference to the traffic. This type of measurement is referred to as passive or non-intrusive because it does not require a special test signal to be injected into the link being monitored and can be used on live traffic. The model is based on a set of parameters that are derived from the packet stream and which are combined to form a prediction of the voice quality. A process called calibration can be used to optimise the parameter combination for a particular VoIP endpoint, or a generic combination can be derived. The calibration process is based on large numbers of simulated calls made through the endpoint being calibrated, and uses an active or intrusive voice quality measurement algorithm such as ITU-T P.862 (PESQ) to measure their quality.
The general architecture described by Broom has been extended to predict the effect of packet transmission impairments on the perceived quality of a video transmission. The calibration process is very similar to the VoIP case, but uses an active or intrusive video quality model rather than P.862. Some of the model parameters are the same as those in the VoIP model, for example mean packet loss and mean packet delay variation (jitter); others have been developed to specifically address the problem of measuring video quality.
The present invention provides a degradation parameter, derived from packet loss measurements, that correlates well with subjective video quality and therefore has application in video models such as that described above.
When trying to accurately assess video quality degradations due to packet loss, many issues appear, especially in systems where frames of data relating to the same video frame or field are permitted to span more than one packet. Factors that can influence the importance of a lost packet include:
    the type of video frame that is subject to packet loss, e.g. intra, predicted or bi-directionally predicted;
    the distance between intra-frames;
    the position of the packet loss in the frame; and
    any packet loss concealment algorithm implemented in the video endpoint.
The problem is to accurately model the effects of packet loss on perceived video quality and to correctly take into account the factors mentioned above in a simple and generic way that can be applied to any type of packet video transmission.
b. Related Art
In “MPEG video streamed over an IP-based network with packet loss”, Neve et al, 4th FTW PHD Symposium, Interactive poster session, paper nr. 29, Gent, Belgium, Dec. 3, 2003, the authors observe that the quality impairments produced by packet loss are more pronounced at high bit-rates. It is suggested that this is due to the fact that as the video coding bit-rate increases, the data from each frame occupies a larger number of packets and is therefore more likely to suffer from a lost packet. However, this document does not propose a method to take the effect of packet loss on video quality into account.
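The bit-rate effect observed by Neve et al can be illustrated with a short calculation. The sketch below is not taken from the cited paper; it assumes, purely for illustration, that packets are lost independently with a fixed probability, and the function name `frame_hit_probability` is hypothetical. Under that assumption, the probability that a frame carried in n packets loses at least one of them is 1 - (1 - p)^n, which grows with n.

```python
def frame_hit_probability(packet_loss_rate: float, packets_per_frame: int) -> float:
    """Probability that at least one packet of a frame is lost.

    Assumes independent packet losses with a fixed loss rate, which is an
    illustrative simplification (real networks often lose packets in bursts).
    """
    if not 0.0 <= packet_loss_rate <= 1.0:
        raise ValueError("packet_loss_rate must be between 0 and 1")
    if packets_per_frame < 1:
        raise ValueError("packets_per_frame must be at least 1")
    return 1.0 - (1.0 - packet_loss_rate) ** packets_per_frame


if __name__ == "__main__":
    # Same 1% packet loss rate, increasing number of packets per frame
    # (i.e. increasing coding bit-rate for a fixed packet size).
    for n in (1, 5, 20):
        print(n, frame_hit_probability(0.01, n))
```

For a fixed packet loss rate, the fraction of frames affected therefore rises as each frame is spread over more packets, which is consistent with the explanation suggested in the cited document.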
In “Real-Time Monitoring of Video Quality in IP Networks”, Tao et al, NOSSDAV'05, June 13-14, Stevenson, Wash., USA, the authors model the effect of packet loss on video quality as a function of the codec and packetisation. This document suggests two models (one for MPEG-2 Video and one for H.264). Both models take into account the length of the loss burst, the average number of slices (where a “slice” represents part of a video frame) per packet and the average number of packets per frame. However, some of these inputs require access to the video payload, such as the number of slices per packet, and many effects are not taken into account, including frame type, the position of packet loss within a frame and the behaviour of the video endpoint.
In “Modeling Packet-Loss Visibility in MPEG-2 Video”, Kanumuri et al, IEEE Transactions on Multimedia, Vol. 8, No. 2, April 2006, the authors describe a model for estimating the visibility of packet loss in MPEG-2 video. Again, most of the factors described that affect the visibility of errors are extracted from the video payload.
A key limitation of the quality prediction methods described in the prior art is that in order to take into account the factors listed in the problem statement they require access to elements of the video packet payload. However, payload encryption is becoming increasingly common in packet transmission systems, for example to protect the copyright of video content or to ensure the privacy of people using a video conferencing system. In such cases, the payload of the video packets cannot be used to make an accurate estimation of the perceived quality degradation due to packet loss.
In contrast to the prior art, the present invention takes some of the factors described previously into account without using the video payload, and exploits the fact that even when encryption is used, the media transport protocol header (e.g. Real-time Transport Protocol (RTP), Real Data Transport (RDT) protocol or Moving Picture Experts Group Transport Stream (MPEG-TS)) is generally kept unencrypted.
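As an illustration of the kind of information that remains visible in an unencrypted transport header, the following sketch reads the 12-byte RTP fixed header and derives two payload-independent quantities: the number of received packets per frame and the sequence numbers that are missing. It is a minimal sketch and not the method of the invention. It assumes that packets belonging to the same video frame share an RTP timestamp (as is typical for RTP video payload formats) and that packets arrive in order; the helper names `parse_rtp_header`, `packets_per_frame` and `missing_sequence_numbers` are hypothetical.

```python
import struct
from collections import defaultdict
from typing import Dict, List, NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Read the 12-byte RTP fixed header; the payload is never inspected."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least a 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(b0 >> 6, bool(b1 & 0x80), b1 & 0x7F, seq, ts, ssrc)


def packets_per_frame(headers: List[RtpHeader]) -> Dict[int, int]:
    """Count received packets per timestamp (assumed: one timestamp per frame)."""
    counts: Dict[int, int] = defaultdict(int)
    for header in headers:
        counts[header.timestamp] += 1
    return dict(counts)


def missing_sequence_numbers(headers: List[RtpHeader]) -> List[int]:
    """List sequence numbers skipped between consecutive received packets.

    Assumes in-order arrival; handles the 16-bit sequence number wrap-around.
    """
    missing: List[int] = []
    for prev, cur in zip(headers, headers[1:]):
        gap = (cur.sequence_number - prev.sequence_number) % 65536
        missing.extend((prev.sequence_number + i) % 65536 for i in range(1, gap))
    return missing
```

Because only header fields are read, a sketch of this kind would work equally on streams whose payload is encrypted, provided the header itself is left in the clear.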