Today, new radio networks have enabled more bitrate-heavy services such as streamed multimedia (video) content and mobile TV. At the same time, TV over Internet Protocol (IP) has become a popular service in fixed communication networks. Along with this development there has been a growing emphasis on real-time assessment of video quality for such visual communication services. The methods for video quality assessment can be divided into subjective methods and objective methods. The subjective methods typically involve human assessors, who grade or score the video quality based on their subjective impressions, and the grades or scores obtained in this way are then used to assess the video quality. The objective methods, on the other hand, do not involve human assessors and assess the video quality using only information obtained from the video sequences.
The objective video quality assessment methods can be further classified into full-reference methods, reduced-reference methods, and no-reference methods. Full-reference models are available on the market and include, for example, Perceptual Evaluation of Video Quality by OPTICOM, the Optimacy tool from Genista Corporation, and products from Psytechnics Ltd and the National Telecommunications and Information Administration.
Both the full-reference methods and the reduced-reference methods need reference information about the original video (i.e. the video actually transmitted from the transmitting side) to conduct the video quality assessment and thus cannot be used for real-time in-service video quality assessment. The no-reference methods, on the other hand, do not require reference information about the original video. Instead, they make observations only on the decoded video (i.e. the video that has been received and decoded on the receiving side) and estimate the video quality using only the information observed there.
For a no-reference video quality assessment, two major sources of video quality decline should be taken into consideration. The first is coding and compression of the video source, and the second is data packet loss during transmission, i.e. during the streaming of the video content. Another source of video quality decline may be so-called packet jitter.
In an IP network, deterioration in perceived video quality is typically caused by data packet loss. Most packet losses result from congestion in network nodes: as congestion occurs and its severity increases, more and more packets are dropped by routers in the IP network. In a wireless communication network, poor radio conditions may also cause packet loss. The effect of packet loss is a major problem for real-time video transmission (streaming video). A measurement of the video quality decline caused by packet loss during transmission is referred to as a packet loss metric.
The streamed video is typically coded and compressed using codecs such as, for example, H.263, MPEG-4, H.264 and VC-1, which utilize temporal predictive coding to improve coding efficiency. Three types of frames are then commonly used: a) intra frames (I-frames), which do not use temporal prediction and serve as video refresh frames; b) predictive frames (P-frames); and c) bi-predictive frames (B-frames), which are predicted from one or more reference frames. Here, I-frames and P-frames usually act as reference frames, and if a part of a reference frame is lost, the resulting error tends to propagate in time until the next I-frame refreshes the video.
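The temporal propagation described above can be sketched with a small simulation. This is an illustrative model only, assuming a simple GOP in which B-frames are never used as references; the function name and GOP layout are hypothetical, not taken from any codec specification.

```python
# Hypothetical sketch: how a packet-loss error propagates through a GOP.
# Assumption: B-frames are non-reference frames; I- and P-frames are references.

def affected_frames(gop, loss_index):
    """Return indices of frames degraded when frame `loss_index` is hit by a loss.

    A loss in a reference frame (I or P) propagates to every subsequent
    frame until the next I-frame refreshes the picture; a loss in a
    non-reference B-frame affects only that frame itself.
    """
    if gop[loss_index] == "B":          # non-reference frame: no propagation
        return [loss_index]
    affected = [loss_index]
    for i in range(loss_index + 1, len(gop)):
        if gop[i] == "I":               # intra refresh stops the propagation
            break
        affected.append(i)
    return affected

gop = ["I", "B", "B", "P", "B", "B", "P", "B", "B", "I"]
print(affected_frames(gop, 3))  # loss in a P-frame spreads until the next I-frame
print(affected_frames(gop, 1))  # loss in a B-frame stays in that frame
```

In this toy model a loss in the P-frame at index 3 degrades frames 3 through 8, while a loss in a B-frame degrades only that single frame, which is why losses in reference frames dominate the perceived quality decline.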
A number of prior methods for calculating video deterioration due to packet loss have been proposed. One is based on estimating the number of lost macro-blocks for each frame type of a video stream. Another technique extracts the spatial distortion of each image in a video stream using differences between corresponding regions of two adjacent frames in the video sequence. The spatial distortion is weighted based on the temporal activity of the video, and the video quality is measured from the spatial distortions of all images in the sequence.
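A minimal sketch of the second idea, frame-difference distortion weighted by temporal activity, might look as follows. This is not the cited method itself; the functions, the activity measure, and the weighting formula are all illustrative assumptions.

```python
# Illustrative sketch (not the cited prior-art method): estimate a no-reference
# distortion score from differences between adjacent decoded frames, weighted
# by the overall temporal activity of the sequence. All names are hypothetical.

def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equal-sized grayscale frames."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))

def sequence_distortion(frames):
    """Sum of adjacent-frame differences, down-weighted when the sequence is
    highly active (large genuine motion can mask packet-loss artifacts)."""
    diffs = [mean_abs_diff(frames[i], frames[i + 1])
             for i in range(len(frames) - 1)]
    activity = sum(diffs) / len(diffs)   # crude temporal-activity measure
    weight = 1.0 / (1.0 + activity)      # assumed weighting choice
    return weight * sum(diffs)

# Three 2x2 grayscale frames: a small natural change, then a large "glitch".
frames = [[[10, 10], [10, 10]],
          [[10, 12], [10, 10]],
          [[10, 12], [90, 10]]]
print(sequence_distortion(frames))
```

Note that, unlike the block-based approaches discussed next, even this toy version touches every pixel of every frame, which hints at the computational cost of such methods.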
However, the aforementioned methods for calculating video deterioration need to process all the blocks in the image frames, which makes them computationally intensive and suboptimal for many real-time video transmission applications.