Video is a common media service nowadays and is rapidly gaining popularity among end users. Video delivered over a communication network as a stream of bits can experience communication constraints or problems which can lead to errors such as data loss or data delay. Such errors can cause a degradation in the visual quality experienced by the end users. Often, such visual degradation is revealed as frozen or distorted images.
Quality monitoring is an important operational scheme for service providers to maintain satisfactory Quality of Experience (QoE) of video services. To this end, good techniques are required to assess the video quality accurately and in a timely manner, which is by no means a trivial task.
Perhaps the most accurate video quality assessment technique is subjective testing, in which the opinions of numerous end viewers are collected and a common view of the quality is formed from them. However, subjective testing is expensive, manual, time consuming, and in many cases simply not feasible; for example, it is not possible to conduct subjective testing for video quality monitoring in communication networks or set-top boxes. Hence, means must be provided to estimate the subjective quality of video based on objective measurement. This demand has pushed the development of objective video quality assessment techniques. Objective video quality estimation can not only replace subjective testing in assessing video quality, but it can also enable real-time and automatic assessment of subjective quality.
In the context of the present disclosure, words like “estimate”, “assess” and “measure” are used interchangeably.
In recent years, different models of objective video quality techniques have been proposed and developed. Based on the input parameters used, they can be categorized as perceptual models and parametric models. Parametric models are also known as network layer models.
Perceptual models take the decoded video and sometimes also a reference video as input to the model. For example, for a so-called full-reference (FR) model, the input parameters are the pixel data of the source video as well as those of the decoded video. Perceptual models usually feature high accuracy in estimating the video quality, but they are highly complex and demanding in both time and computational power, which makes them unsuitable for many situations, such as real-time monitoring. Moreover, when a reference video is used in a perceptual model, the model must strive to solve the crucial problem of synchronizing the reference video with the decoded video.
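As an illustration of the full-reference principle only, the following minimal sketch compares the pixel data of one source frame with that of the corresponding decoded frame using peak signal-to-noise ratio (PSNR). PSNR is chosen here merely as a well-known pixel-based measure and is not a model named in the present disclosure; the function name `psnr`, the list-of-rows frame representation and the 8-bit peak value are assumptions made for the example. The sketch also assumes that the two frames are already synchronized, which is precisely the problem noted above.

```python
import math


def psnr(reference, decoded, max_value=255):
    """Return the PSNR in dB between two frames of equal size.

    Each frame is a list of rows, each row a list of pixel values
    (hypothetical representation; 8-bit samples assumed by default).
    """
    if len(reference) != len(decoded) or any(
        len(r) != len(d) for r, d in zip(reference, decoded)
    ):
        raise ValueError("frames must have identical dimensions")

    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for ref_px, dec_px in zip(ref_row, dec_row):
            squared_error += (ref_px - dec_px) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical frames: no measurable distortion
    return 10 * math.log10(max_value ** 2 / mse)
```

Even this simple measure must visit every pixel of every frame, which hints at why perceptual models are computationally demanding for real-time monitoring.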
The interest in light-weight video quality models has led to the recent development of parametric models. A parametric model takes packet header information as input and calculates a quality score based thereon. The network protocol header information is usually taken from one or more of Internet Protocol (IP), User Datagram Protocol (UDP), Real-time Transport Protocol (RTP) and Moving Picture Experts Group 2 Transport Stream (MPEG2TS) layers. Parametric models are relatively uncomplicated in comparison with perceptual models, but the estimation accuracy they can deliver is rather low due to the limited information that can be retrieved from the network protocol headers. For instance, it is very difficult to make a useful estimation of how visible a data loss will be in the decoded video.
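To make the kind of input available to a parametric model concrete, the following minimal sketch derives a packet loss ratio from the 16-bit sequence numbers carried in RTP headers. The function names `count_lost_packets` and `loss_ratio` are hypothetical, and the sketch assumes that the packets are observed in order; handling of reordered packets is omitted. How such a statistic would be mapped to a quality score is specific to each model and is not shown.

```python
RTP_SEQ_MODULUS = 1 << 16  # RTP sequence numbers are 16 bits and wrap around


def count_lost_packets(sequence_numbers):
    """Count missing packets from gaps between consecutive sequence numbers.

    Assumes in-order arrival (an assumption of this sketch); a duplicated
    packet yields a gap of zero and is not counted as a loss.
    """
    lost = 0
    for previous, current in zip(sequence_numbers, sequence_numbers[1:]):
        gap = (current - previous) % RTP_SEQ_MODULUS  # handles wraparound
        if gap > 1:
            lost += gap - 1
    return lost


def loss_ratio(sequence_numbers):
    """Return lost / (received + lost), or 0.0 if nothing was observed."""
    received = len(sequence_numbers)
    lost = count_lost_packets(sequence_numbers)
    total = received + lost
    return lost / total if total else 0.0
```

For example, `loss_ratio([1, 2, 4, 5])` reports one lost packet out of five. The result says nothing about which picture data the missing packet carried, which illustrates why the visibility of a loss is hard to estimate from header information alone.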