The increasing importance of live video services streamed over the internet has highlighted the need for methods that can accurately assess the quality of the video experienced by the end user. Network characteristics, such as packet loss and latency, can have a significant impact on the video quality. Accurate quality assessment is essential in the design, provisioning and testing of video transmission systems. Sensible balancing of factors such as video resolution, encoder profile, encoded bit rate, latency, error detection and error correction/recovery depends on an understanding of the end-user experience and, in particular, the perceived video quality.
For services unable to utilise retransmission to mitigate the effects of network losses, packet loss impairment (PLI) can have a major impact on the perceived video quality experienced by the end user. An example is an IP-based broadcast video system, where video can only be sent once and any packets lost during transmission have to be dealt with without the benefit of retransmission.
Techniques used for PLI assessment are categorised as follows: a) full-reference (FR), where source and degraded video sequences are analysed; b) picture buffer no-reference (NR-P), where only the decoded picture is analysed; c) bitstream no-reference (NR-B), where only the bitstream prior to decoding is analysed; and d) hybrid no-reference (NR-H), where both the bitstream and decoded picture information are analysed. The FR measures of mean squared error (MSE) and peak signal-to-noise ratio (PSNR) are popular for their convenient and tractable nature. Unfortunately, these measures have limited accuracy as indicators of perceived video quality. Improvements to these measures may be achieved by perceptually weighting the error signal according to expected visibility, where the weighting factors are determined by subjective tests. FR structural similarity (SSIM) based image quality assessment techniques compare the structures (information and properties from the visual scene) of the reference and distorted signals. SSIM uses measures of change in structural information as an approximation to perceived image distortion.
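As an illustration of the FR measures named above, the following minimal Python sketch computes MSE, PSNR, a weighted MSE and a simplified SSIM for a pair of greyscale frames held as lists of pixel rows. Several points are assumptions rather than anything specified in the text: the function names are hypothetical, the peak value of 255 assumes 8-bit samples, the per-pixel weights stand in for visibility weights that would in practice come from subjective tests, and the SSIM here is evaluated once over the whole frame rather than over local windows as in practical SSIM implementations.

```python
import math


def mse(ref, deg):
    """Mean squared error between two equally sized greyscale frames."""
    squared = [(r - d) ** 2
               for ref_row, deg_row in zip(ref, deg)
               for r, d in zip(ref_row, deg_row)]
    return sum(squared) / len(squared)


def psnr(ref, deg, peak=255.0):
    """PSNR in dB (peak=255 assumes 8-bit samples); inf for identical frames."""
    err = mse(ref, deg)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


def weighted_mse(ref, deg, weights):
    """Squared error weighted per pixel and normalised by the total weight.

    The weights are hypothetical stand-ins for visibility factors.
    """
    num = 0.0
    den = 0.0
    for ref_row, deg_row, w_row in zip(ref, deg, weights):
        for r, d, w in zip(ref_row, deg_row, w_row):
            num += w * (r - d) ** 2
            den += w
    return num / den


def global_ssim(ref, deg, peak=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole frame (a simplification)."""
    x = [p for row in ref for p in row]
    y = [p for row in deg for p in row]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (k1 * peak) ** 2  # stabilises the luminance term
    c2 = (k2 * peak) ** 2  # stabilises the contrast/structure term
    return ((2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2))
```

In this sketch a weight of zero removes a pixel from the weighted error entirely, which shows how a visibility model could discount errors in regions where they are expected to be masked.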
The perceptual impact of PLI depends on factors such as position, size and duration of the error in the video, the sophistication of the recovery technique and the masking properties of the video. FR measures, such as SSIM and MSE, can be used to assess PLI effects on decoded video, but generally do not directly consider these PLI factors.
NR-P techniques tailored for PLI evaluation, such as slice boundary mismatch (SBM), attempt to measure PLI factors through modelling effects such as discontinuities in pixel rows. However, NR-P techniques suffer from not knowing the exact location of errors and having to rely on statistical models to discriminate between errored and unerrored portions of the picture. This can lead to inaccuracy due to misclassification of natural image variation as a possible error.
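The idea of detecting discontinuities between pixel rows can be illustrated with a toy Python sketch. This is not the SBM measure itself, whose definition is not given here. It is a hypothetical simplification that assumes a greyscale frame stored as a list of rows, a macroblock height of 16 rows, and an arbitrary threshold of three times the median row-to-row difference.

```python
import statistics


def row_differences(frame):
    """Mean absolute difference between each pair of adjacent pixel rows."""
    diffs = []
    for upper, lower in zip(frame, frame[1:]):
        total = sum(abs(a - b) for a, b in zip(upper, lower))
        diffs.append(total / len(upper))
    return diffs


def suspect_boundaries(frame, block=16, ratio=3.0):
    """Return row indices on macroblock boundaries with an unusually large step.

    Boundary i lies between rows i-1 and i. Both `block` and `ratio` are
    illustrative assumptions, not values taken from any published method.
    """
    diffs = row_differences(frame)
    threshold = ratio * statistics.median(diffs)
    return [i for i in range(block, len(frame), block)
            if diffs[i - 1] > threshold]
```

A detector of this kind flags any strong horizontal edge that happens to fall on a macroblock boundary, whether it was caused by a lost slice or is part of the scene, which is the misclassification risk described above.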
NR-H models typically use the errored bitstream to measure the error extent and have access to macroblock type and motion information to predict the limits of propagation of these errors. This error-specific information may be used to enhance the accuracy of FR and NR-P techniques.
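How macroblock type and motion information might bound error propagation can be sketched as follows. This is a deliberately simplified, hypothetical model rather than any particular NR-H method: each frame is assumed to be a dictionary mapping a macroblock position to its type ("I" or "P") and a motion vector in whole-macroblock units, every inter-coded macroblock is assumed to reference only the previous frame, and concealment is ignored.

```python
def propagate_errors(lost, later_frames):
    """Track which macroblocks are errored in each frame after a loss.

    lost: set of (x, y) macroblock positions lost in the first frame.
    later_frames: list of dicts {(x, y): (mb_type, (dx, dy))} for the
        following frames, in decoding order (a hypothetical layout).
    Returns a list of sets, one per frame, starting with the lossy frame.
    """
    errored = set(lost)
    history = [set(errored)]
    for frame in later_frames:
        next_errored = set()
        for (x, y), (mb_type, (dx, dy)) in frame.items():
            # Intra macroblocks do not depend on the previous frame, so
            # they stop propagation; inter macroblocks inherit the error
            # if the block they predict from was errored.
            if mb_type == "P" and (x + dx, y + dy) in errored:
                next_errored.add((x, y))
        errored = next_errored
        history.append(next_errored)
    return history
```

The per-frame sets returned by such a model could then be used as a mask, for example to restrict an FR error measure or an NR-P discontinuity check to regions known to be affected.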