In order to ensure a high degree of satisfaction for the user of video services such as non-interactive streaming video (IPTV, VoD), the perceived video quality of those services needs to be estimated. It is a major responsibility of the broadcast provider towards both content provider and customer to maintain the quality of its service. In large IPTV networks, only fully automated quality monitoring probes can fulfill this requirement.
To this end, video quality models are developed which provide estimates of the video quality as perceived by the user. Those models can, for instance, output the degree of similarity between the video received at the user's end and the original non-degraded video. In addition, and in a more sophisticated manner, the Human Visual System (HVS) can be modelled. Finally, the model output can be mapped to the results of extensive subjective quality tests, to ultimately provide an estimate of perceived quality.
Video quality models and thus measurement systems are generally classified as follows:
Quality Model Types
        Full Reference (FR): a reference signal is required.
        Reduced-Reference (RR): partial information extracted from the source signal is required.
        No-Reference (NR): no reference signal is required.
Input Parameters Types
        signal/media-based: the decoded image (pixel-information) is required.
        parameter-based: bitstream-level information is required. Information can range from packet-header information, requiring parsing of the packet-headers, parsing of the bitstream including payload, that is coding information, and partial or full decoding of the bitstream.
Type of Application
        Network Planning: the model or measurement system is used before the implementation of the network in order to plan the best possible implementation.
        Service Monitoring: the model is used during service operation.
Related information on the types of video quality models can be found in Refs. [1-3].
Several packet-based parametric video quality models have been described in the literature [4-6]. However, a major drawback of these models is that they do not take into account the quality impact of the content. In other words, and as reported in previous studies [7-12], the perceived video quality depends on the spatio-temporal characteristics of the video. For instance, packet loss is generally better concealed when there is no complex movement in the video, such as in news broadcasts. When there is no packet loss, and for low and medium bitrates, content with low spatio-temporal complexity achieves better quality than spatio-temporally complex content.
Further publications also aim at including the quality impact of the content in parameter-based video quality models, for both packet-loss and no-packet-loss cases, cf. Refs. [13a, 13b, 14, 15, 16].
For instance, in Refs. [13a, 13b, 14], the complexity of the contents is determined per video frame by comparing the current frame size with an adaptive threshold. Whether the current frame size is above, equal to or below this threshold will result in increasing or decreasing the estimated quality associated with the current frame. However, due to the use of a threshold value and the resulting three possibilities of being greater than, equal to or lower than this value, the method disclosed in these references only provides a relatively coarse consideration of the video content. In other words, there is no smooth or continuous measurement of the complexity of the frames within a given measurement window. Moreover, since the adaptive threshold is computed over all or part of the measurement window, the complexity of each frame is determined relative to the complexity of other frames in the same video sequence, but not relative to the complexity of other contents.
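The coarseness of such a threshold comparison can be illustrated with a minimal sketch. It assumes, purely for illustration, that the adaptive threshold is the mean frame size over the measurement window; Refs. [13a, 13b, 14] may compute it differently. The function name and the frame sizes are hypothetical.

```python
from statistics import mean


def classify_frames(frame_sizes):
    """Label each frame +1, 0 or -1 relative to a window threshold.

    Assumption (not taken from the cited references): the adaptive
    threshold is the mean frame size of the measurement window.
    """
    threshold = mean(frame_sizes)
    labels = []
    for size in frame_sizes:
        if size > threshold:
            labels.append(1)    # above threshold
        elif size < threshold:
            labels.append(-1)   # below threshold
        else:
            labels.append(0)    # equal to threshold
    return labels


# Hypothetical frame sizes in bytes for two different contents.
calm_content = [900, 1000, 1100]
busy_content = [9000, 10000, 11000]

print(classify_frames(calm_content))
print(classify_frames(busy_content))
```

Under this assumption, both windows receive identical labels although their frame sizes differ by a factor of ten. The sketch thus shows both limitations noted above: each frame yields only one of three outcomes, and complexity is judged only relative to the other frames of the same sequence.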
In Ref. [15], a solution is proposed for inserting content-related parameters, i.e. parameters which reflect the spatio-temporal complexity of the content such as quantization parameter and motion vectors, into a parameter-based video quality model. However, these content-related parameters cannot be extracted from an encrypted bitstream, so that Ref. [15] cannot be used in the same way as the present invention.
Ref. [16] presents a solution for estimating the perceived video quality in case of packet loss with a single parameter, which represents the magnitude of the signal degradation due to packet loss. This solution foresees the inclusion of a correction factor for adjusting the estimated magnitude of the signal degradation based on the temporal or spatio-temporal complexity of the content. However, no solution is proposed for computing this correction factor, for example in case of encrypted video.
Consequently, there is still a need for a method for estimating the perceived quality of a digital video signal. On the one hand, such a method should allow for a rather fine-grained consideration of the quality impact of the content of the video signal, and on the other hand it should also be applicable to encrypted video, covering coding degradation both with and without packet loss. There is likewise a need for an apparatus configured for performing a method with these features.