1. Field of the Invention
The invention relates to estimating the quality of images, and more particularly to estimating the quality of images in image sequences, especially video sequences, that have been digitally compressed with a loss of information.
2. Discussion of the Background
In recent years, various approaches have been developed to measure the quality of compressed pictures and video sequences. The outcome of this research is a variety of different algorithms, some of which are capable of determining a quality value that correlates highly with the perceived quality. This has been shown by comparison with the results of subjective tests. Basically, these algorithms can be divided into three groups.
A first group compares a distorted video signal to an undistorted reference signal. From the two signals the perceived difference is calculated using a model of the human visual system. Such a model either determines a “quality map” on a per-pixel basis or accumulates the error to obtain a single quality value for the whole picture. The peak signal to noise ratio (PSNR) is one example of such an algorithm, applying a very simple model of the human visual system. Other popular algorithms are the so-called Sarnoff Model, which calculates an error signal in units of “just noticeable differences” (JND), and the Structural Similarity (SSIM) analysis. Algorithms that use the undistorted signal as a reference are referred to as full-reference algorithms. However, they are applicable only in a few scenarios, because in most applications no undistorted (uncompressed) reference signal is available.
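As an illustration of the simplest full-reference measure mentioned above, the peak signal to noise ratio can be computed as follows. This is a minimal sketch in Python; the function name and the 8-bit peak value of 255 are assumptions for illustration, not part of the description above.

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak signal to noise ratio in dB between a reference and a
    distorted image of identical dimensions (illustrative helper)."""
    ref = np.asarray(reference, dtype=np.float64)
    dist = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((ref - dist) ** 2)      # mean squared error over all pixels
    if mse == 0.0:
        return float("inf")               # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A constant error of 10 gray levels on an 8-bit image, for example, gives a mean squared error of 100 and therefore a PSNR of 10·log10(255²/100) ≈ 28.1 dB, regardless of image content, which illustrates how crude the underlying visual model is.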
In order to reduce the amount of data needed for the calculation, some algorithms extract features from the reference signal and from the distorted signal. For a measurement of the picture quality, only these features then have to be considered instead of the whole reference sequence. These so-called reduced-reference algorithms can be used to evaluate picture quality at the receiver side of a transmission. However, their use requires an implementation of parts of the algorithm at the signal source and the transmission of side information in the data stream. Although such a system is technically feasible, it is desirable to determine the picture quality without having to extract information at the signal source.
A third group of algorithms tries to determine the picture quality only from the distorted picture and from side information present in the data stream. Most of these so-called no-reference algorithms are based on the detection of well-known coding artifacts. In particular, the blocking artifacts introduced by transform coders are used as a quality indicator. These artifacts appear periodically in the image and are therefore detectable, e.g. via a statistical analysis of an edge image at block boundaries. The majority of no-reference algorithms is based on the detection of blocking artifacts. Blocking indicators work well, for example, for MPEG-2 coded sequences. In H.264-coded sequences, however, the blocking artifacts are reduced by a deblocking filter which is part of the standard. Therefore, a detection of blocking artifacts does not yield reliable results for such sequences.
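A crude no-reference blocking indicator along the lines described above can be sketched by comparing the edge image at assumed block boundaries with the edge image elsewhere. This is an illustrative sketch only, not any specific published detector; the function name, the fixed 8-pixel block grid, and the use of horizontal gradients only are assumptions.

```python
import numpy as np

def blockiness(img, block=8):
    """Illustrative no-reference blocking indicator: ratio of the mean
    horizontal gradient at assumed block-grid columns to the mean
    gradient at all other columns. Values well above 1 suggest
    periodic blocking artifacts."""
    img = np.asarray(img, dtype=np.float64)
    grad = np.abs(np.diff(img, axis=1))            # horizontal edge image
    cols = np.arange(grad.shape[1])
    at_boundary = (cols % block) == (block - 1)    # columns crossing the grid
    boundary_mean = grad[:, at_boundary].mean()
    interior_mean = grad[:, ~at_boundary].mean()
    return boundary_mean / (interior_mean + 1e-12)
```

On a piecewise-constant “blocky” image this ratio becomes very large, while on a smooth gradient it stays near 1; for H.264 material, as noted above, the deblocking filter suppresses exactly the boundary gradients this statistic relies on.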
Another approach is the estimation of the coding error from the coded video sequence. In A. Ichigaya, M. Kurozumi, N. Hara, Y. Nishida, E. Nakasu, “A method of estimating coding PSNR using quantized DCT coefficients”, IEEE Trans. Circuits and Systems for Video Technology, Vol. 16, No. 2, pp. 251-259, February 2006, an algorithm for estimating the coding peak signal to noise ratio of MPEG-2 coded sequences is introduced. The distribution of the transform coefficients is estimated, and from these distributions the error variance is calculated. However, this approach is limited to the DCT (discrete cosine transform) as transformation function, and the accuracy of its quality estimates needs improvement.
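The underlying idea — deriving an error variance from an assumed distribution of the transform coefficients — can be sketched as follows. This is a simplified illustration, not the estimator of the cited paper: it assumes Laplacian-distributed coefficients and a uniform quantizer with mid-tread reconstruction, and evaluates the expected squared quantization error by numerical integration; the function names and parameters are hypothetical.

```python
import numpy as np

def laplacian_pdf(x, lam):
    """Zero-mean Laplacian density, a common model for AC transform
    coefficients (assumption for this sketch)."""
    return 0.5 * lam * np.exp(-lam * np.abs(x))

def quant_error_variance(lam, step, span=50.0, n=200001):
    """Expected squared error of a uniform quantizer with step size
    `step` applied to Laplacian-distributed coefficients, computed by
    a simple Riemann sum over [-span, span]."""
    x = np.linspace(-span, span, n)
    dx = x[1] - x[0]
    xq = step * np.round(x / step)        # quantized/reconstructed values
    err2 = (x - xq) ** 2                  # squared quantization error
    return np.sum(err2 * laplacian_pdf(x, lam)) * dx
```

For fine quantization the result approaches the classical high-resolution value step²/12, and it grows with the step size, so a decoder that knows the quantizer step and has estimated the coefficient distribution can predict the coding error variance — and hence a PSNR — without any reference signal.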