Digital images are subject to a wide variety of distortions during acquisition, processing, compression, storage, transmission and reproduction, any of which may result in a degradation of visual quality. For applications in which images are ultimately to be viewed by human beings, the most reliable method of quantifying visual image quality is through subjective evaluation. In practice, however, subjective evaluation is usually too inconvenient, time-consuming and expensive.
Objective image quality metrics aim to predict perceived image quality automatically. The simplest and most widely used quality metric is the mean squared error (MSE), computed by averaging the squared intensity differences between distorted and reference image pixels, along with the related quantity of peak signal-to-noise ratio (PSNR). However, both are known to correlate poorly with perceived visual quality. Over the past decades, a great deal of effort has gone into the development of advanced quality assessment methods, among which the structural similarity (SSIM) index achieves an excellent trade-off between complexity and quality prediction accuracy, and has become the most broadly recognized perceptual image/video quality measure among both academic researchers and industrial implementers.
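To make the distinction concrete, the MSE and PSNR computations described above can be sketched as follows (a minimal illustration using NumPy; the function names and the toy images are hypothetical, not from any particular codec):

```python
import numpy as np

def mse(reference, distorted):
    """Mean squared error: average of squared pixel intensity differences."""
    ref = reference.astype(np.float64)
    dist = distorted.astype(np.float64)
    return np.mean((ref - dist) ** 2)

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    err = mse(reference, distorted)
    if err == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / err)

# Toy example: a flat 8-bit image and a copy offset uniformly by +1.
ref = np.full((4, 4), 100, dtype=np.uint8)
dist = (ref + 1).astype(np.uint8)
print(mse(ref, dist))   # 1.0
print(psnr(ref, dist))  # 10*log10(255^2 / 1) ≈ 48.13 dB
```

Note that this uniform +1 offset is barely visible to a human observer, yet MSE assigns it the same score as any other distortion with unit average squared error, which illustrates why MSE/PSNR match perception poorly.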
In general, video coding involves finding the best trade-off between the data rate R and the allowed distortion D. Existing video coding techniques use the sum of absolute differences (SAD) or sum of squared differences (SSD) as the model for distortion D; both have been widely criticized in the literature for their poor correlation with perceptual image quality. There have also been attempts to define D based on SSIM and to develop rate-SSIM optimization methods for video coding.
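The rate-distortion trade-off above is typically expressed as minimizing a Lagrangian cost J = D + λR over the available coding modes. The sketch below illustrates this with a simplified single-window SSIM (no local sliding windows, luminance only) used to define D = 1 − SSIM; the function names, constants wiring, and mode-decision loop are illustrative assumptions, not the method of any specific encoder:

```python
import numpy as np

def ssim_global(x, y, max_val=255.0):
    """Simplified single-window SSIM with the standard stabilizing
    constants C1 = (0.01*L)^2 and C2 = (0.03*L)^2."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    C1 = (0.01 * max_val) ** 2
    C2 = (0.03 * max_val) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return (((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)) /
            ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)))

def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits

# Hypothetical mode decision: choose the candidate reconstruction
# minimizing J, with D defined perceptually as 1 - SSIM.
ref = np.tile(np.arange(8, dtype=np.float64) * 30, (8, 1))
candidates = {
    "coarse": (np.round(ref / 40) * 40, 64),    # heavier quantization, fewer bits
    "fine":   (np.round(ref / 10) * 10, 256),   # lighter quantization, more bits
}
lam = 0.001
costs = {name: rd_cost(1.0 - ssim_global(ref, rec), bits, lam)
         for name, (rec, bits) in candidates.items()}
best = min(costs, key=costs.get)
print(best, costs)
```

Here λ controls how many bits the encoder will spend to reduce perceptual distortion; replacing SSD with 1 − SSIM in J is the core idea behind the rate-SSIM optimization methods mentioned above.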
Thus, what is needed is an improved solution that addresses the limitations outlined above.