A key problem in the field of image processing is how to predict whether an observer will prefer one image over another. A key selling point of many imaging devices, such as cameras, scanners, printers and displays, is the subjective quality of the images that the devices are asserted to produce. As the technology in imaging devices advances, the images produced by these devices become larger and the computing resources required to process them increase, which is particularly onerous on low-power devices such as cameras and smart phones. When allocating resources to processing, storing, displaying or printing an image based on assumptions about the subjective quality of the image, it would be advantageous to use a reliable method to quantify the increase in subjective quality (also referred to as observer preference or observer preference score) between the original image of interest (also referred to as the target image) and the transformed image expected to be obtained from processing the original image.
From a terminology perspective, a “preference measure” is a measure of the degree to which an image will be selected over another image (or other images) based on desirable characteristics. In the aforementioned example, the desirable characteristic being measured is “subjective quality”. The preference measure measures the degree to which one image is preferred to another based on subjective quality. The “value” of the preference measure is the quantitative value that is determined for the measure in question. This might, for example, be a numerical score of 6 in a possible range of 1-10. In this description, the value of the preference measure in question is also referred to as the observer preference or the observer preference score.
There are many methods known in the prior art for predicting observer preference between two images where one has undergone a transform. Preference scores based on signal processing techniques such as Mean Square Error (MSE) and Peak Signal-to-Noise Ratio (PSNR) have been shown to have some ability to predict observer preference, but fail in many cases to make reliable predictions. There exist known methods that analyse the structural content of images and estimate observer preference based on the notion that observers prefer structural details to be preserved or enhanced in the transformed image. Other methods attempt to use machine learning to predict observer preference based on features extracted from the original image, the transformed image or both images. Yet other methods attempt to partition the problem of estimating observer preference for an entire image into a set of sub-problems, each attempting to estimate observer preference for a sub-aspect of the image. For example, sub-aspects of an image might include the relative quality of colour reproduction between the original and the transformed image, the quality of brightness reproduction, the quality of detail preservation and the presence or absence of artifacts in the transformed image. By merging the observer preference scores for each of the sub-aspects, a prediction of the overall observer preference score for the transformed image is obtained.
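As an illustration of the signal processing measures mentioned above, the following is a minimal sketch of the standard MSE and PSNR definitions applied to an original and a transformed image. It assumes greyscale images represented as equally sized lists of pixel rows and a peak pixel value of 255; the function names `mse` and `psnr` are chosen here for illustration and are not taken from any particular prior art method.

```python
import math


def mse(original, transformed):
    """Mean square error between two equally sized greyscale images.

    Each image is assumed to be a list of rows of numeric pixel values.
    """
    if len(original) != len(transformed) or any(
        len(row_o) != len(row_t) for row_o, row_t in zip(original, transformed)
    ):
        raise ValueError("images must have the same dimensions")

    total = 0.0
    count = 0
    for row_o, row_t in zip(original, transformed):
        for pixel_o, pixel_t in zip(row_o, row_t):
            total += (pixel_o - pixel_t) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return total / count


def psnr(original, transformed, max_value=255.0):
    """Peak signal-to-noise ratio in decibels.

    max_value is the assumed peak pixel value (255 for 8-bit images).
    Identical images have zero error, so the ratio is reported as infinity.
    """
    error = mse(original, transformed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)
```

Both functions treat every pixel difference equally regardless of where it falls in the image or whether an observer could see it, which is consistent with the observation above that such measures often fail to predict observer preference reliably.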
There also exist methods that attempt to quantify observer preference based on considerations of human cognition and vision. For example, recent methods have emerged that make use of salience and content masking information. Salience relates to image content within an image that attracts a user's attention. Some methods for producing salience maps represent salience information in the form of a “heat map” that indicates the regions of an image that are likely to be more closely scrutinized by observers. It is widely accepted that differences between an original and a transformed image have more of an effect on observer preference when these differences fall in salient regions of the image. Content masking (also termed “visual difference” in the prior art) technology takes advantage of the observation that not all changes to an image are noticeable by human observers. Changes to an image that are not noticeable are not likely to affect observer preference for a transformed image. Content masking techniques produce a content masking map that indicates the changes (i.e. differences) between the original and transformed images, arising from application of the transform to the target image, that observers will probably be physically able to see (i.e. differences that are visually perceptible). The map also indicates how strong the differences will probably appear to the observer, i.e. the strength of the perceptibility. Stronger differences are more likely to affect observer preference than weaker differences. Recent methods to predict observer preference combine information extracted from salience maps with information from content masking maps to create a prediction of observer preference.
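The prior art methods referred to above are not specified in detail here, so the following is only a hypothetical sketch of one way salience and content masking information could be combined. It assumes both maps are equally sized lists of rows with values between 0 and 1, and pools them as a salience-weighted mean of perceptible difference strength. The function name `salience_weighted_difference` and the pooling rule are assumptions made for illustration, not a description of any particular known method.

```python
def salience_weighted_difference(salience_map, masking_map):
    """Hypothetical pooling of a salience map and a content masking map.

    salience_map: per-pixel likelihood (0..1) that observers scrutinize the pixel.
    masking_map:  per-pixel strength (0..1) of the visually perceptible difference
                  between the original and transformed images.
    Returns the salience-weighted mean difference strength (0..1).
    """
    if len(salience_map) != len(masking_map) or any(
        len(row_s) != len(row_m) for row_s, row_m in zip(salience_map, masking_map)
    ):
        raise ValueError("maps must have the same dimensions")

    weighted_sum = 0.0
    salience_sum = 0.0
    for row_s, row_m in zip(salience_map, masking_map):
        for s, m in zip(row_s, row_m):
            weighted_sum += s * m  # a difference counts more where salience is high
            salience_sum += s
    if salience_sum == 0:
        return 0.0  # assumed convention: no salient content, no weighted difference
    return weighted_sum / salience_sum
```

Under this assumed rule, a perceptible difference of a given strength contributes more to the result when it falls in a salient region than when it falls elsewhere. The result measures only how much salient, visible change is present; it does not by itself say whether observers would prefer the transformed image.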
However, the prior art methods to quantify observer preference fail to perform well in many circumstances.