In computer graphics, image scaling is the process of resizing a digital image. Given the proliferation of images and the variety of media on which they are displayed (mobile phones, PDAs, printers, packaging, etc.), image resizing occurs frequently.
Historically, cropping and scaling (downsampling) have been used to shrink images, and upsampling has been used to enlarge them. Cropping works reasonably well for shrinking images if there is only one region of interest in the image. Scaling works reasonably well for shrinking images containing low-frequency information. However, scaling is of limited value because it is applied uniformly, so the resulting loss of image information is spread across the entire image regardless of content. With proper region identification, cropping may be preferred over naïve scaling in applications such as the generation of thumbnail images because the resultant images are more recognizable. Naïve cropping can be problematic because contextual information that is important to the viewer may be cropped away.
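The difference between the two classical operations can be illustrated with a minimal Python sketch. The function names, the representation of an image as a list of rows of grayscale values, and the choice of block averaging as the downsampling filter are all assumptions made for illustration; they are not taken from any particular system.

```python
def downscale_by_averaging(image, factor):
    """Shrink a 2-D grayscale image by an integer factor using block averages.

    Every factor-by-factor block is replaced by its mean, so detail is
    lost uniformly across the whole image. (Hypothetical helper.)
    """
    height, width = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / (factor * factor)
            for c in range(width)
        ]
        for r in range(height)
    ]


def crop(image, top, left, height, width):
    """Keep only the rectangle starting at (top, left); the rest is discarded.

    Pixels inside the rectangle are untouched, but all surrounding
    context is lost. (Hypothetical helper.)
    """
    return [row[left:left + width] for row in image[top:top + height]]
```

Under these assumptions, downscaling a smooth image loses little, whereas a fine checkerboard such as `[[0, 10], [10, 0]]` collapses to a single uniform value when downscaled by a factor of 2. Cropping keeps the selected pixels exactly but removes everything outside the chosen rectangle.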
Image scaling is a non-trivial process that involves a trade-off between efficiency, smoothness and sharpness. As the size of an image is increased, the pixels which comprise the image become increasingly visible, making the image appear soft. Apart from fitting a smaller display area, image size is most commonly decreased in order to produce thumbnails. Enlarging an image is less common because zooming cannot reveal any more information than that which already exists in the image, and image quality tends to suffer.
Classical methods for image resizing, such as cropping and scaling, do not take into account the content of the image to be resized. Such methods are prone to distorting content that may be important to the viewer. In order to preserve regions of the image which may be visually important to the viewer while eliminating the less important ones, image resizing techniques need to be more content-aware. The few existing methods that attempt content-based image resizing are typically based on geometric operators which seek to preserve key geometric components of the image. While geometry is important, image entropy also plays a key role in the perception of image content. Content-aware resizing schemes based on geometric operators alone tend to ignore this important aspect.
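The text does not specify how image entropy is to be measured. One common reading, assumed here purely for illustration, is the Shannon entropy of the intensity histogram in a small window around each pixel. The function name, the `radius` parameter and the list-of-rows image representation in the sketch below are hypothetical.

```python
import math
from collections import Counter


def local_entropy_map(image, radius=1):
    """Shannon entropy (in bits) of the intensities in a window around each pixel.

    Assumption: entropy is taken over the local intensity histogram;
    the window is clipped at the image border.
    """
    height, width = len(image), len(image[0])
    entropy = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Collect the intensities inside the (clipped) window.
            window = [
                image[rr][cc]
                for rr in range(max(r - radius, 0), min(r + radius + 1, height))
                for cc in range(max(c - radius, 0), min(c + radius + 1, width))
            ]
            counts = Counter(window)
            total = len(window)
            entropy[r][c] = -sum(
                (n / total) * math.log2(n / total) for n in counts.values()
            )
    return entropy
```

With this definition, a uniform region has zero entropy, while a region whose window contains two intensities in equal proportion has an entropy of one bit. A gradient operator alone could assign high values to a single strong edge yet say little about how varied the surrounding content is.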
The effect that a distortion has on overall picture quality is known to depend strongly on its location with respect to scene content. Knowledge of a scene is obtained through regular eye movements that reposition the area under foveal view. Early vision models assume an “infinite fovea”, i.e., the scene is processed under the assumption that all areas are viewed by the high-acuity fovea. However, studies of eye movements indicate that viewers do not foveate all areas in a scene equally. Instead, a few areas are identified as regions of interest (ROIs) by human visual attention processes, and viewers tend to return repeatedly to these ROIs rather than to other areas that have not yet been foveated. The fidelity of the picture in these ROIs is known to have the strongest influence on overall picture quality. Knowledge of human visual attention and eye movements, coupled with the selective and correlated eye movement patterns of subjects viewing natural scenes, provides a framework for the development of computational models of human visual attention. Techniques for determining visually important areas in an image use importance maps. Importance maps are generated by combining factors known to influence human visual attention and eye movements. A commonly used technique for building an importance map is to compute a gradient map of the image using a gradient operator. The magnitude of the gradient is a popular measure of local image geometry. See: Digital Image Processing, Gonzalez and Woods, Prentice Hall, p. 425 (2002).
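A gradient-based importance map of the kind described above can be sketched as follows. The choice of the Sobel operator, the replication of edge pixels at the image border, and the function name are assumptions made for illustration; the text names no specific gradient operator.

```python
import math

# Sobel kernels: one common gradient operator (an assumption in this sketch).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_importance_map(image):
    """Return the per-pixel gradient magnitude of a 2-D grayscale image."""
    height, width = len(image), len(image[0])

    def pixel(r, c):
        # Replicate edge pixels so the map has the same size as the image.
        r = min(max(r, 0), height - 1)
        c = min(max(c, 0), width - 1)
        return image[r][c]

    importance = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    value = pixel(r + dr, c + dc)
                    gx += SOBEL_X[dr + 1][dc + 1] * value
                    gy += SOBEL_Y[dr + 1][dc + 1] * value
            # Magnitude of the gradient vector (gx, gy).
            importance[r][c] = math.hypot(gx, gy)
    return importance
```

In this sketch, pixels in flat regions receive an importance of zero, while pixels adjacent to an intensity edge receive large values, so a resizing method guided by the map would tend to preserve edges and remove flat areas first.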
Accordingly, what is needed in this art are increasingly sophisticated methods for digital image resizing which combine importance maps generated by different operators to take advantage of various quality characteristics produced by the differing operators in an image processing or document reproduction environment.