This invention relates to image resizing. More particularly, this invention relates to image resizing in which ringing artifacts are suppressed.
An image is a stored description of a graphic picture (e.g., video, text, etc.) and is often described as a set of pixels having brightness and color values. A pixel (i.e., a picture element) is generally one spot in a grid of spots that form the image.
Resizing an image involves an alteration in the pixel representation of that image. For example, to reduce the size of an image, fewer pixels are used. To enlarge the size of an image, more pixels are used. Image resizing may also include altering pixel brightness and color values. Resizing a digital representation of an image may be accomplished using a digital filter to change the sampling density of the image. Thus, resizing and “re-sampling” can be considered the same process.
A signal can be re-sampled more efficiently in the digital domain than in the analog domain; the analog approach involves converting the digital representation of an image into the analog domain, filtering, and then converting back to the digital domain. An input sampling grid can be used to describe the positions of the samples or pixels. It is commonly assumed that {0,0} is the {x,y} coordinate of the top left pixel of an image. An increase in an x-axis value indicates a move to the right in the image, while an increase in a y-axis value indicates a move down the image. While other coordinate systems may also be valid, this system historically follows the standard scan order of a television screen, computer monitor, or other comparable display. A particular output pixel can be generated at a certain position in the input grid. For example, position {5.37,11.04} means 37/100 of the distance from column 5 (left) to column 6 (right) in the x-axis, and 4/100 of the distance from row 11 (above) to row 12 (below) in the y-axis. Along each axis, the position is said to have an input pixel index part (i.e., the largest integer index less than or equal to the coordinate, e.g., 5 for 5.37) and a fractional part (e.g., 37/100) corresponding to the fraction of the distance from that index to the next pixel index along the axis.
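The index/fraction split described above can be illustrated with a minimal Python sketch. The function name split_position is hypothetical and chosen only for this illustration.

```python
import math

def split_position(pos):
    """Split a position on the input grid into (index, fraction).

    index    : largest integer pixel index less than or equal to pos
    fraction : distance from that index toward the next pixel index (0 <= fraction < 1)
    """
    index = math.floor(pos)
    return index, pos - index

# The example position {5.37, 11.04} from the text, one axis at a time.
x_index, x_frac = split_position(5.37)    # column 5, about 37/100 toward column 6
y_index, y_frac = split_position(11.04)   # row 11, about 4/100 toward row 12
```

Because the coordinates are stored as binary floating-point values, the fractional parts are only approximately 0.37 and 0.04.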
Re-sampling two-dimensional image data can often be simplified by performing axis-separable processing. In other words, re-sampling can occur along each axis independently of the other. If high quality re-sampling is performed along each axis, then the overall re-sampling quality after combining processing from both axes should also be of high quality. In hardware, each re-sampling implementation can be based on processing pipelines, so a separate re-sampling system can be used for each axis for better performance. In software, one subroutine may be able to serve both axes.
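As a rough software illustration of axis-separable processing, the sketch below reuses one 1-D routine for both axes. Linear interpolation and the endpoint-aligned sampling grid are assumptions made only to keep the example short. They are not the re-sampling method discussed in this document, and the names resample_1d and resample_2d are hypothetical.

```python
import math

def resample_1d(samples, out_len):
    """Resample a 1-D sequence to out_len samples by linear interpolation.

    Assumption: first and last output samples align with the first and
    last input samples.
    """
    n = len(samples)
    if out_len == 1 or n == 1:
        return [samples[0]] * out_len
    step = (n - 1) / (out_len - 1)
    out = []
    for k in range(out_len):
        pos = k * step
        i = min(math.floor(pos), n - 2)   # input pixel index part
        t = pos - i                       # fractional part
        out.append((1 - t) * samples[i] + t * samples[i + 1])
    return out

def resample_2d(image, out_rows, out_cols):
    """Resize a 2-D image (list of rows) one axis at a time."""
    # x-axis pass: resample every row independently.
    rows = [resample_1d(row, out_cols) for row in image]
    # y-axis pass: transpose, resample every column, transpose back.
    cols = [resample_1d(list(col), out_rows) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

small = [[0, 2],
         [4, 6]]
large = resample_2d(small, 3, 3)
```

The same subroutine serves both passes, which mirrors the software case described above. A hardware design would instead dedicate a pipeline to each axis.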
As is well known in the art, ringing artifacts may occur in linear filtering operations that attempt to maintain frequency response up to a finite level. Such a filter applied to an edge causes ringing on the output near either side of the edge. The ringing is visible on a filtered image as an intensity rippling, that is, a variation in the intensity of a displayed image as a function of the distance from the feature causing the ringing. Intensity rippling is most visible around features such as impulses or transitions (i.e., steps from one level to another). An impulse can be described as a jump or spike with respect to neighboring input samples, where the width of the spike is extremely narrow.
The requirements for re-sampling text and graphics images are different from those for video images. In the former case, sharp edges and the absence of ringing artifacts on transitions are important.
In contrast, for some images, the preservation of spectral content in some regions is important. For other regions, behavior closer to that of graphics and text is ideal. For example, an image may consist of single sine-wave components at each position with uniform and smooth frequency changes as a function of position (such as found in typical “zone-plate” test patterns, which are video test signals in which the spatial frequencies are a smooth function of {x,y} position). Spectral preservation works only when re-sampling such images. Transitions between objects, however, are not spectral in nature, and a text-like interpretation may be more appropriate for them.
When digitally processing an image, it is often desirable to be able to reduce the size of images without introducing artifacts in the reduced image. Artifacts are image content that visibly alters the appearance of the original image. Artifacts can be present in a variety of image types. Ideally, a pleasing image appearance should be maintained when reducing an image (i.e., decimating the pixel data), even though some artifacts may still occur.
Decimation filters are used for image reduction and are usually designed from a spectral perspective. However, filters designed spectrally require many taps, which are very expensive to implement. Moreover, such designs yield filters with negative coefficients, which may produce ringing artifacts on sharper transitions. As is well known in the art, better spectral re-sampling is possible when more input points are used to create each output sample. The resulting longer FIR (finite impulse response) filters, however, require more computation, which can increase costs or restrict throughput (output pixels per second generated).
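The link between negative coefficients and ringing can be seen in a small numerical sketch. The 4-tap filter used here, with taps (-1/16, 9/16, 9/16, -1/16), is chosen only as a familiar illustration of a short filter with negative outer taps; it is an assumption for this example and not a filter taken from this document. The helper name fir_filter is likewise hypothetical.

```python
def fir_filter(samples, taps):
    """Apply an FIR filter to every full window of the input (no padding)."""
    n = len(taps)
    return [sum(c * s for c, s in zip(taps, samples[i:i + n]))
            for i in range(len(samples) - n + 1)]

# Illustrative taps that sum to 1 and have negative outer coefficients.
taps = [-1 / 16, 9 / 16, 9 / 16, -1 / 16]

# A sharp transition (step) from level 0 to level 1.
step = [0.0] * 4 + [1.0] * 4

out = fir_filter(step, taps)
# out dips below 0 just before the transition and rises above 1 just
# after it, even though every input sample lies in [0, 1].
```

The undershoot and overshoot on either side of the step are the intensity rippling described earlier. A filter whose coefficients are all non-negative and sum to 1 cannot produce values outside the input range, but it typically has a softer frequency response.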
It is possible to create low-artifact re-sampled images using only four samples of history in each axis and a cubic model. The cubic model is obtained from the gradients (e.g., g(0) and g(1)) of the sampled signal co-sited respectively with the inner two of the four input samples (e.g., f(−1), f(0), f(1), f(2)). The gradients are calculated from the weighted sum of neighboring input samples. Co-sited refers to two corresponding values that have the same index or are at the same position. For example, f(0) can be co-sited with g(0) (gradient g(0) is calculated using f(0) and its neighboring input samples), and similarly, f(1) can be co-sited with g(1). The four known values (e.g., f(0), g(0), f(1), g(1)) are then used to obtain the cubic coefficients. The two inner input samples surround the generated output sample, and each gradient corresponds to one of the two inner input samples. From the estimated gradients, a model is created, which is then used to calculate the output values of the resized image at a fractional position.
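A minimal sketch of the four-sample cubic model follows. The text states only that each gradient is a weighted sum of neighboring input samples. The central-difference weights used below are an assumption made for illustration, and the function name cubic_from_four is hypothetical. The cubic p(t) = a*t^3 + b*t^2 + c*t + d is fitted so that p(0) = f(0), p'(0) = g(0), p(1) = f(1), and p'(1) = g(1), where t is the fractional position between the two inner samples.

```python
def cubic_from_four(f_m1, f0, f1, f2, t):
    """Evaluate a cubic model at fractional position t (0 <= t <= 1)
    between the inner samples f0 and f1.

    f_m1, f0, f1, f2 correspond to f(-1), f(0), f(1), f(2) in the text.
    """
    # Gradients co-sited with f(0) and f(1).
    # Assumption: central differences stand in for the weighted sum.
    g0 = (f1 - f_m1) / 2
    g1 = (f2 - f0) / 2

    # Cubic coefficients from the four known values f0, g0, f1, g1.
    a = 2 * f0 - 2 * f1 + g0 + g1
    b = -3 * f0 + 3 * f1 - 2 * g0 - g1
    c = g0
    d = f0

    # Horner evaluation of a*t**3 + b*t**2 + c*t + d.
    return ((a * t + b) * t + c) * t + d

# On a linear ramp the model reproduces the ramp, so the fractional
# position 0.37 between samples 1 and 2 gives approximately 1.37.
value = cubic_from_four(0.0, 1.0, 2.0, 3.0, 0.37)
```

With these particular gradient weights the model coincides with the well-known Catmull-Rom interpolator. Other weightings of the neighboring samples would give different gradients and therefore a different cubic through the same two inner samples.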
Up-sampling (i.e., enlarging) an image is often accompanied by an attempt to sharpen that image in order to make up for deficiencies in the high frequency responses of up-sampling filters. When sharpening an image, maintaining zone-plate frequency response and transition quality is important. The four-sample approach described above may encounter difficulty in maintaining image sharpness if the four sample points represent a more complex feature (e.g., the four points do not represent a single sine wave).
In view of the foregoing, it would be desirable to provide an economical approach for effectively detecting and suppressing ringing artifacts in an image resizing process.
It would also be desirable to provide improved image sharpening when up-sampling an image in an image resizing process.