This invention pertains generally to digital image processing, and more particularly to a digital image downsampling method and apparatus.
Digital images represent an input scene as a collection of pixel values, typically laid out in a grid of rows and columns. Each pixel value in the image grid represents the greyshade or color of the image sampled at that location, with a variety of pixel formats possible.
Variations in intensity across an image can be described in terms of their spatial frequency, a measure of how quickly intensity variations occur as one traverses an image. The bar graph of FIG. 1 shows a hypothetical example of pixel intensity values for one row of an image. The predominant variation in intensity from column 0 to column 17 is a low-frequency variation of about one cycle every 15 pixels. From column 18 to column 38, the predominant variation is of much higher frequency, about one cycle every four pixels. Finally, columns 39 through 44 show spatial frequencies near one cycle every two pixels (0.5 cycles/pixel), the highest spatial frequency that can be represented on the digitally sampled image. A 0.5 cycles/pixel spatial frequency is most often observed in computer-generated graphics with sharp edges, such as text.
Digital image display devices come in a variety of spatial resolutions. For instance, a super video graphics array (SVGA) display format on a personal computer uses a grid 800 pixels wide by 600 pixels high. And an extended graphics array (XGA) display format uses a grid 1024 pixels wide by 768 pixels high.
When an image created at one spatial resolution must be displayed on a device of a lower resolution, the image may be too large for the low-resolution device. In such a case, two display choices are available.
With the first display choice, a user can choose to view a portion of the image. For instance, a user could view 61% of the pixel area from an XGA-size image on an SVGA-size display. To view the remainder of the pixels, the user would have to scroll the image both horizontally and vertically, causing some of the original pixels to scroll off screen. This method is generally objectionable. To read text, a user would have to shift the image back and forth as they read each line of text. And many graphics lose their impact if they can only be viewed piecemeal.
The second display choice is to downsample the image. Downsampling produces an image that fits a desired resolution by removing pixels from the original image. For instance, a 1024×768 XGA image can be downsampled to an 800×600 SVGA image by removing 224 of the image columns and 168 of the image rows. The advantage of this method is that the entire image can be viewed at once. But the disadvantage is that removing rows and columns of the original image throws away information.
Perhaps the simplest method of downsampling is to just delete rows and columns with a spacing determined by the downsample ratio. With such a method, the number of rows that need to be removed, and the spacing needed between them, are computed. A first row is deleted, and then rows separated from the first row by multiples of the deletion spacing are likewise deleted. The same algorithm then proceeds to remove columns in like fashion.
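By way of illustration only, the deletion method just described may be sketched as follows. The function names `deletion_indices` and `delete_downsample`, and the choice of a fractional spacing truncated to an integer index, are assumptions made for this sketch and are not taken from any particular prior art implementation.

```python
def deletion_indices(n_in, n_out):
    """Return evenly spaced indices of the rows (or columns) to delete."""
    if not 0 < n_out <= n_in:
        raise ValueError("need 0 < n_out <= n_in")
    n_delete = n_in - n_out
    if n_delete == 0:
        return []
    # Deletion spacing implied by the downsample ratio (may be fractional).
    spacing = n_in / n_delete
    # The first deletion is at index 0; later ones lie at multiples of the spacing.
    return [int(k * spacing) for k in range(n_delete)]


def delete_downsample(image, out_rows, out_cols):
    """Downsample a list-of-rows image by deleting whole rows and columns."""
    drop_rows = set(deletion_indices(len(image), out_rows))
    drop_cols = set(deletion_indices(len(image[0]), out_cols))
    return [
        [px for c, px in enumerate(row) if c not in drop_cols]
        for r, row in enumerate(image)
        if r not in drop_rows
    ]


# XGA to SVGA: 224 columns and 168 rows are removed.
xga_to_svga_cols = deletion_indices(1024, 800)
xga_to_svga_rows = deletion_indices(768, 600)
```

Because the spacing is never less than one, the truncated indices are distinct, so exactly the required number of rows and columns is removed.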
Although the deletion method produces acceptable results for low-detail imagery, it may render highly detailed imagery, particularly graphics, virtually unintelligible. Consider the case of downsampling the image row of FIG. 1 by a factor of two, by deleting the even-numbered columns, to produce the image row of FIG. 2 (with pixels shown twice as wide to match the width of FIG. 1). In the low-frequency portion of the original image row (columns 0-17), deletion preserved the overall shape of the intensity variation. In the medium-frequency portion (columns 18-38), deletion removed most of the intensity variation. And in the high-frequency portion of the row (columns 39-44), the intensity variation was obliterated by deletion. The poor results on columns 18-44 are due to aliasing, i.e., artifacts resulting from sampling material containing spatial frequencies too high for the desired sample rate.
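The effect can be reproduced on small synthetic rows. The following sketch is illustrative only; the pixel values are invented for the example and are not the values plotted in FIG. 1, and the helper name `delete_every_other` is hypothetical.

```python
def delete_every_other(row):
    """Factor-of-two deletion: drop pixels 0, 2, 4, ... and keep the rest."""
    return row[1::2]


low = [0, 32, 64, 96, 128, 160, 192, 224]  # slow ramp (low frequency)
medium = [0, 128, 255, 128] * 2            # one cycle every four pixels
high = [0, 255] * 4                        # one cycle every two pixels

low_out = delete_every_other(low)        # ramp shape survives
medium_out = delete_every_other(medium)  # variation collapses to a constant
high_out = delete_every_other(high)      # variation collapses to a constant
```

In this contrived case the ramp keeps its shape, while both the four-pixel and two-pixel patterns are reduced to a constant intensity, so the detail they carried is lost.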
This aliasing effect can be observed on actual imagery. FIG. 5 shows a snapshot image of a computer-generated graphics window. FIG. 6 shows the same snapshot image after deletion downsampling by a factor of 0.781 (corresponding to an XGA to SVGA conversion). Much of the text in the image has been rendered unintelligible due to aliasing (for instance, the "l" has been dropped from the word "File" near the upper left hand corner, leaving the non-word "Fie" in its place).
Aliasing effects can be reduced in downsampled imagery by pre-filtering the image to reduce high-frequency content. The image row of FIG. 3 was produced by averaging pixels from FIG. 1 in pairs, and replacing both pixels with a single pixel containing their average. Note that this method kept some of the medium-spatial-frequency energy intact. But the high-frequency light-to-dark variations were merged into a single pulse.
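The pairwise averaging described above may be sketched as follows, for illustration only. The function name `average_pairs` and the handling of a trailing unpaired pixel (it is passed through unchanged) are assumptions of this sketch.

```python
def average_pairs(row):
    """Replace each adjacent pair of pixels with one pixel holding their mean."""
    out = []
    for i in range(0, len(row) - 1, 2):
        out.append((row[i] + row[i + 1]) / 2)
    if len(row) % 2:
        # Assumption: an unpaired final pixel is kept as-is.
        out.append(row[-1])
    return out


medium = [0, 0, 255, 255] * 2  # one cycle every four pixels
high = [0, 255] * 4            # one cycle every two pixels

medium_avg = average_pairs(medium)  # light/dark alternation is retained
high_avg = average_pairs(high)      # alternation merges into uniform grey
```

Here the four-pixel pattern still alternates after averaging, whereas the two-pixel pattern is flattened to a single mid-grey level, which corresponds to the blur seen around fine detail.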
On actual imagery, pre-filtering produces a blurred or "fuzzy" appearance around high-frequency areas such as text. FIG. 7 shows the image of FIG. 5 after downsampling by a factor of 0.781, using an interpolation filter. Although the content of the text can generally be made out, the text's blurriness strains the eyes.
Prior art downsampling methods were generally designed to work with adequately sampled images of natural scenes. Such images generally contain few areas of high-frequency information and are amenable to blurring-downsampling operations. Computer graphics, unlike natural scenes, may convey most of their information as high-frequency information. And unlike digitized images of natural scenes, downsampling of computer graphics can often be avoided by recreating the graphical image at a different resolution. But if a computer graphics image cannot be recreated at the new resolution for some reason (e.g., generating program not available, two output devices for the same image), downsampling must be employed. Prior art downsampling methods generally produce visually unacceptable results on computer-graphics and/or high-frequency-content images.
The present invention describes image downsampling with preservation of high-spatial-frequency information. Many images that contain graphics have both high-frequency and low-frequency regions, e.g., text and background. The high-frequency areas of graphics often contain redundant pixels, i.e., those with many neighboring pixels of the same intensity or color. The present invention seeks to maintain high-frequency regions by downsampling the image non-uniformly, with a preference for avoiding downsampling in high-frequency image regions. When a high-frequency region must be downsampled, the high-frequency region's redundant pixels are preferably removed. The non-uniform downsampling preserves high-spatial-frequency regions with minimal aliasing and without introducing blur.
In one aspect of the invention, a method of downsampling a digital image is disclosed. Expressed most succinctly, the method comprises selecting a deletion path through an image using a deletion path metric that favors a path through low relative spatial frequency areas of an image, and then deleting the pixels lying along the deletion path. If some pixels along a selected path have relatively high spatial frequency, that region of the image may optionally be low-pass filtered prior to pixel deletion.
Expressed in more detail, the method comprises the following steps. The method calculates spatial frequency for groups of adjacent pixels on the digital image. The method then uses spatial frequency to create scores for different potential deletion paths through the digital image. A deletion path having the most favorable score is selected from among the potential paths, and pixels along that path are deleted. Preferably, the method recurses until a desired downsampling is reached.
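The steps above may be sketched, purely by way of example, as follows. Several details here are assumptions that this summary does not specify: the spatial frequency estimate is taken to be the sum of absolute differences between a pixel and its left and right neighbors, the path score is the plain sum of those estimates, paths run top to bottom and may shift by at most one column per row, and the lowest-scoring path is found by dynamic programming. All function names are hypothetical.

```python
def local_activity(image):
    """Assumed spatial-frequency proxy: |difference| to left and right neighbors."""
    cols = len(image[0])
    act = []
    for row in image:
        line = []
        for c, px in enumerate(row):
            left = row[c - 1] if c > 0 else px
            right = row[c + 1] if c < cols - 1 else px
            line.append(abs(px - left) + abs(right - px))
        act.append(line)
    return act


def best_vertical_path(act):
    """Return, per row, the column of the lowest-total-activity connected path."""
    rows, cols = len(act), len(act[0])
    cost = [act[0][:]]
    back = []
    for r in range(1, rows):
        cost_row, back_row = [], []
        for c in range(cols):
            neighbors = [p for p in (c - 1, c, c + 1) if 0 <= p < cols]
            prev = min(neighbors, key=lambda p: cost[r - 1][p])
            cost_row.append(act[r][c] + cost[r - 1][prev])
            back_row.append(prev)
        cost.append(cost_row)
        back.append(back_row)
    # Trace back from the most favorable end point in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    path = [c]
    for r in range(rows - 1, 0, -1):
        c = back[r - 1][c]
        path.append(c)
    path.reverse()
    return path


def delete_path(image, path):
    """Delete one pixel per row, at the column given by the path."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, path)]


def downsample_columns(image, target_cols):
    """Repeat estimate / score / select / delete until the target width is reached."""
    while len(image[0]) > target_cols:
        image = delete_path(image, best_vertical_path(local_activity(image)))
    return image


# Flat background on the left, fine alternating detail on the right.
sample = [[10, 10, 10, 10, 0, 255, 0, 255] for _ in range(3)]
narrowed = downsample_columns(sample, 5)
```

In this example the three deleted columns all come from the flat background, and the alternating detail on the right is left intact. Vertical downsampling could be handled the same way on the transposed image.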
Several factors preferably contribute to a potential deletion path's score. First, the absence of high spatial frequency edges along and parallel to a path weighs heavily in favor of that path. The relative absence of high spatial frequency edge crossings orthogonal to the path also favors that path. Straight paths are also generally favored over crooked paths. And path proximity to a desired path location may also favor a path.
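One way these four factors might be combined into a single score is sketched below, for illustration only. The weighted-sum form, the weight values, the convention that a lower score is more favorable, and the input maps `along_edge` and `cross_edge` (per-pixel strengths of edges parallel to and crossing a vertical path, respectively) are all assumptions of this sketch rather than values given in this summary.

```python
def path_score(path, along_edge, cross_edge, desired_col,
               w_along=4.0, w_cross=1.0, w_bend=0.5, w_offset=0.1):
    """Score a top-to-bottom path; path[r] is the column deleted in row r.

    Lower is more favorable. All weights are hypothetical.
    """
    score = 0.0
    for r, c in enumerate(path):
        score += w_along * along_edge[r][c]       # edges along/parallel to the path (heaviest)
        score += w_cross * cross_edge[r][c]       # edges crossed by the path
        score += w_offset * abs(c - desired_col)  # distance from the desired location
        if r > 0:
            score += w_bend * abs(c - path[r - 1])  # crookedness
    return score


flat = [[0, 0, 0] for _ in range(3)]
straight = path_score([1, 1, 1], flat, flat, desired_col=1)
crooked = path_score([1, 2, 1], flat, flat, desired_col=1)
```

With featureless edge maps, the straight path at the desired column scores zero and the crooked path scores higher, reflecting the stated preference for straight, well-placed paths.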
In another aspect of the invention, a system for downsampling a digital image is disclosed. The system comprises a spatial frequency estimator that accepts a digital image as input and computes spatial frequency estimates for groups of adjacent pixels on the digital image. It also comprises a path generator that generates potential pixel deletion paths through the image. A path scorer calculates path scores for potential pixel deletion paths based at least in part on the spatial frequency estimates, and a pixel remover reduces the pixel dimensions of the digital image by selecting, from among the potential deletion paths, a deletion path having the most desirable deletion path metric and removing pixels lying along that path. The system may downsample without recursion, or it may delete a single path at a time and recurse. The system may downsample horizontally, vertically, or both horizontally and vertically.