This invention relates to image resizing. More particularly, this invention relates to finite impulse response (FIR) filters that provide good image resizing results with reduced processing.
An image is a stored description of a graphic picture (e.g., video, text, etc.) and is often described as a set of pixels having brightness and color values. A pixel (i.e., a picture element) is generally one spot in a grid of spots that form the image.
Image resizing changes the size of an image by altering the number of pixels in the representation. For example, to reduce the size of an image, fewer pixels are used. To enlarge the size of an image, more pixels are used. Image resizing may also include altering pixel brightness and color values. Resizing a digital representation of an image may be accomplished using a digital filter to change the sampling density of the image. Resizing and “re-sampling” can thus be considered the same process.
A signal can be more efficiently re-sampled in the digital domain rather than in the analog domain, which involves converting the digital representation of an image into the analog domain, filtering, and then converting back to the digital domain. An input sampling grid can be used to describe the positions of all samples or pixels. It is commonly assumed that {0,0} are the {x,y} coordinates of the top left pixel of an image. An increase in an x-axis value can indicate a move to the right in the image, while an increase in a y-axis value can indicate a move down the image. Other suitable interpretations are also valid. Historically, the order described follows the standard scan order of a television screen, computer monitor, or other suitable display. A particular output pixel can be generated at a certain position in the input grid. For example, position {5.37,11.04} means 37/100 of the distance from column 5 (left) to column 6 (right) in the x-axis, and 4/100 of the distance from row 11 (above) to row 12 (below) in the y-axis. The position is said to have an input pixel index part (i.e., an integer index less than or equal to 5.37 (e.g., 5)), and a fractional part (e.g., 37/100) corresponding to the fraction of the distance from the integer to the next pixel index along an axis.
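The split of a position into its index part and fractional part can be illustrated with a minimal Python sketch. The function name split_position is hypothetical and is used here only for illustration.

```python
import math

def split_position(pos):
    """Split a sampling-grid coordinate into (integer index, fractional part)."""
    index = math.floor(pos)   # largest integer less than or equal to pos
    frac = pos - index        # fraction of the distance to the next pixel, in [0, 1)
    return index, frac

# The example position {5.37, 11.04} from the text, one axis at a time.
x_index, x_frac = split_position(5.37)    # column 5, about 37/100 toward column 6
y_index, y_frac = split_position(11.04)   # row 11, about 4/100 toward row 12
```

Because the fractional part is computed in binary floating point, it is only approximately equal to 37/100 or 4/100.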
Re-sampling systems can often be simplified by performing axis-separable processing. In other words, re-sampling can occur along each axis independently of the other. If high quality re-sampling occurs along each axis, then the overall re-sampling quality after combining processing from both axes should also be of high quality. In hardware, each re-sampling implementation can be based on processing pipelines, so a separate re-sampling system can be used for each axis for better performance. In memory-to-memory systems, the image can be split into tiles. Each tile can be processed with a separate re-sampler, with source and destination image memory bandwidth limiting processing speed. In software, one subroutine may be able to serve both axes.
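As a rough software illustration of axis-separable processing, the following Python sketch applies one 1-D subroutine first along the x-axis and then along the y-axis. The names resize_separable and resample_linear_1d are hypothetical, and the linear interpolator is only a placeholder chosen to keep the sketch short; it is not the re-sampler described in this document.

```python
def resample_linear_1d(samples, out_len):
    """Placeholder 1-D re-sampler (linear interpolation), used only to show data flow."""
    n = len(samples)
    if out_len == 1 or n == 1:
        return [samples[0]] * out_len
    out = []
    for k in range(out_len):
        pos = k * (n - 1) / (out_len - 1)   # map output sample k onto the input grid
        i = min(int(pos), n - 2)            # left neighbour, kept inside the grid
        t = pos - i
        out.append((1 - t) * samples[i] + t * samples[i + 1])
    return out

def resize_separable(image, out_w, out_h, resample_1d=resample_linear_1d):
    """Resize a list-of-rows image by re-sampling each axis independently."""
    # Pass 1: x-axis, every row on its own.
    rows = [resample_1d(row, out_w) for row in image]
    # Pass 2: y-axis, every column of the intermediate result on its own.
    cols = [resample_1d(list(col), out_h) for col in zip(*rows)]
    # Transpose back to row-major order.
    return [list(row) for row in zip(*cols)]

enlarged = resize_separable([[0, 2], [4, 6]], 3, 3)
```

The same subroutine serves both axes; only the direction in which samples are gathered changes.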
Re-sampling of text and graphics images has different requirements than for video images. In the former cases, sharp edges and the absence of ringing artifacts on transitions are important.
In contrast, for some images, the preservation of spectral content in some regions is important. For other regions, behavior closer to that of graphics and text is ideal. For example, an image may consist of single sine-wave components at each position with uniform and smooth frequency changes as a function of position (such as found in typical “zone-plate” test patterns). Spectral preservation only works when re-sampling such images. In contrast, transitions between objects are not spectral in nature, and a text-like interpretation may be more appropriate.
Two basic approaches to re-sampling are possible. One approach uses different filters to calculate output pixels based on their position with respect to the input pixels. These so-called “poly-phase” re-samplers require large filter coefficient storage, as well as many multipliers and other logic devices.
Another approach is to construct piece-wise continuous models of the waveform between each integer input sample position. Any number of output samples can be generated at any fractional position between them. This approach is generally more economical and can also be used with text and graphics inputs.
For up-sampling (i.e., image enlargement) in general, better results are obtained if each continuous model generates output sample values that go through the input samples (at their integer positions). In the piece-wise model case, the output models, when placed end-to-end and viewed as a continuous waveform, go through all the input sample points.
Cubic polynomial piece-wise models can be obtained by constraining the waveform to go through two input samples as long as the components of the gradients along the axis being processed at these two samples are also known. A gradient can be defined as an estimate of the rate of change, or slope, between two sample values. The gradients can be estimated using a finite impulse response (FIR) differentiating filter, which takes a weighted sum of a set of input samples to generate an output (gradient) sample. The number of samples used is referred to as its history, or number of taps. The number of taps and the relative weight of each tap determine the filter response. The set of input samples used to create each model has a fixed position relative to that model.
Along each axis, a re-sampling system can be designed to step through the input sample grid, calculating where each output sample lies, and then calculate the integer index and fractional values. The sampling system can then obtain the set of input samples to use and apply them to FIR filters to generate the two co-sited gradients at the two neighboring positions (i.e., index and index+1). The system may then calculate the cubic model and evaluate the model at the fractional position to obtain the output pixel value.
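The stepping part of this procedure can be sketched in Python as follows. This is an illustration only: the generator name output_windows is hypothetical, the step size (input length divided by output length) is one possible convention, and repeating the edge pixel when the four-sample window runs off the grid is an assumed edge policy that the text does not specify.

```python
import math

def output_windows(samples, out_len):
    """Yield (index, frac, window) for each output sample along one axis."""
    n = len(samples)
    step = n / out_len   # assumed: input pixels advanced per output pixel
    for k in range(out_len):
        pos = k * step                # position of output sample k in the input grid
        index = math.floor(pos)       # integer index part
        frac = pos - index            # fractional part
        # Four input samples at index-1 .. index+2, clamped at the image edges.
        window = [samples[min(max(index + d, 0), n - 1)] for d in (-1, 0, 1, 2)]
        yield index, frac, window

steps = list(output_windows([10, 20, 30, 40], 8))
```

Each yielded window holds the samples that a gradient filter and cubic model would consume for that output pixel.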
As is well known in the art, better spectral re-sampling is possible when more input points are used to create each output sample. The resulting longer FIR filters, however, require more computation, which can restrict throughput (output pixels per second generated) or increase costs.
At least four samples are needed for spectral separation. The cubic model is obtained from the gradients of the sample signal (e.g., g(0) and g(1)) co-sited respectively with the middle two of the four input samples (e.g., f(−1), f(0), f(1), f(2)). The gradients are calculated from the weighted sum of neighboring input samples. Co-sited refers to two corresponding values that have the same index or are at the same position. For example, f(0) can be co-sited with g(0) (gradient g(0) is calculated using f(0) and its neighboring input samples), and similarly f(1) can be co-sited with g(1). The four known values (e.g., f(0), g(0), f(1), g(1)) are then used to obtain the cubic coefficients. The two middle input samples will surround the generated output sample, and the gradients each correspond to one of the two corresponding middle input samples. From the estimated gradients, a model is created which is then used to calculate the output values of the resized image at a fractional position.
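One way to obtain cubic coefficients from the four known values is to require that the cubic p(t) = a·t³ + b·t² + c·t + d satisfy p(0) = f(0), p′(0) = g(0), p(1) = f(1) and p′(1) = g(1), where t is the fractional position. The Python sketch below solves those four conditions. All function names are hypothetical, and the central-difference gradient is only a stand-in (with it, the model reduces to the well-known Catmull-Rom spline); it is not the differentiating filter of the invention.

```python
def central_gradient(left, right):
    """Stand-in gradient estimate (assumption): half the difference of the neighbours."""
    return (right - left) / 2.0

def cubic_coefficients(f0, g0, f1, g1):
    """Coefficients (a, b, c, d) of p(t) = a*t**3 + b*t**2 + c*t + d such that
    p(0) = f0, p'(0) = g0, p(1) = f1 and p'(1) = g1."""
    a = 2.0 * (f0 - f1) + g0 + g1
    b = 3.0 * (f1 - f0) - 2.0 * g0 - g1
    return a, b, g0, f0

def evaluate_cubic(coeffs, t):
    """Evaluate the cubic at fractional position t using Horner's rule."""
    a, b, c, d = coeffs
    return ((a * t + b) * t + c) * t + d

def resample_window(window, frac):
    """Output value between the two middle samples of (f(-1), f(0), f(1), f(2))."""
    fm1, f0, f1, f2 = window
    g0 = central_gradient(fm1, f1)   # gradient co-sited with f(0)
    g1 = central_gradient(f0, f2)    # gradient co-sited with f(1)
    return evaluate_cubic(cubic_coefficients(f0, g0, f1, g1), frac)

value = resample_window([0.0, 1.0, 2.0, 3.0], 0.37)   # a linear ramp stays linear
```

At t = 0 and t = 1 the model returns f(0) and f(1) exactly, so adjacent pieces placed end-to-end pass through every input sample.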
In view of the foregoing, it would be desirable to provide an improved cubic piece-wise continuous model approach with economical re-sampling of spectral image content.
It would also be desirable to provide a FIR filter requiring only four input samples for spectral separation having a good differentiating frequency response.
It would further be desirable to provide a FIR filter having good edge and narrow peak handling characteristics that provides good resized image qualities.
It is an object of this invention to provide an improved cubic piece-wise continuous model approach with economical re-sampling of spectral image content.
It is another object of the invention to provide a FIR filter requiring only four input samples for spectral separation having a good differentiating frequency response.
It is a further object of the invention to provide a FIR filter having good edge and narrow peak handling characteristics that provides good resized image qualities.
In accordance with this invention, an image resizer is presented that includes several processing stages. Image resizing involves a linear process (e.g., re-sampling and filtering are both linear) in which signals or samples add linearly. Stage one is gamma modification, which involves removing video gamma correction that was previously applied to an image and reapplying a new gamma. Gamma is a measure of contrast in an image. Different gamma values display different intensity pixels on a screen. In a gamma-corrected linear luminance domain (Y′), black narrow features are linear (or additive) on a white background for a gamma γ equal to about 2.5. In a linear luminance domain (Y), white narrow features are linear (or additive) on a black background for a gamma γ equal to about 1. The two domains Y and Y′ are related by Y = (Y′)^γ or Y′ = (Y)^(1/γ). Although an image may already have gamma correction (Y′) applied to it, decimating such an image may create undesirable effects on extreme narrow bright impulses. Impulses can be described as a jump or spike with respect to neighboring input samples where the width of the spike is extremely narrow. On the other hand, removing the gamma (to work in the Y domain) may create undesirable effects on extreme narrow dark impulses. As a compromise, a γ of about 1.6 is preferably applied to the signal prior to resizing (e.g., Y_1.6 = (Y′)^(γ/1.6)) because it averages the undesirable effects of the narrow white and narrow dark features of the luminance domains.
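The gamma modification of stage one, raising the gamma-corrected value Y′ to the power γ/1.6, can be sketched in Python as shown below, together with its inverse. The function names and default arguments are hypothetical, and pixel values are assumed to be normalized to the range 0 to 1.

```python
def to_working_domain(y_prime, source_gamma=2.5, working_gamma=1.6):
    """Move a gamma-corrected value Y' into the compromise domain:
    Y_1.6 = (Y') ** (source_gamma / working_gamma)."""
    return y_prime ** (source_gamma / working_gamma)

def from_working_domain(y_work, source_gamma=2.5, working_gamma=1.6):
    """Inverse mapping: remove the compromise gamma and restore the original one."""
    return y_work ** (working_gamma / source_gamma)

mid_grey = to_working_domain(0.5)         # darker than 0.5, since the exponent exceeds 1
restored = from_working_domain(mid_grey)  # recovers 0.5 up to rounding error
```

Black (0) and white (1) are unchanged by either mapping; only intermediate values move.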
Stage two involves filtering the middle two input samples of four gamma-modified input signal samples. Decimation filtering is applied to the four input samples using preferably a {¼, ½, ¼} symmetric 3-tap FIR filter. Because each output value requires three input samples and there are four input samples, only the two middle samples are filtered. Decimation by factors greater than two can be accomplished by multiple passes of decimation-by-two with a final pass of decimation by close to two. Because a final pass may use a decimation factor between one and two, a number of selectable filter banks that range from decimation-by-one to decimation-by-two are provided in memory. If the desired filter bank is not available, the final decimation pass uses the closest available decimation filter. However, this may result in noticeable filter switching during dynamic zooming. Generally, having more selectable filter banks results in less noticeable filter switching.
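A minimal Python sketch of the {¼, ½, ¼} filter applied to the two middle samples, and of one plausible way to split a large decimation factor into passes, is given below. The names filter_middle_two and plan_passes are hypothetical, and the halving rule in plan_passes is an assumption about how the passes might be scheduled rather than a procedure stated in the text.

```python
def filter_middle_two(window):
    """Apply the {1/4, 1/2, 1/4} 3-tap filter to the two middle samples
    of a four-sample window (f(-1), f(0), f(1), f(2))."""
    fm1, f0, f1, f2 = window
    filtered_0 = 0.25 * fm1 + 0.5 * f0 + 0.25 * f1   # centred on f(0)
    filtered_1 = 0.25 * f0 + 0.5 * f1 + 0.25 * f2    # centred on f(1)
    return filtered_0, filtered_1

def plan_passes(factor):
    """Split a decimation factor into decimate-by-two passes plus a final factor.
    Assumed rule: keep halving while the remaining factor exceeds two."""
    passes = 0
    while factor > 2.0:
        factor /= 2.0
        passes += 1
    return passes, factor

middle = filter_middle_two([0, 0, 4, 0])
plan = plan_passes(7)   # two passes of decimation-by-two, then a final pass by 1.75
```

The final factor returned by plan_passes is the one for which the closest available filter bank would be selected.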
Stage three involves finding the co-sited gradients. Gradients are calculated using the input samples, so stage two and stage three can be executed in parallel. The gradients are calculated using digital differentiating filters. For image reduction (i.e., down-sampling), a simple pair of differentiating filters is used for gradient estimation. For image enlargement (i.e., up-sampling), an asymmetric FIR differentiating filter is advantageously used for gradient estimation in accordance with the present invention. This filter results in better image resizing than conventional 3-tap symmetric filters often used for gradient estimation. The asymmetric FIR filter emphasizes accurate edge handling over accurate peak handling, resulting in improved zone-plate test signals and a sharper image appearance in general. Zone plates are video test signals in which the spatial frequencies are a smooth function of {x,y} position.
Stage four involves calculating the cubic polynomial coefficients using the two middle input samples from stage two and the two co-sited gradients calculated in stage three.
Stage five involves calculating the re-sampled output sample value. A piece-wise cubic model is preferably generated to obtain a piece-wise continuous model of the output signal. The model is then evaluated at the desired fractional position to obtain a re-sampled value.
Stage six involves gamma modification to reapply gamma correction. The compromise gamma correction applied in stage one is removed and the initial gamma correction is reapplied to produce the resulting resized image. The initial gamma needs to be re-applied in order to display the resulting image correctly on a non-linear device, such as a monitor. Reapplying the original gamma correction to the resized image means that the system does not change the presentation of flat regions in an image (static component), but affects the position of edges (dynamic component) so that narrow bright and dark regions are both preserved from the gamma correction applied in stage one.