1. Field of the Invention
The present invention relates to digital signal processing, and more specifically to a method and device for scaling an image from one resolution to another.
2. Description of Related Art
Image scaling resizes a source image having one resolution to produce a destination image having another resolution. In general, the source image is scaled by using a discrete geometric transform to map the pixels of the destination image to pixels of the source image. The destination image is traversed and a transformation function is used to calculate which pixels in the source image are to be used to generate each destination pixel. Because destination pixels are not typically aligned with the source pixels, an interpolation function is used to generate a value for a destination pixel by weighting the surrounding source pixels. Several common interpolation functions can be used based on the specific application. While the more sophisticated interpolation algorithms generate higher quality images, their complexity requires more processing time or hardware to generate the destination image.
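The destination-to-source mapping described above can be sketched as follows. This is a minimal illustration only: the function name `dest_to_source` and the pixel-center alignment convention are assumptions made for the example, not details taken from the text.

```python
def dest_to_source(dst_index, src_size, dst_size):
    """Map a destination pixel index to a (usually fractional) source coordinate.

    Assumes pixel centers are aligned (the +0.5 / -0.5 convention). Other
    conventions, such as aligning the corner pixels, are equally valid.
    """
    scale = src_size / dst_size
    return (dst_index + 0.5) * scale - 0.5


# Upscaling a 4-pixel row to 8 pixels: every destination pixel lands
# between source pixels, which is why an interpolation function is needed.
coords = [dest_to_source(i, 4, 8) for i in range(8)]
```

Because the destination image is traversed and mapped back into the source, every destination pixel receives exactly one value.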
Nearest neighbor interpolation is a simple algorithm in which fractional destination pixel locations are rounded so as to assign the closest source pixel to the destination image. While this algorithm is fast, the destination image quality can be poor and appear jagged. Bilinear interpolation produces higher quality images by weighting the values of the four pixels nearest a fractional destination pixel location. Each weight decreases linearly, along each axis, with the distance of the corresponding source pixel from the fractional destination pixel location. Bilinear interpolation produces a smoother destination image, but requires more processing time because three linear interpolations (two horizontal and one vertical) must be computed for each of the destination pixels.
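A minimal sketch of the two algorithms, assuming the image is a list of rows of numeric pixel values and that coordinates outside the image are clamped to the border (both are assumptions made for illustration, as are the function names):

```python
import math


def nearest_neighbor(img, x, y):
    """Return the source pixel closest to the fractional location (x, y)."""
    h, w = len(img), len(img[0])
    xi = min(max(int(math.floor(x + 0.5)), 0), w - 1)
    yi = min(max(int(math.floor(y + 0.5)), 0), h - 1)
    return img[yi][xi]


def bilinear(img, x, y):
    """Weight the four source pixels surrounding the location (x, y)."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Two horizontal interpolations followed by one vertical: three in total.
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


img = [[0, 100],
       [100, 200]]
```

For the sample image, `bilinear(img, 0.5, 0.5)` blends all four pixels equally, whereas `nearest_neighbor` returns exactly one of them.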
While the nearest neighbor algorithm uses one source pixel and the bilinear algorithm uses four source pixels to generate each destination pixel, higher order interpolation functions produce high quality images by using greater numbers of source pixels and more complex interpolation functions. The interpolation function is centered at a specific point of the source image and used to weight the nearby pixels. For example, the cubic convolution algorithm uses the sixteen nearest source pixels and the following one-dimensional cubic function, which is shown in FIG. 1(a), to calculate the value of each destination pixel.
$$
f(x) = \begin{cases}
(a+2)|x|^3 - (a+3)|x|^2 + 1, & 0 \le |x| < 1 \\
a|x|^3 - 5a|x|^2 + 8a|x| - 4a, & 1 \le |x| < 2 \\
0, & 2 \le |x|
\end{cases}
$$
where a is typically between −0.5 and −2.0. The destination pixel values must be clipped whenever the result is less than zero or greater than the maximum pixel value.
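The cubic convolution function and the clipping step can be sketched in one dimension as follows. The function names, the default a = −0.5, the 8-bit maximum of 255, and the replication of edge pixels are assumptions chosen for the example.

```python
import math


def cubic_kernel(x, a=-0.5):
    """One-dimensional cubic convolution function f(x) with parameter a."""
    x = abs(x)
    if x < 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def cubic_interp_1d(row, x, a=-0.5, max_value=255):
    """Interpolate a row at fractional position x using the four nearest pixels."""
    base = math.floor(x)
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), len(row) - 1)  # replicate edge pixels (assumption)
        total += row[idx] * cubic_kernel(x - k, a)
    # Negative side lobes can push the sum outside the valid range, so clip.
    return min(max(total, 0.0), max_value)


edge = [0, 0, 0, 255, 255, 255]
```

Near the step in `edge`, the unclipped sum undershoots zero at x = 1.5 and overshoots 255 at x = 3.5, which is the situation the clipping requirement addresses. A two-dimensional version would apply the same four weights along each axis, giving the sixteen-pixel neighborhood.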
The cubic convolution function produces a sharpened image due to the presence of negative side lobe values. On the other hand, the B-spline algorithm produces a smoothed image using the sixteen nearest source pixels and the following one-dimensional B-spline function, which is shown in FIG. 1(b).
$$
f(x) = \begin{cases}
(1/2)|x|^3 - |x|^2 + (2/3), & 0 \le |x| < 1 \\
-(1/6)|x|^3 + |x|^2 - 2|x| + (4/3), & 1 \le |x| < 2 \\
0, & 2 \le |x|
\end{cases}
$$
Clipping is not required when using the B-spline function because it is never negative and the sum of the sample points is always 1. A more detailed explanation of conventional scaling using linear transformation algorithms can be found in R. Crane, “A Simplified Approach to Image Processing,” Prentice Hall, New Jersey (1997), which is herein incorporated by reference.
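The two properties that make clipping unnecessary can be checked numerically. The sketch below uses the standard cubic B-spline form, whose constant term in the first segment is +2/3. The function names and the tap offsets are assumptions made for illustration.

```python
def bspline_kernel(x):
    """One-dimensional cubic B-spline function f(x)."""
    x = abs(x)
    if x < 1:
        return 0.5 * x**3 - x**2 + 2.0 / 3.0
    if x < 2:
        return -(1.0 / 6.0) * x**3 + x**2 - 2.0 * x + 4.0 / 3.0
    return 0.0


def bspline_weights(frac):
    """Weights of the four taps at offsets -1, 0, 1, 2 from the left neighbor.

    `frac` is the fractional distance (0 <= frac < 1) past the left neighbor.
    """
    return [bspline_kernel(frac - k) for k in (-1, 0, 1, 2)]


weights = bspline_weights(0.25)
```

Since every weight is non-negative and the four weights sum to one, the interpolated value is a weighted average of the source pixels and cannot leave their range.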
As explained above, conventional image scaling algorithms are based on the application of a linear kernel function that weights the contribution of source pixels to each destination pixel. The weights are chosen based on the location of the theoretical destination sampling point relative to the actual source pixels so as to combine the source pixels in a manner that best represents the source content at the resolution of the destination image. In the classic signal processing sense, the continuous analog input is sampled by the conversion to a digital image and an interpolation filter function is used to re-sample the signal. Mathematically, the operation is a two-dimensional linear convolution. More specifically, a two-dimensional scaling filter calculates a dot product of the source pixel values with a weighting vector that is computed using a predetermined filtering function.
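The dot-product formulation can be sketched as follows. For brevity the example assumes a separable filtering function, so that the two-dimensional weighting vector is the product of two one-dimensional weight sets, and uses a linear (tent) function over a 2×2 neighborhood as a stand-in. The names `tent` and `scale_pixel` are hypothetical.

```python
def tent(x):
    """Linear kernel, used here only as a stand-in filtering function."""
    x = abs(x)
    return 1.0 - x if x < 1 else 0.0


def scale_pixel(neighborhood, fx, fy, kernel=tent, offsets=(0, 1)):
    """Dot product of a flattened neighborhood with a separable weight vector.

    `fx` and `fy` are the fractional offsets of the sampling point from the
    top-left pixel of the neighborhood.
    """
    wx = [kernel(fx - k) for k in offsets]
    wy = [kernel(fy - k) for k in offsets]
    # Weighting vector: one weight per source pixel, in row-major order.
    weights = [wy_j * wx_i for wy_j in wy for wx_i in wx]
    pixels = [p for row in neighborhood for p in row]
    return sum(w * p for w, p in zip(weights, pixels))


patch = [[0, 100],
         [100, 200]]
```

Substituting a four-tap kernel and offsets (−1, 0, 1, 2) would turn the same routine into a sixteen-pixel filter; only the weighting vector changes.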
Currently, the scaler engines used for image scaling in video graphics applications employ conventional linear transform algorithms (such as those described above) and are primarily differentiated by the size of the convolution kernel. The interpolation algorithm to be used in a specific engine is determined based on the competing considerations of output image quality and hardware costs. The hardware that is needed to practically implement an interpolation algorithm depends on factors such as the filter weight resolution and the number of filter taps, which are dependent on the convolution kernel used for the interpolation function.
For example, the simple filtering kernel used to implement the nearest neighbor algorithm is restricted to have only a single nonzero weight. Because no multiplication or addition is required, a simple structure can be used to perform convolution with this filter function. However, to achieve better image quality, non-binary weights must be used. This necessitates the use of multipliers to perform the convolution. Furthermore, video graphics scaler engines typically operate on raster scanned information in which horizontal lines of pixels are serially processed. If the interpolation algorithm requires information from a pixel in a line other than the current line, the video information must be delayed by a line buffer memory (e.g., RAM). Image quality generally improves with more filter taps.
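The role of the line buffer can be illustrated with a software analogy. The sketch below is a hypothetical two-tap vertical filter with fixed weights of 0.5, not a description of any particular scaler engine. It processes raster lines serially and keeps one previous line in memory.

```python
def vertical_average_stream(lines):
    """Two-tap vertical filter over serially delivered lines of pixels."""
    line_buffer = None  # holds the previous line (the line buffer memory)
    for line in lines:
        if line_buffer is not None:
            # Non-binary weights (0.5, 0.5) cost one multiply per tap.
            yield [0.5 * a + 0.5 * b for a, b in zip(line_buffer, line)]
        line_buffer = list(line)


filtered = list(vertical_average_stream([[0, 0], [10, 20], [30, 40]]))
```

Each additional vertical tap would require one more buffered line, which is why the number of taps bears directly on hardware cost.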
While hardware costs can limit the choice to certain interpolation algorithms, the specific algorithm that is used by a scaler engine is preferably chosen based on the content presented by the application. For example, one algorithm may be optimal for one type of content such as live video, while another algorithm of similar complexity is optimal for another type of content such as computer graphics. Although the interpolation algorithm can be chosen based on the image content, conventional scaler engines use a single convolution kernel for scaling the entire image. Therefore, if different types of content are present in the image, the overall quality of the scaled image is suboptimal.