Capturing and displaying still and video images is becoming ubiquitous. Consider, for example, the recent explosion in the popularity of cameras in cell phones and of image/video displays in portable music players.
The resolutions of image/video content available in the marketplace today vary quite dramatically. For example, digital still cameras are capable of producing images whose resolutions currently vary from as small as 320×240 pixels to as large as 3072×2048 pixels.
Meanwhile, typical display units such as a TV monitor or an LCD display in a hand-held device are only capable of displaying images with much smaller resolutions. For example, TV monitors generally support images at NTSC or PAL resolution (720×480 or 720×576, respectively), while most low-cost LCD display units support images up to 320×240 pixels. The resolutions of input images are thus often significantly greater than the resolutions supported by the display units.
There are two known ways to handle this mismatch. In the first, the display rectangle is overlaid on top of the larger image, and the user interactively moves the cursor to view different regions of the larger image (i.e. panning). In the second, the entire input image is scaled to fit the dimensions of the display window (i.e. scaling). Both schemes have advantages and disadvantages, and it is therefore general practice to support both. For example, in the first scheme, only a portion of the picture is visible to the viewer at any time, while in the second scheme, the image has to be modified to fit the viewing area.
For the second scheme, a two-dimensional sample-rate converter (i.e. scaler) is typically used. Accordingly, as shown in FIG. 1, an image converter 100 receives an input image and up-scales or down-scales the image to produce an output display image in accordance with the resolution of the input image and the resolution of the display. Since the input resolutions of the images can change from image to image, the up-sampling/down-sampling ratio (the ratio of the output dimension to the input dimension) can also change from image to image, even if the image display resolution remains the same.
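The way the ratio varies from image to image can be illustrated with a minimal sketch. The function name `scale_ratios` and the 1024×768 input size are hypothetical choices made for illustration; only the 3072×2048 and 320×240 figures come from the text above.

```python
from fractions import Fraction

def scale_ratios(input_size, display_size):
    """Return (horizontal, vertical) ratios of output dimension to input dimension."""
    in_w, in_h = input_size
    out_w, out_h = display_size
    return Fraction(out_w, in_w), Fraction(out_h, in_h)

display = (320, 240)  # fixed display resolution
for size in [(1024, 768), (3072, 2048)]:
    # The display stays the same, but the ratio changes with each input image.
    print(size, scale_ratios(size, display))
```

Under these assumptions, a 1024×768 input yields a ratio of 5/16 in both directions, while a 3072×2048 input yields 5/48 horizontally and 15/128 vertically for the same display.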
In addition, the spatial relationship between input and output samples also changes on a sample-by-sample (i.e. pixel-by-pixel) basis. For example, assume that there is a need to down-sample an image by a factor of 5/16. In other words, for each row and column of sixteen input samples, five output samples will need to be created. In such an example, as shown in FIG. 2, pixels a and b are input samples in an input image row 202 having sixteen input samples, and pixels A and B are output samples in an output image row 204 that need to be calculated. Depending on the spatial location of the output pixels with respect to the input pixels, the formula to calculate the output pixel value can change.
More particularly, poly-phase filters are typically used to compute the output pixel values. In such a technique, the filter that is used for each output sample is selected dynamically from a predetermined set of filters depending on the sampling ratio that needs to be achieved and the spatial relationship or phase between the output sample and the input sample(s). Accordingly, with reference to FIG. 2, the filter that is applied to the input samples to compute output pixel A will likely be different from the filter that is applied to the input samples to compute output pixel B, given the different spatial relationships or phases of output pixels A and B relative to input pixels a and b, respectively (i.e. phase_a ≠ phase_b). This approach, when applied on a sample-by-sample basis as is conventionally done, consumes considerable computational resources, and thus makes implementing such techniques in any time-critical application quite challenging.
An example of a conventional horizontal scaling process using poly-phase filters will now be described in more detail in connection with FIG. 3, and the example down-sampling ratio illustrated in FIG. 2.
As shown in FIG. 3, processing begins in step S302 by determining the index of the first/next output sample to compute. Referring to FIG. 2, the first output sample to be computed corresponds to sample A, which has an index of 1. Next, in step S304, the index of the corresponding input sample is determined based on the scaling factor. With reference to the example of FIG. 2, this is obtained by trunc (output index*Δout/Δin)=trunc (1*16/5)=trunc (3.2)=3.
Processing advances to step S306, where the filter to apply to the input sample(s) including and adjacent to input index 3 (i.e. pixel a) is selected. This is typically determined based on the spatial relationship between the given input sample and output sample. For example, a set of filters is predetermined and indexed in order of increasing spatial “error” or phase between the output sample and the input sample. Accordingly, referring to FIG. 2, the index of the filter to use for determining output sample A will be lower than the index of the filter to use for determining output sample B, given that output sample A is spatially closer to corresponding input sample a than output sample B is to corresponding input sample b (i.e. phase_a < phase_b). More specifically, for output sample A, the phase or error with respect to the corresponding input sample a is determined as the fractional part rem (output index*Δout/Δin)=rem (1*16/5)=rem (3.2)=0.2, and the index of the filter to select is determined by round (error*# phases). In an example where a set of 8 poly-phase filters is used, the index of the filter to select is round (0.2*8)=round (1.6)=2. Alternatively, the filter index can be determined as the integer remainder of round (# phases*(output index*Δout/Δin)) divided by # phases, i.e. the remainder of round (8*(1*16/5))=round (25.6)=26 divided by 8, which is 2.
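The index and phase arithmetic of steps S304 and S306 can be reproduced with a short sketch. The names `locate` and `round_half_up` are hypothetical, exact fractions are used to avoid floating-point noise, and rounding ties upward is an assumption, since the text does not state a tie rule.

```python
from fractions import Fraction
import math

def round_half_up(x):
    """Round to the nearest integer, ties upward (assumed tie rule)."""
    return math.floor(x + Fraction(1, 2))

def locate(out_index, n_in, n_out, n_phases):
    """Return (input index, phase error, filter index, alternative filter index)."""
    pos = Fraction(out_index * n_in, n_out)   # output index * (delta_out / delta_in)
    in_index = math.floor(pos)                # trunc(...) for non-negative positions
    error = pos - in_index                    # fractional part, i.e. the phase
    filt = round_half_up(error * n_phases)    # round(error * #phases)
    # Alternative form: integer remainder of the rounded, phase-scaled position.
    filt_alt = round_half_up(pos * n_phases) % n_phases
    return in_index, error, filt, filt_alt

print(locate(1, 16, 5, 8))  # (3, Fraction(1, 5), 2, 2)
```

In this sketch the two forms agree for the 16-to-5 example. They can differ when the phase rounds up to the number of phases: the first form then returns an index equal to the number of phases, whereas the remainder form wraps to 0. How a given implementation handles that case is not specified in the text.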
In step S308, the selected filter is applied to the input sample(s) to calculate the output sample. Depending on the number of taps in the filter, the corresponding number of input samples adjacent to and including the determined input sample will be retrieved and used to calculate the output sample.
In step S310, it is determined whether the end of the row of output samples has been reached. If not, processing returns to step S302. Otherwise processing advances to step S312, where it is determined if the last row of output samples has been calculated. If so, horizontal scaling processing ends. Otherwise processing advances to step S314, where the output index is reset to the beginning of the next row before returning to step S304.
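The per-sample loop of steps S302 through S314 might be sketched as follows. Several details are assumptions made only for illustration: the filter bank is a hypothetical set of 2-tap linear-interpolation filters, output indices are zero-based, samples past the end of a row are clamped, and the input index and filter index are both derived from the rounded, phase-scaled position so that a phase rounding up to the number of phases carries into the next input sample.

```python
from fractions import Fraction
import math

def make_linear_bank(n_phases):
    """Hypothetical 2-tap bank: filter p interpolates at offset p / n_phases."""
    return [(1 - Fraction(p, n_phases), Fraction(p, n_phases))
            for p in range(n_phases)]

def scale_row(row, n_out, bank):
    """Horizontally scale one row, selecting a filter for every output sample."""
    n_in, n_phases = len(row), len(bank)
    out = []
    for k in range(n_out):                           # S302: next output index
        q = math.floor(Fraction(k * n_in * n_phases, n_out) + Fraction(1, 2))
        in_index, filt = divmod(q, n_phases)         # S304 and S306
        acc = 0
        for t, weight in enumerate(bank[filt]):      # S308: apply selected taps
            src = min(in_index + t, n_in - 1)        # clamp at the row end
            acc += weight * row[src]
        out.append(acc)
    return out                                       # S310: end of row reached

def scale_horizontal(image, n_out, n_phases=8):
    """S312/S314: repeat the row procedure for every row of the image."""
    bank = make_linear_bank(n_phases)
    return [scale_row(row, n_out, bank) for row in image]

ramp = list(range(16))
print([float(v) for v in scale_row(ramp, 5, make_linear_bank(8))])
```

Even in this small form, the index computation and filter selection are repeated for every output sample of every row, which is the per-sample cost the text describes.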
It should be noted that vertical scaling can be performed in similar fashion as described in connection with FIGS. 2 and 3 above. It should be further noted that, as with horizontal scaling, the conventional sample-by-sample approach consumes considerable computational resources, and thus makes implementing such techniques in any time-critical application quite challenging.
In addition to the aforementioned problems with conventional scaling techniques, the pixel data supplied to the scaler unit typically arrives as two-dimensional blocks of varying sizes. For example, the scaler unit may be part of a processing pipeline including a JPEG or MPEG decoder. The output of a JPEG decompression module may include block sizes of 8×8, 16×16, 16×8 and 8×16 (an MPEG decoder block output is typically 16×16). As a result, most current systems separate the JPEG/MPEG decoding process from the scaling process.
More particularly, as shown in FIG. 4, since the amount of SRAM is limited in most embedded processors, the image will first be decoded by a JPEG decoder and stored in external memory, and then will be read again from memory for the scaling process. Further, since scaling is a two-dimensional process and memory is constrained, the images will typically first be horizontally scaled on a pixel-line basis and stored back to external memory, and then the horizontally-scaled pixel lines will be read from memory and vertically scaled to obtain the desired resolution. This involves multiple reads and writes to the external memory.
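A rough count of the external-memory transfers in such a pipeline can be sketched as below. The function name `conventional_traffic` is hypothetical, the counts are in pixels rather than bytes, and the final write of the fully scaled image is left out because the text does not state where that output goes.

```python
def conventional_traffic(in_w, in_h, out_w):
    """Count pixels moved to/from external memory in the decode-then-scale flow."""
    decoded = in_w * in_h      # full-resolution decoded image
    h_scaled = out_w * in_h    # image after horizontal scaling only
    return {
        "decoder_write": decoded,      # decoder stores the image
        "horizontal_read": decoded,    # scaler reads it back
        "horizontal_write": h_scaled,  # horizontally scaled lines are stored
        "vertical_read": h_scaled,     # and read again for vertical scaling
    }

traffic = conventional_traffic(3072, 2048, 320)
print(traffic, sum(traffic.values()))
```

For a 3072×2048 image scaled to a width of 320, this estimate comes to roughly 13.9 million pixel transfers before the final output is produced, more than twice the number of pixels in the decoded image.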
Accordingly, it would be desirable if a sample-rate conversion technique could efficiently convert images without the computation, memory and bandwidth requirements of conventional techniques that are applied in standard pipelines such as a JPEG or MPEG decoding pipeline.