1. Field of the Invention
The present invention relates generally to digital video resizing or video image scaling technology and, more particularly, to techniques for reducing the cost and complexity of video scaling with minimal loss of perceived image quality.
2. Description of the Related Art
Video image scaling converts a digital or digitized image from one spatial resolution to another. For example, a digital image with a spatial resolution of 720 horizontal by 480 vertical pixels may have to be converted to another resolution in order to be displayed on a particular display device. Converting this image for display on, for example, an LCD panel with a fixed resolution of 640×480 requires horizontal scaling by the ratio 640/720, which is equivalent to 8/9.
This is an example of downscaling because the ratio is a fraction that is less than 1. Downscaling creates fewer output samples than originally present in a given input. In contrast, scaling the same image to a panel of 800×600 would require horizontal scaling by 10/9 (800/720) and vertical scaling by the ratio 5/4 (600/480). These cases are examples of upscaling because the ratio is a fraction greater than 1.
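The ratios above can be checked with a short sketch. The helper name `scaling_ratio` is an illustrative assumption and does not appear in the source; it only reduces output size over input size to lowest terms.

```python
from fractions import Fraction


def scaling_ratio(out_size: int, in_size: int) -> Fraction:
    """Return the scaling ratio L/M (output size / input size) in lowest terms."""
    return Fraction(out_size, in_size)


h_down = scaling_ratio(640, 720)  # horizontal, 720 -> 640 pixels
h_up = scaling_ratio(800, 720)    # horizontal, 720 -> 800 pixels
v_up = scaling_ratio(600, 480)    # vertical, 480 -> 600 lines

# A ratio below 1 is downscaling; a ratio above 1 is upscaling.
is_downscale = h_down < 1
is_upscale = h_up > 1 and v_up > 1
```

The numerator and denominator of each reduced fraction play the roles of L and M in the multirate discussion that follows.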
Video scaling is a type of digital sample rate conversion. A known technique of accomplishing video scaling is the multirate FIR (Finite Impulse Response) digital filter that achieves high quality sample rate conversion. However, this type of processing is computationally costly because it requires several multiplications and additions per output sample. When a real time processing requirement is added, the scaling function can consume a large amount of hardware resources and make it difficult to achieve high quality sample rate conversion at low cost.
FIR filters are one of two main classes of digital filters, the other being the well known IIR (Infinite Impulse Response) digital filter. Images may be thought of as signals, and like other complex signals, images typically are made up of many frequencies. High frequencies correspond to fine detail or sharpness and low frequencies correspond to smoothly or slowly changing image features.
FIG. 1 is a diagram illustrating a FIR filter 10 of the prior art. FIR filters 10 have a useful property, known as linear phase, which means that the delay through the filter is the same for all frequencies. Unequal delay results in distortion in the image, which is why FIR filters 10 are widely used in image processing applications. Linear phase results from symmetry of the filter's coefficients.
The FIR filter 10 includes shift register 12 with a series of data registers 14, each of which is connected to a clock 16. Each data register 14 is connected by one of a series of filter taps 20 to one of a series of multipliers 18. The multipliers 18 are connected to an adder 22. Data is input into the FIR filter 10 through the shift register 12. The output of each data register 14 is coupled by one of the series of filter taps 20 to one of a set of multipliers 18 to be multiplied by a unique coefficient C0-C7. The results from each multiplier 18 are then summed by the adder 22 to produce a filtered output sample.
The number of adjacent data samples input into the FIR filter 10 is equal to the number of filter taps 20 used and is application dependent. In general, higher performance requires a larger number of adjacent samples and therefore a larger number of filter taps 20. The multipliers 18 have coefficient symmetry because the coefficients on the left half mirror those on the right half, i.e. C3=C4; C2=C5; C1=C6; C0=C7.
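The structure described above (shift register, one multiplier per tap, one adder) can be sketched in software as follows. This is a minimal model under stated assumptions, not the circuit of FIG. 1: the function name `fir_filter` is hypothetical, the registers are assumed to start at zero, and one output is produced per input sample.

```python
from collections import deque


def fir_filter(samples, coeffs):
    """Direct-form FIR: shift in one sample per clock, multiply each tap, sum."""
    taps = len(coeffs)
    # Shift register, assumed to be cleared to zero before the first sample.
    shift_register = deque([0.0] * taps, maxlen=taps)
    outputs = []
    for x in samples:
        # Newest sample enters at tap 0; the oldest sample falls off the far end.
        shift_register.appendleft(x)
        # One multiplier per tap, then a single adder over all products.
        outputs.append(sum(c * d for c, d in zip(coeffs, shift_register)))
    return outputs


# Eight symmetric coefficients, mirroring C0=C7, C1=C6, C2=C5, C3=C4.
symmetric = [1, 2, 3, 4, 4, 3, 2, 1]
impulse_response = fir_filter([1, 0, 0, 0, 0, 0, 0, 0], symmetric)
```

Feeding a single unit sample through the filter reads the coefficients back out in order, which is a convenient way to confirm the tap wiring.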
As shown, FIR filter 10 has an even number of coefficients, but FIR filters may have either even or odd numbers of coefficients. FIR filter 10 may be used to implement many types of frequency responses, such as low-pass, high-pass, bandpass, etc. The type of response is determined by the number of coefficients and by the method used to calculate the coefficients. The design task for developing a low-pass FIR filter 10 is to determine the number of filter taps 20, which is performance and application dependent, determine its cutoff frequency, and then to calculate the filter's coefficients.
There are many ways to compute the filter's coefficients. One method is known as the Windowing method. For the application of processing 8-bit component digital video in a high quality consumer product, computing the coefficients with a Hamming window is an acceptable method. Given the number of taps and the filter's cutoff frequency, computing the coefficients for an even number of coefficients using the window method and a Hamming window can be done with the following Equation 1:

$$\sum_{i=1}^{m} c(i) = \frac{2\,\mathit{fc}}{2\,\mathit{fc}\,\pi\,(i - 1/2)} \cdot \sin\!\left[2\,\mathit{fc}\,\pi\,(i - 1/2)\right] \cdot \left\{0.54 + 0.46\,\cos\!\left[2\pi\,(i - 1/2)/\mathit{taps}\right]\right\}$$

where:
m = taps/2 (m unique coefficients result from the filter's symmetry)
i = iteration variable
fc = normalized cutoff frequency, (cutoff frequency)/(sampling frequency), ranging from 0 to 0.5 Hz
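As an illustration, Equation 1 can be evaluated numerically. The sketch below reads Equation 1 as giving one value c(i) for each i from 1 to m, and it assumes that c(1) is the coefficient nearest the center of the filter so that the full set is obtained by mirroring. Both readings, and the function names, are assumptions rather than statements from the source.

```python
import math


def hamming_half_coefficients(taps, fc):
    """Evaluate Equation 1 for i = 1..m, with m = taps / 2 (even tap count)."""
    if taps % 2 != 0:
        raise ValueError("Equation 1 is stated for an even number of coefficients")
    if not 0 < fc <= 0.5:
        raise ValueError("fc is a normalized cutoff in the range (0, 0.5]")
    m = taps // 2
    half = []
    for i in range(1, m + 1):
        x = i - 0.5
        # Low-pass term of Equation 1, kept in the same form as the text.
        ideal = (2 * fc) / (2 * fc * math.pi * x) * math.sin(2 * fc * math.pi * x)
        # Hamming window term of Equation 1.
        window = 0.54 + 0.46 * math.cos(2 * math.pi * x / taps)
        half.append(ideal * window)
    return half


def mirror(half):
    """Assumed layout: c(m)..c(1) followed by c(1)..c(m)."""
    return half[::-1] + half


half = hamming_half_coefficients(taps=8, fc=0.125)
full = mirror(half)
```

Because the two halves are mirror images, only the m values in `half` need to be computed and stored.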
Scaling up by an integer (L) can be done directly with the FIR filter 10. Scaling down by 1/M (M is an integer) can also be done directly with the FIR filter 10. Video scaling typically requires scaling by a ratio of integers L/M. Scaling by a ratio is known as multirate filtering. Conceptually, it can be viewed as first upscaling by L then downscaling by M as shown in a method 24 in FIG. 2. First, a video stream 26 is input into the FIR filter 10. The FIR filter 10 then upscales by integer L as indicated at 28 to produce a data output by FIR filter 10 at a rate=fin*L as indicated at 30.
Next, the FIR filter 10 downscales by 1/M in act 32. This causes the video to be output at a rate=fin*L/M as shown at 34. The FIR filter 10 accomplishes downscaling by limiting the frequency content of the input stream to less than the cutoff frequency using the low pass FIR filter 10, then simply taking every Mth sample and discarding the rest. After determining the number of taps, the downscaling filter's nominal normalized cutoff frequency is fc=1/(2M).
Upscaling is more complicated. First the data stream is padded out with L−1 zero values between each input sample as shown in the example below. For L=3, if a, b, c, d, e represent a series of input data samples, the zero inserted stream becomes: a, 0, 0, b, 0, 0, c, 0, 0, d, 0, 0, e, 0, 0 . . . . This stream becomes the input to the FIR filter which is operating at a clock rate of L*fin. The padding out of zeros introduces a new frequency into the data stream, i.e. a normalized introduced frequency of 1/(2L).
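The zero padding step can be sketched as below. The function name `zero_stuff` is an illustrative assumption; the behavior follows the L=3 example in the text.

```python
def zero_stuff(samples, L):
    """Insert L - 1 zeros after every input sample."""
    padded = []
    for s in samples:
        padded.append(s)
        padded.extend([0] * (L - 1))
    return padded


padded = zero_stuff(["a", "b", "c", "d", "e"], 3)
```

The padded stream has L times as many entries as the input, which is why the filter must be clocked at L*fin.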
So the frequency content of the new zero padded stream consists of the original data stream plus the new frequency 1/(2L), which will always be the highest frequency in the zero padded stream. The job of the FIR filter 10 is to remove the 1/(2L) frequency and distribute the energy of the non-zero samples over all the output samples. The cutoff frequency then becomes fc=1/(2L). In addition, the energy level of the input stream must be raised by L times (because of the averaging with zero that occurs in the filter). The result is that each coefficient in Equation 1 must be multiplied by L so the coefficient calculation gives us Equation 2:

$$\sum_{i=1}^{m} c(i) = \frac{2\,L\,\mathit{fc}}{2\,\mathit{fc}\,\pi\,(i - 1/2)} \cdot \sin\!\left[2\,\mathit{fc}\,\pi\,(i - 1/2)\right] \cdot \left\{0.54 + 0.46\,\cos\!\left[2\pi\,(i - 1/2)/\mathit{taps}\right]\right\}$$
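The effect of the factor L in Equation 2 can be checked numerically. The sketch below reads Equation 2 as giving one value c(i) per i and mirrors the half set into a full set. That layout, the function name, and the choice of L=3 with 18 taps are assumptions made for illustration. Since only every L-th sample of a zero padded stream is non-zero, each output sample is formed from every L-th coefficient, so the gain seen by a constant input is the sum of one such subset.

```python
import math


def upscale_coefficients(taps, fc, L):
    """Evaluate Equation 2 for i = 1..taps/2 and mirror into `taps` values."""
    m = taps // 2
    half = []
    for i in range(1, m + 1):
        x = i - 0.5
        c = (2 * L * fc) / (2 * fc * math.pi * x) * math.sin(2 * fc * math.pi * x)
        c *= 0.54 + 0.46 * math.cos(2 * math.pi * x / taps)
        half.append(c)
    # Assumed layout: c(m)..c(1) followed by c(1)..c(m).
    return half[::-1] + half


L = 3
coeffs = upscale_coefficients(taps=18, fc=1 / (2 * L), L=L)

# Gain applied to a constant input for each of the L alignments of the
# non-zero samples against the coefficients.
phase_gains = [sum(coeffs[p::L]) for p in range(L)]
```

With these parameters each entry of `phase_gains` comes out close to 1, so a flat input stays close to its original level after zero padding and filtering. Without the factor L each gain would be close to 1/L.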
As a practical matter, padding out a video data stream with zeros is difficult because of the large number of pixels produced at the output of the upscaling FIR filter 10. This is especially true for video processing. For example, if the input data rate to a filter is 13.5 million samples/sec, and the scaling ratio is 8/9, then the output of the upscaling FIR filter 10 is 13.5M*8=108 million samples/sec, so a real-life implementation becomes costly. Most of those samples would be discarded in the downsampling stage where 1/M=1/9 to reduce the data rate to 12 million samples/sec. Fortunately, there are techniques for converting directly from the 13.5 Ms/sec to 12 Ms/sec without the intermediate stage.
Multirate filtering is also referred to as polyphase filtering. In standard FIR filters, coefficients are fixed, but in polyphase filters, the coefficients change every time a new data set is input. For example, see Multirate Digital Signal Processing by Ronald Crochiere and Lawrence Rabiner, Section 3.3.4: “FIR Structures with Time Varying Coefficients for Interpolation/Decimation by a Factor of L/M.”
If the number of filter taps is chosen so that number of taps=L*mults where L is the numerator of the scaling ratio L/M, and mults is a number of multiplies, then the multirate problem becomes much simpler. For example, suppose the number of multipliers chosen by the filter designer is 6 and the scaling ratio L/M=3/4. Then the number of taps in the FIR filter 10 would be taps=L*mults=3*6=18.
Suppose a, b, c, d, e represents a data stream which is padded out with L−1, or 2, zeros between each input sample: a, 0, 0, b, 0, 0, c, 0, 0, d, 0, 0, e, 0, 0 . . . . Also, suppose the 18 coefficients are numbered c0 through c17 and the zero padded data is shifted through the filter. Each line represents the data shifting through the filter from right to left on each clock cycle as shown in FIG. 3.
Considering that the zero values will produce a zero output at the multiplier, it is clear that only the coefficients that line up with real data values a, b, c, d, e . . . actually need to be computed. In addition, FIG. 3 shows upscaling by 3; to downscale by 4, only every 4th output is taken. FIG. 4 shows the organization of the coefficients into 3 repeating sets of 6 coefficients per set. To further simplify the scaler, it is only necessary to compute the samples marked OUTPUT in FIG. 4.
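The saving described above can be illustrated by comparing the conceptual method (zero pad, filter, keep every Mth output) with a version that skips the zero products and the discarded outputs. This is a sketch under stated assumptions: the function names are hypothetical, the coefficient values 1 through 18 are placeholders standing in for c0 through c17, and the sets are taken as every L-th coefficient, which may not match the exact layout of FIG. 4.

```python
def naive_rescale(x, coeffs, L, M):
    """Zero pad by L, run the full FIR at the high rate, keep every Mth output."""
    padded = []
    for s in x:
        padded.append(s)
        padded.extend([0] * (L - 1))
    filtered = []
    for k in range(len(padded)):
        acc = 0
        for j, c in enumerate(coeffs):
            if k - j >= 0:
                acc += c * padded[k - j]
        filtered.append(acc)
    return filtered[::M]


def coefficient_sets(coeffs, L):
    """Split the taps into L sets; set p holds coefficients p, p+L, p+2L, ..."""
    return [coeffs[p::L] for p in range(L)]


def polyphase_rescale(x, coeffs, L, M):
    """Compute only the kept outputs, using only the non-zero products."""
    sets = coefficient_sets(coeffs, L)
    out = []
    for k in range(0, len(x) * L, M):      # high-rate index of each kept output
        phase, newest = k % L, k // L      # which set, and newest real sample
        acc = 0
        for q, c in enumerate(sets[phase]):
            if newest - q >= 0:
                acc += c * x[newest - q]
        out.append(acc)
    return out


placeholder_coeffs = list(range(1, 19))    # stand-ins for c0..c17
data = [5, 1, 4, 2, 8, 7, 3, 6]
slow = naive_rescale(data, placeholder_coeffs, L=3, M=4)
fast = polyphase_rescale(data, placeholder_coeffs, L=3, M=4)
```

Both functions produce the same output sequence, but the second performs 6 multiplies per output instead of 18 and never runs at the L*fin rate.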
The prior art process for scaling is as follows. First, the scaling ratio L/M is determined. Then the number of multiplies is decided; this number is application dependent. In general, more multiplies improve quality but add cost. Excellent quality has been achieved with six multiplies in consumer video applications. Next, the number of filter taps 20 (taps=L*mults) and the filter's nominal cutoff frequency fc are computed. If L/M is less than 1, then fc=1/(2M). If L/M is greater than 1, then fc=1/(2L). The FIR filter 10 coefficients are computed using Equation 2 and are then organized into L sets of mults coefficients per set. Finally, the output pixels are computed.
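The parameter selection steps of this process can be sketched as follows. The function name `design_parameters` is hypothetical. The text does not state a cutoff for L equal to M, so that case is rejected here as an assumption.

```python
def design_parameters(L, M, mults):
    """Return (taps, fc) for scaling ratio L/M with `mults` multiplies per output."""
    if L == M:
        raise ValueError("L/M = 1 is not covered by the described process")
    taps = L * mults
    # Downscaling (L/M < 1) uses fc = 1/(2M); upscaling (L/M > 1) uses fc = 1/(2L).
    fc = 1 / (2 * M) if L < M else 1 / (2 * L)
    return taps, fc


down = design_parameters(L=3, M=4, mults=6)  # the 3/4 example from the text
up = design_parameters(L=5, M=4, mults=6)    # the 5/4 vertical ratio
```

For the 3/4 example this reproduces the 18 taps given earlier; the cutoff follows the larger of L and M in both cases.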
The technique produces high quality results, but can be costly and complex to implement in a low-cost real-time hardware processor because hardware multipliers are expensive and bulky, with both size and cost being dependent on the number of bits used to quantize the filter's coefficients. For example, an 8×8 multiplier is twice as large as an 8×4 multiplier. The prior art requires 8-12 bits of precision for the filter coefficients. The prior art also requires 6 or more multiplies for each of the 3 video components Cb, Cr, Y, i.e. 18 multiplies per output pixel. If both horizontal and vertical scaling are done, then the number of multiplies required is doubled.
For a hardware implementation, the coefficients must be quantized to some number of bits. The number is application dependent, but for a high quality video application, 8 bits is the minimum and 10 bits is better. The number of coefficient bits correlates directly with the cost of the hardware multipliers.
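To make the trade-off concrete, the sketch below quantizes coefficients to a given number of bits and reports the resulting error. The text does not say how the prior art rounds or scales its coefficients, so the scheme shown here (signed fixed point, round to nearest, saturate at the range limits) and the function names are assumptions chosen only for illustration.

```python
def quantize(coeffs, bits):
    """Round each coefficient to a signed fixed-point integer of `bits` bits."""
    scale = 1 << (bits - 1)          # assumed scaling: 1.0 maps to 2**(bits-1)
    lo, hi = -scale, scale - 1       # representable two's complement range
    return [max(lo, min(hi, round(c * scale))) for c in coeffs]


def worst_case_error(coeffs, bits):
    """Largest absolute difference between a coefficient and its quantized value."""
    scale = 1 << (bits - 1)
    quantized = quantize(coeffs, bits)
    return max(abs(c - q / scale) for c, q in zip(coeffs, quantized))


example = [0.5, -0.25, 0.1]          # illustrative values, not from the source
q8 = quantize(example, 8)
q4 = quantize(example, 4)
err8 = worst_case_error(example, 8)
err4 = worst_case_error(example, 4)
```

Fewer bits shrink the multiplier but increase the coefficient error, which is the tension the passage describes.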
In view of the foregoing, it is desirable to have a method that provides for quantizing filter coefficients to a reduced number of bits in video scaling of digital images in order to lower the cost of the process and decrease the bulk of the chip without noticeably degrading the image quality.
The present invention fills these needs by providing an efficient and economical method and apparatus for video scaling. It should be appreciated that the present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a device or a method. Several inventive embodiments of the present invention are described below.
In one embodiment of the present invention, a digital image processor is provided. The digital image processor includes a shift register having a number of serially connected registers. The shift register is receptive to an image data word signal and has a plurality of taps. A coefficient store provides a number of quantized coefficients in which the number of coefficients stored corresponds to an integer multiple of the taps. A number of multipliers are provided, each having a first input coupled to a tap of the shift register and having a second input coupled to the coefficient store to receive a coefficient to provide a number of multiplied outputs. An adder is coupled to the multiplied outputs, wherein the adder generates a filtered and scaled image data output signal.
In another embodiment of the present invention, a method of processing a digital image is provided. The method includes inputting image data into a shift register to form a set of data words. The data words are then multiplied with a quantized coefficient produced by a coefficient generator to produce a series of multiplied outputs, where the number of quantized coefficients corresponds to an integer multiple of a number of taps. The series of multiplied outputs are then added to generate a filtered and scaled image data output.
In yet another embodiment of the present invention, a method for developing FIR coefficients is provided. The method includes developing a number of coefficients for a low-pass filter with desired parameters. The coefficients are then organized into L sets of coefficients, where each set includes a number M of elements corresponding to an integer multiple of a number of taps. The L sets of coefficients are then processed and stored into a coefficient store.
An advantage of the present invention is that it provides for a hardware scheme and coefficient generator that allows for variable scaling of digital images by using reduced bits of precision (e.g. 4 bits of precision) as opposed to the 8-12 bits of precision required by the prior art. Because both the cost and the size of the multiplier are proportional to the number of bits multiplied, the present invention is able to reduce the cost of variable scaling as well as the size of the chip.