1. Field of the Invention
The present invention relates to the processing of video images. More particularly, the present invention relates to an apparatus and method for reducing the number of hardware multipliers and simplifying the addressing logic for video processing operations.
2. Background
Computer and video displays are made up of a series of frames that are sequentially displayed on a monitor. While one frame is being displayed, the next frame is assembled and stored in a memory called the frame buffer. The frequency of changing the displayed frames, i.e. the frame cycle, is controlled by a clock known as a dot clock.
Each frame is made up of a number of pixels and each pixel may be composed of several color space components. To assemble a frame of pixels, the frame buffer must be refilled within each frame cycle. Therefore, when processing digital signals for real-time display, the value of every color space component in each pixel comprising a frame must be obtained within a frame cycle. To this end, one pixel is added to the frame buffer in each cycle of the dot clock. For example, in a 1024 by 768 display there will be 786,432 dot clock cycles per frame cycle.
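The dot clock count in the example above follows directly from the frame dimensions. A minimal sketch of that arithmetic, in which the function name `dot_clocks_per_frame` is a hypothetical helper chosen for illustration and blanking intervals are assumed to be ignored:

```python
def dot_clocks_per_frame(width, height):
    """Return the number of dot clock cycles needed to fill one frame,
    assuming exactly one pixel is added to the frame buffer per cycle."""
    return width * height


# The 1024 by 768 display from the text.
cycles = dot_clocks_per_frame(1024, 768)
print(cycles)  # 786432
```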
Scaling involves transforming an input frame to an output frame by changing the effective display resolution, and therefore changing the number of pixels from the input frame to the output frame. Superior output image quality is obtained with an algorithm that interpolates the values of the pixels in the output frame. This is typically accomplished in a structure called a pixel filter, which combines data from several pixels of the input frame into each pixel of the output frame.
The basic finite impulse response (FIR) algorithm is:
Out(j) = Sum [Coeff(i) * In(i,j)], i = 0, 1, . . . , n-1
where Coeff(i) is the filter tap coefficient corresponding to filter tap i, and n is the number of filter taps. The algorithm is used to calculate each color space component of each pixel in the output frame.
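The FIR sum above can be sketched for a single color space component of a single output pixel as follows. The function name `fir_output`, the coefficient values, and the input samples are all hypothetical and chosen only for illustration; they are not taken from the specification.

```python
def fir_output(coeffs, inputs):
    """Compute Out(j) = Sum over i of Coeff(i) * In(i, j) for one component.

    coeffs -- the n filter tap coefficients, Coeff(0) .. Coeff(n-1)
    inputs -- the n input samples In(0, j) .. In(n-1, j) for output pixel j
    """
    if len(coeffs) != len(inputs):
        raise ValueError("one input sample is needed per filter tap")
    return sum(c * x for c, x in zip(coeffs, inputs))


# Hypothetical three-tap filter applied to three neighboring input samples.
example_coeffs = [0.25, 0.5, 0.25]
example_inputs = [100, 120, 140]
print(fir_output(example_coeffs, example_inputs))  # 120.0
```

The same calculation would be repeated for every color space component of every pixel in the output frame.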
In a typical image processing operation, the values of the pixels in an output frame are calculated in a pixel-oriented approach. For example, if a three-tap filter is used, the pixel filter will require the parallel input of the three pixels from the input frame corresponding to the three filter taps. Then, the filter will use the FIR algorithm to calculate the value of the pixel for the output frame in a single cycle of the filter clock which controls the pixel filter. If there are multiple color space components in each pixel, all of the color space components for each pixel in an output frame will be processed in parallel.
The speed of the pixel filter is controlled by a filter clock. In the pixel-oriented approach, the filter clock runs at the same speed as the dot clock. Therefore, a pixel for the output frame is produced by the pixel filter at each cycle of the filter clock, and a (different) pixel is added to the output frame buffer at each cycle of the dot clock.
The pixel-oriented approach requires sufficient hardware to produce a pixel in each cycle of the filter clock. Generally, this means that x times n multipliers are required, where x equals the number of color space components per pixel and n equals the number of filter taps in the filter. Since multipliers are the most expensive hardware (in terms of gate count) required by pixel filters, it would be desirable to reduce the number of hardware multipliers needed for a given scaling operation.
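As a worked instance of the x times n relationship, the sketch below counts the multipliers needed by the pixel-oriented approach. The function name is hypothetical, and the choice of three color space components is an assumption made only for the example.

```python
def pixel_oriented_multipliers(components_per_pixel, filter_taps):
    """Return x * n, the multiplier count when every product for one
    output pixel is computed in a single filter clock cycle."""
    return components_per_pixel * filter_taps


# Assumed example: three color space components and a three-tap filter.
print(pixel_oriented_multipliers(3, 3))  # 9
```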
The pixel-oriented approach requires that all scaling coefficients required by the FIR algorithm be provided at each cycle of the filter clock. Therefore, it would also be desirable to simplify the addressing logic for the provision of scaling coefficients during a scaling operation.
A method and apparatus for block-oriented pixel filtering reduces the number of hardware multipliers required for an image processing operation by increasing the speed of the pixel filter and rearranging the math operations. A sorter is employed in the line buffers so that defined groups of input pixel components are provided to the multipliers of the pixel filter. An accumulator is employed to receive products from the multipliers and assemble output pixels. The savings in gate count from reducing the number of multipliers are greater than the additional costs, if any, of the sorter and other logic. The method and apparatus of the invention also simplify the addressing logic for the provision of scaling coefficients during an image processing operation.
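The general idea of trading filter speed for multiplier count can be illustrated in software. The sketch below is not the claimed apparatus. It assumes, purely for illustration, that a single multiplier is reused once per filter clock cycle and that an accumulator sums the products over n cycles. The function name, the one-multiplier arrangement, and the sample values are all hypothetical.

```python
def accumulate_with_one_multiplier(coeffs, inputs):
    """Compute one FIR output component using one multiply per cycle.

    Returns (output_value, cycles_used). Each loop iteration stands in
    for one filter clock cycle in which the single (assumed) multiplier
    forms one product and the accumulator adds it to the running sum.
    """
    if len(coeffs) != len(inputs):
        raise ValueError("one input sample is needed per filter tap")
    accumulator = 0
    cycles_used = 0
    for coeff, sample in zip(coeffs, inputs):
        product = coeff * sample   # the one multiplier, reused each cycle
        accumulator += product     # the accumulator assembles the output
        cycles_used += 1
    return accumulator, cycles_used


# Hypothetical three-tap example: same result as a fully parallel filter,
# but spread over three cycles with a single multiplier.
value, cycles = accumulate_with_one_multiplier([0.25, 0.5, 0.25], [100, 120, 140])
print(value, cycles)  # 120.0 3
```

Under these assumptions the output value is unchanged, while the number of multipliers falls in proportion to the number of filter clock cycles spent per output component.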