I. Field
The present disclosure relates generally to circuits, and more specifically to a graphics processor.
II. Background
Graphics processors are widely used to render 2-dimensional (2-D) and 3-dimensional (3-D) images for various applications such as video games, graphics, computer-aided design (CAD), simulation and visualization tools, imaging, etc. A graphics processor may perform various graphics operations to render an image. One such graphics operation is convolution filtering, which is commonly used in image processing, 3-D post processing, 2-D imaging operations, etc. Convolution filtering may be used to obtain effects such as edge sharpening, blurring, noise reduction, etc. Convolution filtering may also be used for scaling, rotation, texture mapping, etc.
For convolution filtering, an H×W grid of picture elements (pixels) is multiplied element-by-element with an H×W grid of convolution coefficients, where H is the height and W is the width of each grid. H·W intermediate results from the element-by-element multiplies are accumulated to obtain a final result for one pixel position. The same convolution computation may be repeated for many (e.g., all) pixel positions in an image. The convolution computation for one pixel position requires H·W multiply and accumulate operations. Hence, convolution filtering of an image with P pixel positions may require approximately P·H·W multiply and accumulate operations, which may be a large number of arithmetic operations.
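The convolution computation described above may be illustrated with a minimal sketch. The function names (convolve_pixel, convolve_image, mac_count) are hypothetical and are not part of the disclosure. The sketch further assumes that the H×W grid is centered on the pixel position being computed and that coordinates falling outside the image are clamped to the nearest edge pixel; the text above specifies neither choice.

```python
def convolve_pixel(image, coeffs, row, col):
    """Return the filtered value for one pixel position (row, col).

    image  : list of rows of pixel values
    coeffs : H x W list of rows of convolution coefficients
    """
    rows, cols = len(image), len(image[0])
    h, w = len(coeffs), len(coeffs[0])
    acc = 0.0
    for i in range(h):
        for j in range(w):
            # Assumption: the grid is centered on (row, col), and
            # out-of-range coordinates are clamped to the image edge.
            r = min(max(row + i - h // 2, 0), rows - 1)
            c = min(max(col + j - w // 2, 0), cols - 1)
            # One multiply and accumulate operation per grid element.
            acc += image[r][c] * coeffs[i][j]
    return acc


def convolve_image(image, coeffs):
    """Repeat the per-pixel computation for every pixel position."""
    return [
        [convolve_pixel(image, coeffs, r, c) for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def mac_count(rows, cols, h, w):
    """Multiply and accumulate operations needed to filter every pixel."""
    return rows * cols * h * w
```

Under these assumptions, filtering every pixel position of a 640×480 image with a 3×3 grid takes mac_count(480, 640, 3, 3) = 2,764,800 multiply and accumulate operations, which illustrates how quickly the operation count grows with image and grid size.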
Some high-end graphics processors utilize dedicated hardware to handle the large number of arithmetic operations for convolution filtering. The dedicated hardware may be cost-prohibitive for many applications. Furthermore, the dedicated hardware is typically designed for a specific grid size and may not efficiently handle convolution filtering with other grid sizes.