The invention relates generally to the field of computer graphics processing and, more particularly, to an improved means for generating a Gaussian blur. The subject matter of the invention is generally related to the following jointly owned and co-pending patent applications: “Improved Blur Computation Algorithm” by Mark Zimmer, Ser. No. 10/826,596; and “System for Optimizing Graphics Operations” by John Harper, Ralph Brunner, Peter Graffagnino, and Mark Zimmer, Ser. No. 10/825,694, each incorporated herein by reference in its entirety.
In the object-oriented programming context of most modern graphics processing systems, there are generally five types of objects available to a programmer: images; filters; contexts; vectors; and textures. An image is generally either the two-dimensional result of a rendering operation (a bitmap or raster image) or a vector representation of the same. A filter is generally a collection of one or more high-level functions that are used to affect images. A context is a space, such as a defined place in memory, where the result of a filtering operation is stored. A vector is a collection of floating-point numbers, for example, the four-dimensional vector used to describe the appearance of a pixel (red, blue, green and transparency levels). A texture is a representation or description of an object's surface and may describe properties such as, for example, the surface's smoothness, coarseness, regularity, color, brightness and transparency. Each of these definitions is exemplary in nature, and the foregoing definitions should not be considered exclusive or otherwise overly restrictive.
Most relevant to the purposes of the present invention are images and filters. A relatively common filter applied to images is a blur. Blur filtering is used to generate shadows, depict cinematic motion, defocus an image, sharpen an image, render clean line art, detect edges and produce many other professional photographic effects. Well-known blur filters include, but are not limited to: the Gaussian blur (simulates shooting a subject with an out-of-focus lens); the box blur (changes the color value of each pixel based on the pixels next to it in the vertical and horizontal directions to quickly create a blur effect); the channel blur (produces a blur in one or more individual image channels, i.e., the red, green, blue and transparency channels); the dolly blur (creates blurs that increase radially outwards from a defined center point); the roll blur (simulates the blur created when a camera or object is spun on its own axis); and the motion blur (simulates the blur created by fast-moving objects).
In practice, the Gaussian blur provides the most realistic (highest quality) and visually pleasing blur effect. For these and other reasons, Gaussian blurs are among the most popular image processing operations used. Unfortunately, conventional implementations of Gaussian blurs are computationally intensive, requiring approximately 2w multiply-adds per pixel, where “w” represents the radius of the blur. To avoid this computational cost, it is common to use repeated box or IIR (infinite impulse response) blurs, both of which are computationally less expensive.
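By way of illustration only, the cost difference noted above may be sketched as follows. The sketch is not taken from the referenced applications; the function names (`gaussian_kernel`, `convolve_row`, `box_blur_row`), the choice of a standard deviation of one third of the radius, and the clamp-to-edge handling of image borders are all assumptions made for the example. The direct convolution performs one multiply-add per kernel tap, so its per-pixel cost grows with the blur radius, whereas the running-sum box blur performs one addition and one subtraction per pixel regardless of radius.

```python
import math

def gaussian_kernel(radius, sigma=None):
    """Return a normalized 1-D Gaussian kernel with 2*radius + 1 taps."""
    if sigma is None:
        # Assumption for this sketch: the radius spans about three sigma.
        sigma = max(radius / 3.0, 1e-6)
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_row(row, kernel):
    """Direct convolution of one row; one multiply-add per kernel tap."""
    radius = len(kernel) // 2
    n = len(row)
    out = []
    for x in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            src = min(max(x + k - radius, 0), n - 1)  # clamp to the row's edges
            acc += weight * row[src]
        out.append(acc)
    return out

def box_blur_row(row, radius):
    """Running-sum box blur of one row; constant work per pixel."""
    n = len(row)
    width = 2 * radius + 1
    # Sum of the window centered on pixel 0, with clamped edges.
    window = sum(row[min(max(i, 0), n - 1)] for i in range(-radius, radius + 1))
    out = []
    for x in range(n):
        out.append(window / width)
        leaving = row[min(max(x - radius, 0), n - 1)]
        entering = row[min(max(x + radius + 1, 0), n - 1)]
        window += entering - leaving  # slide the window one pixel to the right
    return out
```

A two-dimensional blur may be obtained from either routine by applying it first to every row and then to every column of the image.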
Many modern computer systems include dedicated graphics hardware in the form of programmable graphics processing units (“GPUs”). One type of GPU program, referred to as a fragment program, allows programmers to directly compute an image by specifying the program that computes a single pixel of that image. The GPU runs this program in parallel, operating on many pixels at once, to produce the result image. Because multiple pixels are processed at a single time by dedicated hardware, GPUs can provide dramatically improved image processing capability (e.g., speed) over methods that rely on a computer system's central processing unit (“CPU”), which is also responsible for performing other system and application duties.
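The fragment program model described above may be illustrated, purely as a sketch, by emulating it on a CPU. The code below is not GPU code and does not correspond to any particular shading language; `run_fragment_program`, `horizontal_average` and the clamp-to-edge `sample` lookup are hypothetical names and behaviors chosen for the example. The point illustrated is that the programmer supplies only the computation for a single pixel, and that each pixel's result is independent of every other pixel's result.

```python
def run_fragment_program(fragment, source):
    """Apply `fragment` independently to every pixel of `source` (a list of rows)."""
    height, width = len(source), len(source[0])

    def sample(x, y):
        # Clamp-to-edge lookup, an assumed addressing mode for this sketch.
        return source[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    # A GPU would evaluate these pixels in parallel; here they run one at a time.
    return [[fragment(x, y, sample) for x in range(width)]
            for y in range(height)]

def horizontal_average(x, y, sample):
    """Hypothetical fragment: mean of a pixel and its left and right neighbors."""
    return (sample(x - 1, y) + sample(x, y) + sample(x + 1, y)) / 3.0

image = [[0.0, 3.0, 6.0]]
blurred = run_fragment_program(horizontal_average, image)
```

Because each invocation of the fragment reads only from the source image and writes only its own output pixel, no invocation can reuse a partial sum computed by its neighbor.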
Because Gaussian blurs form the cornerstone of many image processing algorithms, it has become important to compute them efficiently. As noted above, one means of generating a Gaussian-like blur is to cascade a series of box blur operations. Unfortunately, cascading box blurs cannot be efficiently implemented by GPUs because such operations require the ability to sum values across a number of rows and/or columns. Current GPU architectures do not inherently support such summing operations, and cascaded box blurs are therefore inefficient to implement on them. Thus, it would be beneficial to provide a mechanism to efficiently approximate Gaussian blurs using GPU hardware.
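The statement that cascaded box blurs yield a Gaussian-like blur may be checked numerically with the following sketch, which is offered as an illustration under stated assumptions and not as the mechanism of the invention. The function names are hypothetical. The comparison Gaussian is assumed to have the same variance as the cascade, using the fact that a box of width n has variance (n² − 1)/12 and that variances add under convolution.

```python
import math

def convolve_full(a, b):
    """Full discrete convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def cascaded_box_kernel(radius, passes):
    """Effective kernel of `passes` successive box blurs of the given radius."""
    width = 2 * radius + 1
    box = [1.0 / width] * width
    kernel = [1.0]
    for _ in range(passes):
        kernel = convolve_full(kernel, box)
    return kernel

def matching_gaussian(radius, passes):
    """Normalized Gaussian with the same variance and support as the cascade."""
    width = 2 * radius + 1
    variance = passes * (width * width - 1) / 12.0
    half = radius * passes  # half-width of the cascaded kernel's support
    weights = [math.exp(-(i * i) / (2.0 * variance))
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def max_abs_difference(radius, passes):
    """Largest pointwise gap between the cascade and the matching Gaussian."""
    cascade = cascaded_box_kernel(radius, passes)
    gaussian = matching_gaussian(radius, passes)
    return max(abs(c - g) for c, g in zip(cascade, gaussian))
```

For a radius of 4, three passes produce a kernel that lies much closer to the matching Gaussian than a single pass does, which is consistent with the use of repeated box blurs as a Gaussian substitute.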