Users often apply a series of filters to an image or video to enhance its appearance. These filters can include effects such as vignettes, fisheye or other distortions, sepia or other color changes, blurs, and overlays. The filters can be applied independently, one after the other and in various orders, with each filter taking an image as input and generating an output image. Combining multiple filters can produce interesting and desirable results, such as a sharpening filter followed by a color enhancement, followed by a vignette. Each filter can be implemented as its own graphics processing unit (GPU) operation, which on some systems is referred to as a GPU kernel. The GPU operations or kernels are implemented as fragment shaders that run on every pixel of the destination image. The geometry that is rendered is a single quad (two triangles) the size of the entire source image, and the fragment shader is executed for each pixel that the quad covers.
The GPU kernels generally identify the matching pixel from the source image through texture mapping, modify it, and determine the destination color. They can apply coordinate transformations by modifying the coordinates that are used to read from the texture. They can apply color transformations by modifying the color after reading it. They can also perform more complex operations, such as reading multiple source pixels to apply a blur.
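The three kinds of kernel described above can be illustrated with a minimal CPU-side sketch in Python. It is only a model of what a fragment shader does, not GPU code: the image representation (rows of RGB tuples), the helper names `sample` and `run_kernel`, the horizontal flip, the integer luma weights, and the 3x3 box blur are all assumptions chosen for illustration.

```python
# Images are modeled as lists of rows of (r, g, b) tuples, components in 0..255.
# All names here are hypothetical; this imitates per-pixel shading on the CPU.

def sample(src, x, y):
    """Read a source pixel, clamping coordinates to the image edge."""
    h, w = len(src), len(src[0])
    return src[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

def run_kernel(kernel, src):
    """Invoke the kernel once per destination pixel, as a fragment shader would."""
    h, w = len(src), len(src[0])
    return [[kernel(src, x, y) for x in range(w)] for y in range(h)]

def flip_kernel(src, x, y):
    # Coordinate transformation: change the coordinates that are read.
    return sample(src, len(src[0]) - 1 - x, y)

def grayscale_kernel(src, x, y):
    # Color transformation: read the matching pixel, then change its color.
    r, g, b = sample(src, x, y)
    luma = (299 * r + 587 * g + 114 * b) // 1000
    return (luma, luma, luma)

def box_blur_kernel(src, x, y):
    # Multiple reads: average the 3x3 neighborhood around the pixel.
    total = [0, 0, 0]
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            p = sample(src, x + dx, y + dy)
            for c in range(3):
                total[c] += p[c]
    return tuple(t // 9 for t in total)
```

For example, `run_kernel(grayscale_kernel, image)` produces a new image of the same size without modifying `image`, which mirrors a single render pass from a source texture to a destination.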
Furthermore, in order to combine these GPU kernels into one merged filter or a pipeline of filters, multiple passes are used to ping-pong between two textures. In the ping-pong technique, each filter is compiled into its own program that can execute on the GPU. An intermediate image is created that is used only for processing the multiple filters. A first filter is applied to the input image (source image) and its result is written to the intermediate image. The second filter is then applied to the intermediate image and its result is written to the output image. This process is repeated, with the output of each pass serving as the input of the next, until all filters have been applied. There are two major drawbacks to this technique. The first is that it requires extra memory to store the intermediate texture, and the second is that performing multiple passes adds significant overhead, so performance can suffer.
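The ping-pong technique can be sketched on the CPU as follows. This is a hedged illustration rather than an actual GPU implementation: the function names, the single-channel integer pixels, the two example kernels, and the choice to allocate two scratch buffers up front are all assumptions made for the example.

```python
# Hypothetical CPU model of ping-pong rendering; pixels are single ints (0..255).

def run_pass(kernel, src, dst):
    """One full-size render pass: run the kernel for every destination pixel."""
    for y in range(len(dst)):
        for x in range(len(dst[0])):
            dst[y][x] = kernel(src, x, y)

def apply_chain_ping_pong(kernels, image):
    """Apply kernels in order, one pass each, alternating between two buffers."""
    h, w = len(image), len(image[0])
    # Extra full-size allocations: the memory cost described in the text.
    buffers = [[[0] * w for _ in range(h)] for _ in range(2)]
    src, passes = image, 0
    for i, kernel in enumerate(kernels):
        dst = buffers[i % 2]   # never the buffer currently being read
        run_pass(kernel, src, dst)
        src = dst              # this pass's output is the next pass's input
        passes += 1
    return src, passes

# Two example kernels (assumed for illustration).
def invert(src, x, y):
    return 255 - src[y][x]

def flip(src, x, y):
    return src[y][len(src[0]) - 1 - x]
```

Calling `apply_chain_ping_pong([invert, flip], image)` performs two full passes over the image; every additional filter adds another full pass, which is the overhead the paragraph above refers to.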
The final output image is usually produced by putting together, or compositing, the intermediate images generated by rendering more than one “layer” or “pass.” Consider producing an image with a black-and-white blur and a fisheye distortion. The effect chain described above produces the correct result using multiple passes, but it requires an extra full-size buffer and three full-size render passes. This uses GPU memory unnecessarily, strains the fill and texture rates of the GPU, and requires clearing the GPU pipeline between passes so that the output of one pass can be switched to the input of the next. This is a particular problem for mobile devices, which have limited computational power. The intended effect could be written as a single GPU kernel that runs in a single pass, but such a kernel would be very special purpose. Implementing every potential combination of algorithms as its own GPU kernel is not practical because of the large number of potential combinations and the resulting maintenance burden.
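The trade-off between the multi-pass chain and a hand-written special-purpose kernel can be sketched on the CPU as follows. This is an illustrative model under stated assumptions, not GPU code: a horizontal flip stands in for the fisheye distortion, the blur is a 3-tap horizontal average, the luma weights are integer approximations, and every function name is hypothetical.

```python
# Hypothetical CPU model; images are lists of rows of (r, g, b) tuples.

def clamp(v, lo, hi):
    return max(lo, min(v, hi))

def render(kernel, src):
    """One full-size pass: run the kernel for every destination pixel."""
    return [[kernel(src, x, y) for x in range(len(src[0]))]
            for y in range(len(src))]

def distort(src, x, y):
    # Stand-in coordinate transform (a horizontal flip, not a real fisheye).
    return src[y][len(src[0]) - 1 - x]

def blur(src, x, y):
    # 3-tap horizontal average with edge clamping.
    w = len(src[0])
    taps = [src[y][clamp(x + dx, 0, w - 1)] for dx in (-1, 0, 1)]
    return tuple(sum(t[c] for t in taps) // 3 for c in range(3))

def gray(src, x, y):
    r, g, b = src[y][x]
    luma = (299 * r + 587 * g + 114 * b) // 1000
    return (luma, luma, luma)

def three_pass(src):
    """Multi-pass chain: three full passes and two intermediate images."""
    return render(gray, render(blur, render(distort, src)))

def fused(src, x, y):
    """Hand-merged kernel: distort, blur, and gray in a single pass."""
    w = len(src[0])
    # Each blur tap reads the source directly through the distortion.
    taps = [src[y][w - 1 - clamp(x + dx, 0, w - 1)] for dx in (-1, 0, 1)]
    r, g, b = (sum(t[c] for t in taps) // 3 for c in range(3))
    luma = (299 * r + 587 * g + 114 * b) // 1000
    return (luma, luma, luma)
```

In this sketch `render(fused, image)` gives the same pixels as `three_pass(image)` with no intermediate image, but `fused` is tied to this exact combination and order of effects; any other chain would need its own hand-written kernel.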