Generating accurate depictions of complex scenes in interesting lighting environments is one of the primary goals in computer graphics. The general solution to this problem requires solving an integral equation, which is difficult even in non-interactive settings. In interactive graphics, shortcuts are generally taken by making simplifying assumptions about several properties of the scene: the materials are generally assumed to be simple; the lighting environment is approximated either with a small number of point and directional lights or with environment maps; and transport complexity (i.e., how the light bounces around the scene, such as inter-reflections, caustics and shadows) is modeled only in a limited way. For example, shadows may be computed for dynamic point lights, but not for environment maps.
Pre-Computed Radiance Transfer
Sloan et al., “Graphics Image Rendering With Radiance Self-Transfer For Low-Frequency Lighting Environments”, U.S. patent application Ser. No. 10/389,553, filed Mar. 14, 2003, and published as Publication No. US-2003-0179197-A1 (the disclosure of which is hereby incorporated by reference) [hereafter “Sloan '553 application”], describes a technique called “precomputed radiance transfer” (PRT) that enables rigid objects to be illuminated in low frequency lighting environments with global effects like soft shadows and inter-reflections in real time. It achieves these results by running a lengthy pre-process that computes how light is transferred from a source environment to exit radiance at a point. Previous methods for running this pre-process were designed to execute on the central processing unit (CPU) of a computer, and utilized the processing and memory resources of the CPU in a traditional manner.
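As a rough illustration of the run-time side of PRT for a diffuse surface, the following sketch assumes that the pre-process has already produced one transfer vector per vertex, and that the lighting environment is expressed as coefficients in the same basis. Under those assumptions, exit radiance at a vertex reduces to a dot product. The function name, the four-coefficient vectors and the sample values are hypothetical, chosen only for illustration, and are not taken from the Sloan '553 application.

```python
def shade_vertices(transfer_vectors, light_coeffs):
    """Return one exit-radiance value per vertex.

    Assumption: each transfer vector and the lighting coefficients are
    expressed in the same basis and have the same length.
    """
    radiance = []
    for t in transfer_vectors:
        if len(t) != len(light_coeffs):
            raise ValueError("transfer vector and lighting must match in length")
        # Diffuse case: exit radiance is the dot product of the precomputed
        # transfer vector with the lighting coefficients.
        radiance.append(sum(ti * li for ti, li in zip(t, light_coeffs)))
    return radiance


# Hypothetical precomputed data for two vertices (illustrative values only).
transfer_vectors = [
    [0.5, 0.0, 0.25, 0.0],
    [0.25, 0.5, 0.0, 0.0],
]
# Hypothetical lighting environment projected into the same basis.
light_coeffs = [2.0, 0.0, 4.0, 0.0]

exit_radiance = shade_vertices(transfer_vectors, light_coeffs)
```

Because the expensive work is confined to the pre-process, the per-vertex cost at run time in this sketch is a single short dot product, which is what makes real-time relighting plausible.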
Graphics Processing Unit
Computers commonly have a graphics adapter or graphics accelerator that contains a specialized microprocessor, generally known as a graphics co-processor or graphics processing unit (GPU). A GPU also can be integrated into the chip set contained on the motherboard of the computer. The GPU handles high-speed graphics-related processing to free the computer's central processing unit (CPU) for other tasks. Today's graphics adapters (e.g., various graphics adapter models available from NVIDIA and ATI Technologies, among others) feature GPUs that are specifically designed to render 3-dimensional (3D) graphics images and video at high frame rates, such as for use in computer games, video and other graphics intensive applications. Some CPUs for computers also include specialized instructions in their instruction sets that are designed for graphics-related processing (e.g., the MMX instructions in Intel Corporation microprocessors).
In past graphics adapters, the GPU generally provided fixed functionality for graphics processing operations. Application programs (e.g., a game with 3D graphics) would interact with the graphics adapter through a graphics application programming interface (API), such as DirectX® of Microsoft Corporation or OpenGL® of Silicon Graphics, Inc. Through a graphics API, the application programs directed the GPU to execute its fixed graphics processing functions.
More recently, version 8 of Microsoft DirectX® introduced the concept of a programmable graphics shader for programmable graphics hardware. A shader is a program that executes on graphics hardware to perform graphics processing on a per pixel or per vertex (or other graphics component or fragment) basis for 3D graphics rendering. DirectX® 8 included a language for writing shaders. DirectX® version 9 further introduced a high level programming language for shaders (called the High Level Shading Language or HLSL) to make it easier for application developers to create shaders. The HLSL is syntactically similar to the well known C programming language, which makes it easier for programmers who are familiar with C to understand.
The architecture or design of current GPUs is optimized to execute particular graphics operations, but has limitations that prevent practical and efficient execution of others. The PRT preprocessing technique as previously implemented on the CPU includes operations that do not execute efficiently on the GPU. Accordingly, a direct mapping of the previous PRT preprocessing technique from the CPU onto GPU hardware would not be practical or efficient. For example, current GPUs do not practically or efficiently execute operations such as reduction operations, ray tracing and rasterization, and they provide slow read-back paths, among other limitations. In particular, the computations in the previous CPU-based PRT preprocessing technique that involve iterating over vertices and shooting rays do not map well to current GPU hardware.
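To make the shape of such a CPU-style computation concrete, the following minimal sketch iterates over vertices, shoots rays over the hemisphere above each vertex, and reduces the unoccluded, cosine-weighted samples to a single transfer value per vertex. It is only an illustration under stated assumptions and is not the technique of the Sloan '553 application: the single sphere occluder, the fixed +z surface normal, the sample grid and all function names are hypothetical, and only the constant (uniform lighting) term is computed.

```python
import math


def ray_hits_sphere(origin, direction, center, radius):
    """True if the ray hits the sphere (unit direction, origin outside sphere)."""
    oc = [c - o for o, c in zip(origin, center)]
    t_closest = sum(a * b for a, b in zip(oc, direction))
    if t_closest <= 0.0:
        return False  # sphere lies behind the ray origin
    dist_sq = sum(a * a for a in oc) - t_closest * t_closest
    return dist_sq <= radius * radius


def hemisphere_directions(n_z, n_phi):
    """Unit directions over the +z hemisphere, each covering equal solid angle."""
    dirs = []
    for i in range(n_z):
        z = (i + 0.5) / n_z              # uniform in z gives uniform solid angle
        r = math.sqrt(1.0 - z * z)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs


def precompute_transfer(vertices, center, radius, n_z=32, n_phi=64):
    """Shadowed transfer under uniform unit lighting, one value per vertex."""
    dirs = hemisphere_directions(n_z, n_phi)
    weight = 2.0 * math.pi / len(dirs)   # solid angle represented by one sample
    transfer = []
    for v in vertices:                   # outer loop: iterate over vertices
        total = 0.0
        for d in dirs:                   # inner loop: shoot one ray per sample
            if not ray_hits_sphere(v, d, center, radius):
                total += d[2] * weight   # cosine term; normal assumed to be +z
        transfer.append(total / math.pi) # reduction of all samples to one value
    return transfer


# Hypothetical scene: one vertex directly under the occluder, one far away.
vertices = [(0.0, 0.0, 0.0), (50.0, 0.0, 0.0)]
transfer = precompute_transfer(vertices, center=(0.0, 0.0, 2.0), radius=1.0)
```

In this sketch the vertex beneath the sphere receives a value close to the analytic 0.75 (the sphere blocks a 30 degree cone around the normal), while the distant vertex stays near 1. The nested loops, the per-ray intersection test and the final per-vertex reduction correspond to the kinds of operations identified above as mapping poorly to current GPU hardware.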