1. Field of the Invention
The present invention generally relates to graphics processing and, more specifically, to a system and method for managing divergent threads in a single-instruction, multiple-data (“SIMD”) architecture.
2. Description of the Related Art
Current graphics data processing includes systems and methods developed to perform specific operations on graphics data, such as linear interpolation, tessellation, rasterization, texture mapping, and depth testing. Traditionally, graphics processors used fixed-function computational units to process graphics data; more recently, however, portions of graphics processors have been made programmable, enabling such processors to support a wider variety of operations for processing vertex and fragment data.
To further increase performance, graphics processors typically implement processing techniques such as pipelining that attempt to process in parallel as much graphics data as possible throughout the different parts of the graphics pipeline. Graphics processors with SIMD architectures are designed to maximize the amount of parallel processing in the graphics pipeline. In a SIMD architecture, the various threads attempt to execute program instructions synchronously as often as possible to increase processing efficiency.
A problem typically arises, however, when the program includes branches and some threads need to execute a branch while others do not. Threads that do not execute a branch are disabled for that branch. In some prior art systems, the instructions associated with each side of a conditional branch are executed even when all threads take the same side of the branch. Given that these systems may execute upwards of 800 threads, such a design is quite inefficient, since hundreds of threads may be needlessly dragged through a branch.
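The inefficiency described above can be illustrated with a minimal software sketch. It is only a simulation under stated assumptions, not the design of any particular prior art system: the function name `run_branch_both_sides`, its parameters, and the `wasted_passes` counter are hypothetical and introduced solely for illustration. The sketch issues both sides of a conditional branch unconditionally, disables the threads that did not take each side, and counts how many sides were issued with no enabled thread at all.

```python
def run_branch_both_sides(values, cond, then_op, else_op):
    """Simulate a SIMD branch in which both sides are always issued.

    Hypothetical illustration only: each entry of `values` stands for
    one thread's data, and a per-thread flag disables the thread on
    the side of the branch it did not take.
    """
    taken = [cond(v) for v in values]      # per-thread branch outcome
    results = list(values)
    wasted_passes = 0                      # sides issued with no enabled thread

    for side_op, side_is_then in ((then_op, True), (else_op, False)):
        # A thread is enabled on this side only if it took this side.
        active = [t == side_is_then for t in taken]
        if not any(active):
            # The side is still issued even though no thread needs it.
            wasted_passes += 1
        for i, is_active in enumerate(active):
            if is_active:
                results[i] = side_op(values[i])

    return results, wasted_passes


# 800 threads that all take the same side: the other side is issued for nothing.
threads = list(range(800))
_, wasted = run_branch_both_sides(
    threads, lambda v: True, lambda v: v + 1, lambda v: v - 1
)
```

In this sketch, a fully uniform branch still pays for one side that no thread uses, whereas a truly divergent branch has at least one enabled thread on each side.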
Accordingly, what is needed in the art is a more efficient branching algorithm for systems with SIMD architectures.