1. Field of the Invention
Embodiments of the present invention relate generally to single-instruction, multiple-data (SIMD) processing and, more particularly, to a system and method for processing thread groups in a SIMD processor.
2. Description of the Related Art
A SIMD processor associates a single instruction with multiple data paths to allow the hardware to efficiently execute data-parallel algorithms. The usual benefits of a SIMD processor implementation result from the reduction in pipeline control hardware and instruction processing that comes from running multiple data paths in lockstep.
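By way of illustration only, the lockstep behavior described above can be modeled in software as follows. This is a minimal sketch and not a description of any particular hardware; the function name `simd_execute` and its parameters are hypothetical and are introduced solely for this example.

```python
from typing import Callable, List


def simd_execute(instruction: Callable[[int, int], int],
                 lanes_a: List[int],
                 lanes_b: List[int]) -> List[int]:
    """Apply one instruction to every data path (lane) in lockstep.

    Hypothetical software model: a single instruction is issued once
    and every lane executes it on its own pair of operands.
    """
    if len(lanes_a) != len(lanes_b):
        raise ValueError("operand vectors must have one element per lane")
    # The same instruction drives all lanes; only the data differs.
    return [instruction(a, b) for a, b in zip(lanes_a, lanes_b)]


# One "add" instruction issued across four data paths.
result = simd_execute(lambda a, b: a + b, [1, 2, 3, 4], [10, 20, 30, 40])
print(result)
```

In this model the instruction is selected once for all four lanes, which mirrors the reduction in control and instruction processing that is shared across the data paths.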
In general, increasing the number of data paths in a SIMD processor allows more data to be processed in parallel and leads to performance improvements. Processor size constraints, however, prevent the number of data paths from being increased beyond a certain number. Also, if the number of data paths is too large, hardware resources may be under-utilized.
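One way such under-utilization might arise can be sketched numerically. The example below assumes, purely for illustration, that a fixed number of independent work items is spread across the data paths and that any data path without a work item sits idle for that pass; the text above does not specify the cause of under-utilization, and the function name `lane_utilization` is hypothetical.

```python
def lane_utilization(work_items: int, num_lanes: int) -> float:
    """Fraction of lane slots doing useful work (illustrative model).

    Assumption: work is issued in passes of `num_lanes` items each,
    and the final pass may leave some lanes idle.
    """
    if work_items <= 0 or num_lanes <= 0:
        raise ValueError("work_items and num_lanes must be positive")
    # Integer ceiling division: number of passes needed to cover all items.
    passes = -(-work_items // num_lanes)
    return work_items / (passes * num_lanes)


# Same workload of 20 items on progressively wider hypothetical processors.
for lanes in (4, 16, 64):
    print(lanes, lane_utilization(20, lanes))
```

Under these assumptions, utilization for the same workload drops as the number of data paths grows past the amount of available parallel work.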