1. Field of the Invention
The present invention relates generally to thread scheduling in streaming multiprocessors and, more specifically, to scheduling instructions for threads in groups so that the threads are processed uniformly.
2. Description of the Related Art
Conventional multithreaded processors increase the size of a cache as needed to reduce the number of cache misses and thereby achieve a desired performance level. Other techniques may also be used to reduce the number of cache misses. However, when different threads execute the same program, some threads may advance ahead of other threads, which reduces the locality of cache accesses and increases the number of cache misses.
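By way of illustration only, the loss of locality described above can be modeled with a toy simulation. The sketch below does not describe any particular processor: the function name count_misses, the single shared cache with least-recently-used replacement, the round-robin issue order, and all parameter values are assumptions chosen for the example.

```python
from collections import OrderedDict


def count_misses(num_threads, program_length, cache_lines, lead):
    """Count misses in one shared LRU cache (hypothetical toy model).

    Every thread reads cache lines 0 .. program_length-1 in order.
    Thread i starts i * lead steps after thread 0, so lead=0 means
    all threads advance in lockstep.
    """
    cache = OrderedDict()  # keys are cached line addresses, oldest first
    misses = 0
    total_steps = program_length + (num_threads - 1) * lead
    for step in range(total_steps):
        for thread in range(num_threads):
            line = step - thread * lead
            if not 0 <= line < program_length:
                continue  # thread has not started yet or has finished
            if line in cache:
                cache.move_to_end(line)  # hit: mark as most recently used
            else:
                misses += 1
                cache[line] = True
                if len(cache) > cache_lines:
                    cache.popitem(last=False)  # evict least recently used
    return misses


# Four threads, 64-line program, 4-line cache, threads in lockstep.
print(count_misses(4, 64, 4, 0))   # 64
# Same cache, but each thread trails the previous one by 8 steps.
print(count_misses(4, 64, 4, 8))   # 256
# Same skew, with the cache enlarged to hold the whole program.
print(count_misses(4, 64, 64, 8))  # 64
```

In this model, threads that advance together share each fetched line, while skewed threads refetch lines that have already been evicted. Enlarging the cache recovers the hits, which corresponds to the conventional approach of increasing cache size.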
Accordingly, what is needed in the art is a system and method for improving cache locality and system performance for streaming multiprocessors.