As those skilled in the pertinent art are aware, applications may be executed in parallel to increase their performance. Data parallel applications carry out the same process concurrently on different data. Task parallel applications carry out different processes concurrently on the same data. Static parallel applications are applications having a degree of parallelism that can be determined before they execute. In contrast, the parallelism achievable by dynamic parallel applications can only be determined as they are executing. Whether the application is data or task parallel, or static or dynamic parallel, it may be executed in a pipeline, which is often the case for graphics applications.
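The distinction between data parallelism and task parallelism can be illustrated with a minimal sketch. The function names `square`, `data_parallel`, and `task_parallel` are hypothetical and chosen only for illustration, and a host-side thread pool stands in for whatever parallel hardware is actually used.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """The single process applied in the data parallel case."""
    return x * x


def data_parallel(values):
    # Same process (square) carried out concurrently on different data items.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(square, values))


def task_parallel(values):
    # Different processes (sum, min, max) carried out concurrently on the same data.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, values) for fn in (sum, min, max)]
        return [f.result() for f in futures]
```

In this sketch both functions are static parallel, since the number of concurrent work items is known from the input before execution begins.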
A single-instruction, multiple-thread (SIMT) processor is particularly adept at executing data parallel applications. A pipeline control unit in the SIMT processor creates groups of threads of execution and schedules them for execution, during which all threads in the group execute the same instruction concurrently. In one particular processor, each group has 32 threads, corresponding to 32 execution pipelines, or lanes, in the SIMT processor.
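The grouping of threads into lanes can be modeled in software as follows. This is a simplified simulation under stated assumptions, not a description of any actual pipeline control unit: `make_groups` and `execute_group` are hypothetical helpers, and the lanes are stepped in a loop rather than truly concurrently.

```python
LANES = 32  # group size of the particular processor described above


def make_groups(num_threads, lanes=LANES):
    """Partition thread ids into groups of at most `lanes` threads."""
    return [
        list(range(start, min(start + lanes, num_threads)))
        for start in range(0, num_threads, lanes)
    ]


def execute_group(group, instruction, data):
    """Apply one instruction to every thread in the group.

    Each thread operates on its own data element, modeling all threads
    in the group executing the same instruction.
    """
    return [instruction(data[tid]) for tid in group]
```

For example, 70 threads would be partitioned into two full groups of 32 and one partial group of 6, with the remaining lanes of the last group left idle.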
Parallel applications typically contain regions of sequential code and parallel code. Sequential code cannot be executed in parallel and so is executed in a single thread. When parallel code is encountered, the pipeline control unit splits execution, creating groups of worker threads for parallel execution of the parallel code. When sequential code is again encountered, the pipeline control unit joins the results of the parallel execution, creates another single thread for the sequential code, and execution proceeds.
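The split and join behavior described above may be sketched as a conventional fork-join pattern. The function `run` and its parameters are hypothetical, and a thread pool is assumed here in place of the pipeline control unit.

```python
from concurrent.futures import ThreadPoolExecutor


def run(values, workers=4):
    # Sequential region: executed in a single thread.
    scale = 2

    # Parallel region: execution splits into worker threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda v: v * scale, values))

    # Join: the workers' results are gathered and a single thread
    # resumes the sequential code.
    return sum(partial)
```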
It is important to synchronize the threads in a group. Synchronizing in part involves conforming the states of local memories associated with each lane. It has been found that synchronizing can be made faster if, while executing sequential code, a counterpart thread of the sequential code is executed in each of the lanes. The local memory states are thus assumed to be already conformed if execution is later split.
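The effect of executing counterpart threads can be illustrated with a small simulation. It is a sketch only: the `Lane` class, the `run_sequential` and `split` helpers, and the use of a dictionary as lane-local memory are all assumptions made for illustration and are not taken from any particular processor.

```python
class Lane:
    """One execution lane with its own local memory (hypothetical model)."""

    def __init__(self):
        self.local = {}


def run_sequential(lanes, step):
    # A counterpart thread executes the same sequential step in every lane,
    # so every lane-local memory receives the same updates.
    for lane in lanes:
        step(lane.local)


def split(lanes, kernel):
    # The local memory states already agree, so no copy from a single
    # master thread is needed before the parallel code starts.
    return [kernel(i, lane.local) for i, lane in enumerate(lanes)]
```

Had the sequential step run in only one lane, the other lanes' local memories would have to be brought into agreement at the split, which is the synchronization cost the counterpart threads are intended to avoid.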