A thread (i.e., an abstract construct representing an instance of a program executing on a processor) has a basic guarantee of forward progress. In other words, if one thread becomes blocked (e.g., due to resources being unavailable or the inability to acquire a semaphore), then the other threads continue to make forward progress while that thread remains blocked, unless those other threads are also dependent on unavailable resources. The guarantee of forward progress is necessary to support patterns that are extremely common in procedural parallel programming, such as locks. On a single processor such as a conventional CPU, threads are typically guaranteed forward progress by allocating each thread a number of cycles of the processing unit in a serialized or round-robin fashion.
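By way of illustration only, the round-robin guarantee described above may be sketched as a small C++ toy model. The names ToyThread and run_round_robin, the single shared lock, and the cost of one time slice per operation are assumptions made for this sketch and do not describe any particular scheduler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy thread: acquire one shared lock, perform `remaining`
// steps of work inside the critical section, then release the lock.
struct ToyThread {
    int remaining = 1;        // work steps left inside the critical section
    bool holds_lock = false;
    bool done = false;
    int blocked_slices = 0;   // time slices spent waiting for the lock
};

// Gives each unfinished thread one time slice per round, in order.
// Returns the total number of slices used, or -1 if max_slices is exceeded.
int run_round_robin(std::vector<ToyThread>& threads, int max_slices) {
    for (const ToyThread& t : threads) assert(t.remaining > 0);
    bool lock_taken = false;
    int slices = 0;
    std::size_t finished = 0;
    while (finished < threads.size()) {
        for (ToyThread& t : threads) {
            if (t.done) continue;
            if (slices >= max_slices) return -1;
            ++slices;
            if (!t.holds_lock) {
                if (lock_taken) {
                    // Blocked: this slice is wasted, but the scheduler moves
                    // on, so the lock holder still receives its next slice.
                    ++t.blocked_slices;
                } else {
                    lock_taken = true;   // acquiring the lock costs one slice
                    t.holds_lock = true;
                }
                continue;
            }
            if (--t.remaining == 0) {
                lock_taken = false;      // release the lock and finish
                t.holds_lock = false;
                t.done = true;
                ++finished;
            }
        }
    }
    return slices;
}
```

In this sketch, a blocked thread only wastes its own time slice. Because the lock holder is always given a later slice, it eventually releases the lock, and every thread completes.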
Unfortunately, threads executing in a parallel processing architecture, such as the architectures common to today's graphics processing units (GPUs), may be executed concurrently and are typically not independent of other concurrently executing threads. When a particular thread becomes blocked, a number of other concurrently executing threads may also become blocked as a result of the divergence mechanisms currently implemented in parallel processing architectures. Consequently, parallel threads that implement locks or critical sections of code may deadlock unpredictably, and forward progress of the threads is not ensured. Thus, there is a need for addressing this issue and/or other issues associated with the prior art.
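The deadlock described above may be illustrated with a simplified C++ toy model of a group of threads executing in lockstep. The model is an assumption made for illustration: the name run_warp, the serialize_paths flag, and the policy of issuing only the spinning path until it reaches the reconvergence point are hypothetical simplifications, not a description of any specific hardware.

```cpp
#include <cassert>
#include <vector>

enum class Lane { Spinning, InCritical, Done };

// Hypothetical toy model of a group of lanes that all execute:
//     while (!try_lock()) {}   critical_section();   unlock();
// serialize_paths = true : once the branch diverges, only the spinning path
//                          is issued until every spinner leaves the loop
//                          (an assumed, simplified divergence policy).
// serialize_paths = false: every lane receives a turn in round-robin order.
// Returns the steps needed for all lanes to finish, or -1 if the step
// budget is exhausted, which this sketch treats as a deadlock.
int run_warp(int lane_count, bool serialize_paths, int max_steps) {
    assert(lane_count > 0);
    std::vector<Lane> lanes(lane_count, Lane::Spinning);
    bool lock_held = false;
    int spinning = lane_count;
    int done = 0;
    int steps = 0;
    while (done < lane_count) {
        for (int i = 0; i < lane_count; ++i) {
            if (lanes[i] == Lane::Done) continue;
            if (steps >= max_steps) return -1;
            if (lanes[i] == Lane::Spinning) {
                ++steps;                    // one attempt to take the lock
                if (!lock_held) {
                    lock_held = true;
                    lanes[i] = Lane::InCritical;
                    --spinning;
                }
            } else {
                // The lock holder sits on the path that has not been issued
                // yet, so it cannot release the lock while others spin.
                if (serialize_paths && spinning > 0) continue;
                ++steps;                    // run critical section, unlock
                lock_held = false;
                lanes[i] = Lane::Done;
                ++done;
            }
        }
    }
    return steps;
}
```

Under these assumptions, run_warp(2, true, 1000) exhausts its step budget because the spinning lane waits on a lock that its stalled neighbor can never release, whereas run_warp(2, false, 1000) completes because every lane is given a turn.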