1. Field of the Invention
The present invention relates generally to parallel processing and, more particularly, to ordered execution for parallel processing devices.
2. Related Art
Processing units are capable of executing processes or threads without regard to the order in which the processes or threads are dispatched. Out-of-order execution of processes or threads allows the processing units to make better use of latency-hiding resources, to increase their efficiency, and to improve their power and bandwidth consumption.
However, in some cases, it is preferred that some processes or threads be executed in order. The processes or threads that require ordered execution can include processes or threads that access memory, as well as any other kind of process or thread. One example where ordered execution is preferred is when the processes or threads write data to an ordered buffer memory, but the amount of data that each process, thread, or the like (hereinafter referred to as a process for convenience, but not limitation) writes is not fixed. To execute correctly, a particular process must ensure that every process that was supposed to write its data to the memory before it has done so before the particular process itself is executed.
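The dependency described above can be illustrated with a minimal sketch. It is not taken from the invention; the function name `write_offsets` and the sample sizes are hypothetical. Because the amount of data per process is not fixed, the position at which a process starts writing is the sum of the sizes written by all earlier processes, so it cannot be known until those processes have finished.

```python
from itertools import accumulate

def write_offsets(sizes):
    """Return the start offset of each process in an ordered buffer.

    `sizes` lists the (variable) number of items each process writes,
    in dispatch order. The result is an exclusive prefix sum: the offset
    of process i depends on the sizes of processes 0..i-1.
    """
    # accumulate(..., initial=0) yields 0, s0, s0+s1, ...; the final
    # total is dropped because no process starts there.
    return list(accumulate(sizes, initial=0))[:-1]

# Hypothetical example: three processes writing 3, 1, and 4 items.
offsets = write_offsets([3, 1, 4])
```

In this sketch the third process can only start at offset 4 once the first two have written their 3 and 1 items, which is the ordering constraint the paragraph describes.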
Ordered execution of processes or threads can be performed using memory polling. In this method, each process repeatedly polls the memory at a given location, and a process runs when the value in the memory corresponds to its identification. However, memory polling is a power- and memory-intensive operation because it requires reading the memory over and over again, and there is no guarantee as to whether or when the process will run.
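As a rough illustration of the polling approach and its cost, the following sketch uses Python threads as stand-ins for processes. All names (`run_with_polling`, `turn`, `polls`) are hypothetical, and the shared one-element list merely stands in for the polled memory location. Each worker spins until the shared value equals its own identifier, writes its variable-length data, and then advances the value.

```python
import threading

def run_with_polling(payloads, start_order):
    """Order variable-length writes by polling a shared 'turn' value.

    payloads[pid] is the data written by process pid; start_order is
    the (possibly out-of-order) sequence in which workers are launched.
    Returns the ordered buffer and the number of wasted polls per worker.
    """
    turn = [0]                      # polled memory location (assumed)
    buffer = []                     # ordered output buffer
    polls = [0] * len(payloads)     # failed reads per worker

    def worker(pid):
        # Repeatedly read the shared location until it matches this ID.
        while turn[0] != pid:
            polls[pid] += 1
        buffer.extend(payloads[pid])
        turn[0] = pid + 1           # let the next process proceed

    threads = [threading.Thread(target=worker, args=(pid,))
               for pid in start_order]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffer, polls

# Hypothetical run: workers launched in reverse order still write in order.
ordered, wasted = run_with_polling([[1, 2, 3], [4], [5, 6]], [2, 1, 0])
```

The buffer comes out in identifier order regardless of launch order, but the `wasted` counts show the repeated reads that each waiting worker performs, which is the overhead noted above.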