1. Field
The present invention relates generally to graphics processors and more specifically to thread scheduling of threads having dependent instructions.
2. Background
Specialized processors are often used to perform specific functions related to a type of application so that operations related to the application are performed quickly and efficiently. For example, graphics processors perform various graphics operations to process image data and render an image, and are efficient at manipulating and displaying computer graphics. Due to their highly parallel structure, graphics processors are more effective than typical general-purpose processors for a range of complex algorithms. A graphics processor implements a number of graphics primitive operations in a way that makes executing the operations much faster than drawing the graphics directly to a screen with the host central processing unit (CPU).
In order to efficiently utilize resources in a specialized processor such as a graphics processor, tasks are often organized into threads, where execution of the threads can be performed simultaneously or pseudo-simultaneously. Since a resource typically can address only a single instruction at a given time, a thread scheduler is used to control the timing of the execution of the instructions of the various threads and to efficiently allocate resources to the threads. Some instructions, however, require the retrieval of data from a data source having an unpredictable latency. For example, retrieval of data from some memories within a processor system may have an unpredictable latency due to data size or location. Another example of a data source with an unpredictable latency is a texture engine that returns texture data in a graphics processor. Due to the variable complexity of texture instructions, the time required to return the texture data cannot be predicted. An instruction of a thread may or may not require the retrieval of data from a data source with unpredictable latency. During the execution of a thread, a dependent instruction may require data acquired by a previous instruction in order to execute. Where the required data is acquired from an unpredictable latency data source, the required data may not be returned in time to execute the dependent instruction.
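The dependency hazard described above can be illustrated with a minimal sketch. It is a toy model only, not taken from any actual graphics processor: the `Instr` record, the register names, the fixed per-instruction `latency` values, and the `first_stall` helper are all hypothetical and introduced purely for illustration. The sketch issues one instruction per cycle and reports the first instruction whose source data has not yet been returned.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Instr:
    """Hypothetical instruction record used only for this illustration."""
    name: str
    dest: Optional[str] = None    # register written by the instruction, if any
    srcs: Tuple[str, ...] = ()    # registers read by the instruction
    latency: int = 1              # cycles until dest is available (assumed value)


def first_stall(program, issue_interval=1):
    """Issue instructions back to back and return (name, cycles_short) for the
    first instruction whose source is not ready at issue time, or None."""
    ready_at = {}   # register -> cycle at which its data becomes available
    cycle = 0
    for ins in program:
        for reg in ins.srcs:
            if ready_at.get(reg, 0) > cycle:
                return ins.name, ready_at[reg] - cycle
        if ins.dest is not None:
            ready_at[ins.dest] = cycle + ins.latency
        cycle += issue_interval
    return None


# A texture fetch whose latency happens to be 7 cycles in this run,
# followed by an independent add and a multiply that depends on the fetch.
slow_fetch = [
    Instr("tex", dest="r0", latency=7),
    Instr("add", dest="r1", srcs=("r9",)),
    Instr("mul", dest="r2", srcs=("r0",)),
]
# The same program when the fetch happens to return within 2 cycles.
fast_fetch = [
    Instr("tex", dest="r0", latency=2),
    Instr("add", dest="r1", srcs=("r9",)),
    Instr("mul", dest="r2", srcs=("r0",)),
]

print(first_stall(slow_fetch))  # ('mul', 5): data arrives 5 cycles too late
print(first_stall(fast_fetch))  # None: data arrives in time
```

The same instruction sequence either stalls or runs cleanly depending only on the latency of the fetch, which is the property that makes scheduling dependent instructions difficult when that latency cannot be known in advance.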
One technique for managing threads in conventional systems includes checking data availability before executing every instruction of a particular thread. Such methods, however, require complicated detection schemes that utilize resources. Another technique used in conventional systems includes placing instructions on hold until a load instruction is completed, which results in low efficiency.
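The trade-off between the two conventional techniques can be sketched with a toy single-thread model. This is an illustrative assumption rather than a description of any particular conventional system: the tuple layout `(dest, srcs, latency, is_load)`, the function names, and the cycle counts are all hypothetical. The first function checks source availability before every instruction, and the second simply holds the thread after every load until the load completes.

```python
def run_check_every_instruction(program):
    """Technique 1 (toy model): check source availability before each
    instruction. Returns (total_cycles, number_of_checks)."""
    ready_at = {}   # register -> cycle at which its data becomes available
    cycle = 0
    checks = 0
    for dest, srcs, latency, _is_load in program:
        checks += 1  # one availability check per instruction
        # Issue at the current cycle, or later if a source is not ready yet.
        start = max([cycle] + [ready_at.get(reg, 0) for reg in srcs])
        if dest is not None:
            ready_at[dest] = start + latency
        cycle = start + 1
    return cycle, checks


def run_hold_until_load_completes(program):
    """Technique 2 (toy model): no checks, but every load holds the thread
    for its full latency. Returns total_cycles."""
    cycle = 0
    for _dest, _srcs, latency, is_load in program:
        cycle += latency if is_load else 1
    return cycle


# (dest, srcs, latency, is_load): one load, two independent instructions,
# then one instruction that depends on the loaded data.
program = [
    ("r0", (), 6, True),
    ("r1", (), 1, False),
    ("r2", (), 1, False),
    ("r3", ("r0",), 1, False),
]

print(run_check_every_instruction(program))    # (7, 4)
print(run_hold_until_load_completes(program))  # 9
```

In this example the per-instruction check finishes in 7 cycles but performs a check for all four instructions, although only one of them actually depends on the load. Holding on the load needs no checks but takes 9 cycles, because the two independent instructions wait unnecessarily.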
Therefore, there is a need for a thread scheduler for efficiently managing the execution of threads having dependent instructions.