The present invention relates in general to management of state information in a processor, and in particular to management of multiple versions of state information.
Parallel processing techniques enhance throughput of a processor or multiprocessor system when multiple independent computations need to be performed. A computation can be divided into tasks that are defined by programs, with each task being performed as a separate thread. (As used herein, a “thread” refers generally to an instance of execution of a particular program using particular input data, and a “program” refers generally to a sequence of executable instructions that produces result data from input data.) Parallel threads are executed simultaneously using different processing engines inside the processor.
As is generally known, many programs also rely on “state information” to control or determine various aspects of their behavior. State information typically includes various parameters that are supplied to the program at execution time, allowing the parameters to be readily modified from one instance of program execution to the next. For example, in the context of computer-based image rendering, shader programs are well known. Many shader programs include instructions for applying one or more textures to a surface using particular algorithms. If the textures to be applied were defined within the program itself, then changing them would require recompiling the program. Thus, shader programs typically use a “texture index” parameter to identify each texture. The state information associated with the shader program includes a “binding,” or association, of each texture index parameter to actual texture data.
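By way of illustration only, the binding of texture index parameters to texture data described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular shader or processor: the names `ShaderState`, `texture_bindings`, and `apply_textures`, and the use of strings to stand in for texture data, are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ShaderState:
    """Hypothetical state information for a shader program.

    Maps each texture index parameter to actual texture data. Strings
    stand in for texture data in this sketch.
    """
    texture_bindings: Dict[int, str] = field(default_factory=dict)


def apply_textures(surface: str, texture_indices: List[int],
                   state: ShaderState) -> str:
    """Hypothetical 'shader program' that names textures only by index.

    The textures themselves come from the state information supplied at
    execution time, so the program text never has to change.
    """
    applied = [state.texture_bindings[i] for i in texture_indices]
    return surface + " <- " + ", ".join(applied)


# Two versions of the state information for the same program.
state_a = ShaderState({0: "brick", 1: "moss"})
state_b = ShaderState({0: "marble", 1: "dust"})

# The same program, run with different state, applies different textures.
result_a = apply_textures("wall", [0, 1], state_a)
result_b = apply_textures("wall", [0, 1], state_b)
print(result_a)
print(result_b)
```

In this sketch, changing which textures are applied requires only supplying a different `ShaderState`; the body of `apply_textures` is unchanged, which corresponds to avoiding recompilation of the program.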
In multithreaded processors, it is desirable to allow different threads that execute the same program to use different versions of the state information for that program. To the extent that different threads are limited to using the same version of the state information, the ability of the processor to run threads in parallel may be limited. In some instances, each time the state information is to be updated, the processor would need to wait for all threads that use a current version of the state information to finish before launching any new threads that use the updated state information. This can lead to idle time in the processor.
Some multithreaded processors avoid such idle time by providing a separate set of state registers for each thread. Where the number of concurrent threads and the amount of state information required per thread are relatively small, this approach is practical; however, as the number of concurrent threads and/or the amount of state information to be stored per thread becomes larger, providing a sufficiently large register space becomes an expensive proposition.
Further, the amount of state information required per thread can vary. For instance, different shader programs may define different numbers of texture bindings. If the state register space is made large enough to accommodate a separate version of the maximum amount of state information for every thread, much of this space may be wasted in cases where the maximum amount of information is not being stored.
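The wasted space described above can be illustrated with simple arithmetic. The following is a minimal sketch only: the function name `register_usage`, the maximum of 16 entries per thread, and the per-thread entry counts are hypothetical values chosen for the example and are not taken from any particular processor.

```python
from typing import List, Tuple


def register_usage(max_entries_per_thread: int,
                   entries_used_per_thread: List[int]) -> Tuple[int, int, int]:
    """Compare worst-case per-thread allocation with actual use.

    Returns (allocated, used, wasted), counted in state entries, assuming
    every thread is given room for the maximum amount of state.
    """
    if any(n > max_entries_per_thread for n in entries_used_per_thread):
        raise ValueError("a thread uses more entries than the maximum")
    allocated = len(entries_used_per_thread) * max_entries_per_thread
    used = sum(entries_used_per_thread)
    return allocated, used, allocated - used


# Hypothetical example: four concurrent threads whose programs define
# 2, 4, 16, and 1 texture bindings, with room for 16 bindings per thread.
allocated, used, wasted = register_usage(16, [2, 4, 16, 1])
print(allocated, used, wasted)
```

Under these assumed numbers, 64 entries are allocated while only 23 are used, leaving 41 entries idle; the gap grows with the number of concurrent threads and with the spread between the maximum and the typical amount of state per thread.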
It would therefore be desirable to provide more flexible techniques for managing multiple versions of state information.