1. Field of the Invention
Embodiments of the present invention relate generally to graphics applications and more specifically to a method and system for using bundle decoders in a processing pipeline.
2. Description of the Related Art
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
A context switch is a feature of a multitasking operating system that allows execution to switch from one computing thread or process to another. This feature ensures that a processor cannot be monopolized by any one processor-intensive thread or process. During a context switch, the processor state of the currently running process is stored in memory, and the processor is switched to the state of another process that was previously stored in memory.
In graphics applications, a number of threads may be processed concurrently through one or more graphics pipelines that are managed by a graphics processing unit (“GPU”). FIG. 1 is a simplified block diagram of processing pipeline 100 that includes pipeline units 108-1, 108-2, and 108-3 (collectively referred to as pipeline units 108). FIG. 1 also shows front end (“FE”) 102, which manages the context switch operation for processing pipeline 100 by sending information via “bundles” to the various pipeline units. A “bundle” is a data structure that contains a header, which indicates the intended destination for the bundle, and a payload, which contains information such as state information or trigger information for a pipeline unit. To illustrate, suppose FE 102 sends three versions of a bundle, B0, at three different times, time 1, time 2, and time 3. The version at time 1, also denoted as B0(time 1), contains state A; B0(time 2) contains state B; and B0(time 3) contains state C. As the three versions of B0 flow down processing pipeline 100, it is possible that at time 3, B0(time 1) has reached pipeline unit 108-3; B0(time 2) has reached pipeline unit 108-2 but has not reached pipeline unit 108-3; and B0(time 3) has reached pipeline unit 108-1 but has not reached pipeline unit 108-2. In this scenario, pipeline units 108-1, 108-2, and 108-3 have states C, B, and A, respectively. In other words, as the different versions of bundle B0 flow down processing pipeline 100, the state information previously stored in pipeline units 108 is overwritten with the state information carried by these different versions of B0.
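By way of illustration only, the flow of the three versions of B0 described above may be sketched as follows. The sketch assumes that each bundle advances exactly one pipeline unit per time step and that every unit it reaches adopts its payload; the names Bundle, step, and simulate, and the header value "all units", are hypothetical and do not appear in FIG. 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bundle:
    header: str   # intended destination (hypothetical encoding)
    payload: str  # state information carried by the bundle


def step(in_flight: List[Optional[Bundle]],
         states: List[Optional[str]],
         new_bundle: Optional[Bundle]) -> List[Optional[Bundle]]:
    """Advance every in-flight bundle by one unit and inject new_bundle at the first unit.

    Assumption: one unit per time step; a unit overwrites its stored state
    with the payload of whichever bundle currently sits at it.
    """
    shifted = [new_bundle] + in_flight[:-1]
    for i, bundle in enumerate(shifted):
        if bundle is not None:
            states[i] = bundle.payload
    return shifted


def simulate(payloads: List[str], num_units: int = 3) -> List[Optional[str]]:
    """Send one bundle per time step and return the per-unit states afterwards."""
    in_flight: List[Optional[Bundle]] = [None] * num_units
    states: List[Optional[str]] = [None] * num_units
    for payload in payloads:
        in_flight = step(in_flight, states, Bundle("all units", payload))
    return states


# States A, B, and C are sent at time 1, time 2, and time 3, respectively.
states_at_time_3 = simulate(["A", "B", "C"])
print(states_at_time_3)  # ['C', 'B', 'A'] for units 108-1, 108-2, 108-3
```

Under these assumptions, the per-unit states at time 3 match the scenario described above: the unit nearest the front end holds the newest state, and the last unit holds the oldest.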
According to the wait-for-idle (“WFI”) protocol, when FE 102 receives a context switch command, FE 102 suspends sending commands down processing pipeline 100 and then waits for an idle status signal from each of pipeline units 108. A context switch occurs only after FE 102 receives these idle status signals. During this idle period, all the bundles in flight in processing pipeline 100 are completely drained. Using the example discussed above, all three versions of B0 are drained by reaching pipeline unit 108-3. As a result, each of pipeline units 108 has state C. To proceed with the context switch, rather than retrieving state C from each of pipeline units 108 and storing it, FE 102 maintains a shadow copy of the last state that it encapsulated in a bundle and sent down processing pipeline 100. The shadow copy is kept in a memory region reserved for the context associated with the currently running process. In this example, the last state is state C. Then, FE 102 switches processing pipeline 100 to the context associated with another process after that context is retrieved from a memory region reserved for that context. Each of these reserved memory regions resides in memory 106 and is accessed through memory interface 104.
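The WFI sequence described above may be sketched, under simplifying assumptions, as follows. The classes Pipeline and FrontEnd, the dictionary standing in for memory 106, the context names "ctx0" and "ctx1", and the modeling of a restore as sending the saved state down the pipeline as a new bundle are all hypothetical; the sketch is not intended to represent any particular implementation.

```python
from typing import Dict, List, Optional


class Pipeline:
    """Toy pipeline: each slot holds the state bundle currently at that unit."""

    def __init__(self, num_units: int = 3) -> None:
        self.in_flight: List[Optional[str]] = [None] * num_units
        self.states: List[Optional[str]] = [None] * num_units

    def tick(self, new_state: Optional[str] = None) -> None:
        # Advance bundles by one unit; each unit adopts the state that reaches it.
        self.in_flight = [new_state] + self.in_flight[:-1]
        for i, state in enumerate(self.in_flight):
            if state is not None:
                self.states[i] = state

    def idle(self) -> bool:
        # Assumption: every unit reports idle once no bundle remains in flight.
        return all(state is None for state in self.in_flight)


class FrontEnd:
    def __init__(self, pipeline: Pipeline, memory: Dict[str, str]) -> None:
        self.pipeline = pipeline
        self.memory = memory              # context name -> saved state
        self.shadow: Optional[str] = None

    def send_state(self, state: str) -> None:
        self.shadow = state               # shadow copy of the last state sent
        self.pipeline.tick(state)

    def context_switch(self, current_ctx: str, next_ctx: str) -> List[Optional[str]]:
        # Suspend sending and wait until all in-flight bundles have drained.
        while not self.pipeline.idle():
            self.pipeline.tick()
        drained = list(self.pipeline.states)
        # Save the shadow copy rather than reading state back from each unit.
        self.memory[current_ctx] = self.shadow
        # Retrieve the other context and begin applying it (hypothetical restore).
        self.send_state(self.memory[next_ctx])
        return drained


memory = {"ctx1": "X"}                    # previously saved context
fe = FrontEnd(Pipeline(), memory)
for s in ["A", "B", "C"]:
    fe.send_state(s)
drained_states = fe.context_switch("ctx0", "ctx1")
print(drained_states, memory["ctx0"])     # ['C', 'C', 'C'] C
```

In this sketch the switch cannot begin until the drain loop completes, and the front end must track a shadow copy for every state it sends.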
As shown above, the aforementioned WFI protocol does not provide FE 102 with the flexibility to proceed with a context switch operation when there are bundles in flight in processing pipeline 100. Using the example above, FE 102 cannot switch the context of processing pipeline 100 at time 3 in accordance with the WFI protocol, because at time 3 pipeline units 108-1, 108-2, and 108-3 do not yet have the same state information. In addition, current implementations of processing pipeline 100 fail to impose uniformity on the formats and processing of the bundles. Again using the example discussed above, this lack of uniformity may result in FE 102 not recognizing and therefore not utilizing B0 after the bundle flows down processing pipeline 100 and is operated on by various pipeline units 108. Another drawback of the current approach to context switching is that using shadow copies to track the information needed for context switch operations is costly due to the additional storage space and computational overhead necessary to maintain and manage the shadow copies.
As the foregoing illustrates, what is needed is a way to intelligently manage the bundles in a processing pipeline to improve the efficiency of switching the context of the processing pipeline and thereby enhance the overall performance of the processing pipeline.