In recent years, graphics processing units (GPUs), which are generally pipelined, have increased substantially in complexity. This increase in complexity has made it more difficult to determine the performance characteristics of a GPU. Accordingly, achieving optimal performance from a GPU has become a daunting task. Since a pipelined GPU runs only as fast as its slowest processing unit, it is important to identify and address the slower stages of the pipeline in order to improve the efficiency of the GPU and its frame rate.
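By way of illustration only, the observation that a pipelined GPU runs only as fast as its slowest processing unit can be sketched as follows. The stage names and per-frame times are hypothetical values chosen for the example and are not measurements of any actual GPU; the sketch also assumes a fully overlapped pipeline in steady state.

```python
# Hypothetical per-frame processing times in milliseconds for four
# illustrative pipeline stages. Names and numbers are assumptions.
stage_ms = {"vertex": 2.0, "raster": 1.5, "fragment": 8.0, "rop": 3.0}


def bottleneck(times):
    """Return (stage name, time in ms) of the slowest stage."""
    name = max(times, key=times.get)
    return name, times[name]


def frame_rate(times):
    """Steady-state frames per second, limited by the slowest stage."""
    _, slowest_ms = bottleneck(times)
    return 1000.0 / slowest_ms


# Speeding up a stage that is not the bottleneck leaves the frame rate
# unchanged; speeding up the bottleneck stage raises it.
faster_vertex = dict(stage_ms, vertex=1.0)
faster_fragment = dict(stage_ms, fragment=4.0)

print(bottleneck(stage_ms))
print(frame_rate(stage_ms), frame_rate(faster_vertex), frame_rate(faster_fragment))
```

In this simplified model, halving the time of the hypothetical vertex stage has no effect on the frame rate, whereas halving the time of the hypothetical fragment stage doubles it, which is why identifying the slowest stage matters.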
In general, details of an application being executed on a GPU and its performance are hidden from the user in order to protect the manufacturer's proprietary information. On the other hand, providing detailed information about an application executing on the GPU and its performance allows software developers to improve the efficiency of the program. Accordingly, there is a tradeoff between protecting the proprietary information of the GPU manufacturer and improving the performance of the GPU and its corresponding frame rate.
To characterize the performance of a GPU, targeted experimentation may be performed. Targeted experimentation includes increasing and decreasing the workload at different processing units within the GPU pipeline. However, because most of the processing units of the GPU are interdependent, varying the workload at one processing unit affects not only the performance of the unit under test but also that of other processing units. Accordingly, a change at one processing unit ripples through the entire GPU pipeline.
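The difficulty described above may be illustrated with a toy cost model. Everything in the sketch below is an assumption made for the example: the stage names, the two workload parameters (`triangles_k` and `pixels_k`, in thousands), and the cost coefficients are hypothetical and do not describe any actual GPU. The sketch only shows how a single workload change can alter the time spent in more than one processing unit.

```python
def stage_times_us(triangles_k, pixels_k):
    """Toy per-frame stage times in microseconds (made-up coefficients)."""
    return {
        "vertex": 4 * triangles_k,
        # The raster stage depends on both geometry and pixel coverage,
        # so it is coupled to both workload parameters.
        "raster": 2 * triangles_k + 1 * pixels_k,
        "fragment": 4 * pixels_k,
        "rop": 1 * pixels_k,
    }


def changed_stages(before, after):
    """Return the set of stages whose time differs between two runs."""
    return {name for name in before if before[name] != after[name]}


# Targeted experiment: double the geometry workload only.
baseline = stage_times_us(triangles_k=500, pixels_k=2000)
perturbed = stage_times_us(triangles_k=1000, pixels_k=2000)

print(sorted(changed_stages(baseline, perturbed)))
print(max(baseline.values()), max(perturbed.values()))
```

In this model, an experiment intended to stress only the hypothetical vertex stage also changes the raster stage, while the overall frame time, set by the unchanged fragment stage, stays the same. The measured frame time alone therefore does not reveal which unit was affected.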
As such, it is difficult to isolate a particular processing unit within the GPU pipeline merely by varying the workload. Moreover, exposing internal GPU information and performance data in order to improve the frame rate must be weighed against protecting the manufacturer's proprietary information.