Improving the efficiency of graphics processing units (GPUs) running graphical applications has long been a concern of software developers. For example, developers often seek to increase frame rate by reducing the execution time of an application's graphical operations. However, optimizing GPU performance is a daunting task, given both the limited number of performance tools available and the limited features that conventional tools offer.
In an attempt to reduce graphical operation execution time, conventional performance tools may identify the graphical operation with the longest execution time. Using this information, a developer may then modify the operation of the GPU's pipeline units to process that graphical operation more efficiently, thereby reducing its execution time. The developer may then tackle the graphical operation with the next-longest execution time, and so on, until the application performs at an acceptable frame rate.
Although this approach seems logical, developers generally find it time-consuming, tedious, and problematic, and it often yields no appreciable performance increase. For example, conventional performance tools using this methodology require a developer to spend inordinate amounts of time locating troublesome graphical operations and then optimizing the GPU pipeline for each one. Additionally, many tweaks are often required, which increases the probability of bottlenecks and/or underutilization of pipeline units. And even if performance increases are realized, compromises that trade rendered image quality for performance may leave the graphical application in an unacceptable state.