This disclosure relates generally to the field of graphics processing and other workloads (e.g., compute processing) executed on a graphics processing unit (GPU). More particularly, but not by way of limitation, this disclosure relates to an interactive visual debugger/profiler for a GPU, having multiple interactive panes to display information captured from a GPU workload and optionally from a GPU execution trace buffer (e.g., to provide additional information for a “frame capture” tool capturing a GPU workload). The interactive visual debugger/profiler, referred to as a Resource Dependency Viewer (or simply “Dependency Viewer”), represents an improvement to the art of GPU processing code testing and development by providing information to an application developer to assist in GPU processing code implementation and refinement (e.g., an optimization process at the functional level).
Computers, mobile devices, and other computing systems typically have at least one programmable processor, such as a central processing unit (CPU), as well as other programmable processors specialized for performing certain processes or functions (e.g., graphics processing). Examples of a programmable processor specialized to perform graphics processing operations include, but are not limited to, a GPU, a digital signal processor (DSP), a field programmable gate array (FPGA), and/or a CPU emulating a GPU. GPUs, in particular, comprise multiple execution cores (also referred to as shader cores) designed to execute the same instruction on parallel data streams, making them more effective than general-purpose processors for operations that process large blocks of data in parallel. For instance, a CPU functions as a host and hands off specialized parallel tasks to the GPU. Specifically, a CPU can execute an application stored in system memory that includes graphics data associated with a video frame. Rather than processing the graphics data, the CPU forwards the graphics data to the GPU for processing, thereby freeing the CPU to perform other tasks concurrently with the GPU's processing of the graphics data.
GPU processing, such as render-to-texture passes, may be implemented by a series of encoders. Encoders may utilize outputs from previous encoders and other graphical parameters (e.g., textures) as “resources” to perform their execution. Accordingly, GPU processing includes a series of functions that execute in an execution flow (sometimes referred to as a “graphics pipeline”) to produce a result to be displayed. Encoders often write data to and read data from one or more memory caches to improve performance and save power. For instance, a render-to-texture pass encoder renders a frame to a texture resource that can later be passed to a shader encoder for further processing. In such a case, the GPU could be writing to and/or reading from the texture resource before the GPU has finished utilizing that texture resource. The highly parallel nature of GPU processing may make it difficult for an application developer (working at the source code level) to understand exactly how the GPU is processing their source code. For example, the application developer may not know the exact order of processing performed by a GPU for a given source code input and may not know exactly how encoders and resources have been “chained” together to produce a graphical result. Thus, even though an application may be presenting accurate results, it may not be performing processing that is fully optimized. Having visibility into how a GPU actually processes encoders and utilizes resources associated with those encoders could allow an application developer to improve the source code of a particular application and thereby improve the GPU performance of that application. Accordingly, disclosed implementations of the Dependency Viewer represent an improvement to the art of graphical code implementation because the application developer may be provided information to address possible “unseen” performance issues.
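By way of illustration only, the manner in which encoders and resources may be “chained” together can be sketched as a small dependency graph. The following Python sketch is not the disclosed implementation; the `Encoder` record, the `build_dependencies` helper, and the encoder and resource names are all hypothetical and assumed for the example. It further assumes that encoders are listed in submission order and that an encoder depends on the most recent earlier encoder that wrote a resource it reads.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Encoder:
    """Hypothetical record of one encoder in a captured GPU workload."""
    name: str
    reads: set = field(default_factory=set)    # resources the encoder consumes
    writes: set = field(default_factory=set)   # resources the encoder produces


def build_dependencies(encoders):
    """Map each encoder name to {producer name: [shared resources]}.

    Assumption: encoders appear in submission order, and a reader depends
    on the most recent earlier writer of each resource it reads.
    """
    last_writer = {}            # resource name -> name of its latest writer
    deps = defaultdict(dict)
    for enc in encoders:
        # Resolve reads first so an encoder never depends on its own writes.
        for res in sorted(enc.reads):
            producer = last_writer.get(res)
            if producer is not None and producer != enc.name:
                deps[enc.name].setdefault(producer, []).append(res)
        for res in enc.writes:
            last_writer[res] = enc.name
    return dict(deps)


# Hypothetical capture: a render-to-texture pass feeding later passes.
capture = [
    Encoder("shadow_pass", reads={"scene_geometry"}, writes={"shadow_map"}),
    Encoder("main_pass", reads={"scene_geometry", "shadow_map"},
            writes={"color_tex"}),
    Encoder("blur_pass", reads={"color_tex"}, writes={"final_tex"}),
]

dependencies = build_dependencies(capture)
for consumer, producers in dependencies.items():
    for producer, resources in producers.items():
        print(f"{producer} -> {consumer} via {', '.join(resources)}")
```

In this hypothetical capture, the sketch reports that `main_pass` depends on `shadow_pass` through `shadow_map`, and that `blur_pass` depends on `main_pass` through `color_tex`. A visual tool could present such edges to the application developer; an actual GPU may reorder or overlap work in ways this simple submission-order model does not capture.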