A debugger, or debugging tool, is a computer program that may be used to test and debug other programs. The code under examination may run on an instruction set simulator, a technique that may allow greater control over execution, such as halting when specific conditions are encountered, but that is typically somewhat slower than executing the code directly on the processor for which it was built. Some debuggers offer two modes of operation, full or partial simulation, to limit this slowdown.
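As an illustration of halting on a condition, the following is a minimal sketch of an instruction set simulator. The three-instruction toy instruction set (`load`, `add`, `jnz`), the register names, and the `simulate` and `halt_when` names are all assumptions made for this example and do not correspond to any real processor or debugger.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy instruction: (opcode, register, immediate).
Instruction = Tuple[str, str, int]


def simulate(
    program: List[Instruction],
    halt_when: Callable[[Dict[str, int]], bool],
    max_steps: int = 1000,
) -> Tuple[int, Dict[str, int], bool]:
    """Interpret `program` one instruction at a time.

    After every instruction the `halt_when` predicate is evaluated on the
    register file. Returns (program counter, registers, halted_on_condition).
    """
    regs = {"r0": 0, "r1": 0}
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, reg, imm = program[pc]
        if op == "load":        # reg = imm
            regs[reg] = imm
            pc += 1
        elif op == "add":       # reg += imm
            regs[reg] += imm
            pc += 1
        elif op == "jnz":       # jump to index imm if reg != 0
            pc = imm if regs[reg] != 0 else pc + 1
        else:
            raise ValueError(f"unknown opcode: {op}")
        steps += 1
        # The per-instruction check is what gives the simulator fine-grained
        # control, and it is also one source of the slowdown.
        if halt_when(regs):
            return pc, regs, True
    return pc, regs, False


program = [
    ("load", "r0", 5),   # 0: loop counter
    ("add", "r1", 3),    # 1: accumulate
    ("add", "r0", -1),   # 2: decrement counter
    ("jnz", "r0", 1),    # 3: repeat while counter is non-zero
]

# Halt as soon as the accumulator reaches 9, well before the loop finishes.
pc, regs, halted = simulate(program, lambda r: r["r1"] >= 9)
```

Because the predicate is an ordinary function over the simulated state, any condition expressible in terms of that state can serve as a halt condition in this sketch.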
A program may include one or more bugs that cause the program to execute improperly (e.g., causing the program to behave undesirably, provide incorrect results, or crash entirely). A debugger may monitor characteristics of a program while the program executes and provide diagnostic information that helps a user investigate the cause or symptoms of a bug in the program. For example, a debugger may indicate the successive values of a memory location as instructions and operations cause the value of that memory location to change.
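Tracking the values of a memory location can be sketched as follows. The `WatchedMemory` class, its methods, and the free-text `label` argument standing in for "the instruction that performed the store" are hypothetical names introduced for illustration, not the interface of any particular debugger.

```python
from typing import Iterable, List, Tuple


class WatchedMemory:
    """A flat memory that logs every value change at watched addresses."""

    def __init__(self, size: int, watched: Iterable[int]) -> None:
        self._cells = [0] * size
        self._watched = set(watched)
        # Each entry: (address, old value, new value, label of the store).
        self.history: List[Tuple[int, int, int, str]] = []

    def write(self, address: int, value: int, label: str) -> None:
        old = self._cells[address]
        self._cells[address] = value
        # Record only stores that actually change a watched location.
        if address in self._watched and old != value:
            self.history.append((address, old, value, label))

    def read(self, address: int) -> int:
        return self._cells[address]


mem = WatchedMemory(size=8, watched=[3])
mem.write(3, 10, "init")
mem.write(4, 99, "unrelated store")   # address 4 is not watched
mem.write(3, 10, "redundant store")   # value unchanged, not recorded
mem.write(3, 7, "decrement")

for address, old, new, label in mem.history:
    print(f"[{label}] mem[{address}]: {old} -> {new}")
```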
Conventional debuggers show data read from and written to memory or registers, in order to help software developers better understand how the computer is executing their programs. Debuggers generally present such data live, that is, as the program executes. A developer has the option to step through the source code, instruction by instruction, per thread of execution. At each step, the developer sees only a snapshot of the current state of the computer (e.g., values in memory, location in the program, and the active thread). Aside from the program stack trace, all context must be tracked manually by the developer.
Processors (e.g., Graphics Processing Units (GPUs), Central Processing Units (CPUs), etc.) may process many programs, instructions, threads, and so on in parallel. Many logical contexts may be executing in parallel or otherwise interleaved. For example, modern GPUs execute programs simultaneously on several independent streaming multiprocessors (SMs). Each SM is capable of simultaneously executing multiple cooperative thread arrays (CTAs), each of which is divided into warps, and each warp may include multiple threads (e.g., 16 threads, 32 threads, 64 threads, etc.). CTAs, warps, and/or threads can depend on other threads and warps on the SM, and the order and interleaving of instruction execution from multiple other threads can be critical to understanding execution errors in programs running on an SM. Conventional processes for simulating programs and debugging their execution contexts become more complicated when this parallel context information must also be tracked.
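The relationship between threads, warps, and CTAs can be made concrete with a little arithmetic. The sketch below assumes the CUDA convention in which the threads of a CTA are linearized with the x index varying fastest and consecutive groups of `warp_size` threads form a warp; the function names and the default warp size of 32 are choices made for this example.

```python
from typing import Tuple

Dim3 = Tuple[int, int, int]


def locate_thread(thread_idx: Dim3, block_dim: Dim3, warp_size: int = 32) -> Tuple[int, int]:
    """Return (warp index within the CTA, lane within the warp)."""
    x, y, z = thread_idx
    dx, dy, _dz = block_dim
    # Assumed linearization: x varies fastest, then y, then z.
    linear = x + y * dx + z * dx * dy
    return linear // warp_size, linear % warp_size


def warps_per_cta(block_dim: Dim3, warp_size: int = 32) -> int:
    """Number of warps needed to hold every thread of the CTA (rounded up)."""
    dx, dy, dz = block_dim
    threads = dx * dy * dz
    return (threads + warp_size - 1) // warp_size


# Thread (5, 2, 0) of a 16 x 16 x 1 CTA has linear index 5 + 2 * 16 = 37.
warp, lane = locate_thread((5, 2, 0), (16, 16, 1))
```

A debugger that reports state per warp would need a mapping of this kind to tell a developer which warp a given thread's data belongs to.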
For example, an SM may execute an instruction of a first program. However, before executing a second instruction of the first program, the SM may execute one or more instructions of one or more other programs (e.g., 100 other instructions, 1,000 other instructions, and so on). The SM may eventually return to executing one or more instructions of the first program. Further, the SM may execute threads, warps, CTAs, and/or programs interleaved and/or in a multi-threaded fashion. As a result, a debugger may have difficulty following the execution of the first program, because the debugging data of the first program may be interleaved with the debugging data associated with instructions of the other programs.
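The interleaving problem can be illustrated with a small trace. The following sketch assumes a hypothetical trace format in which every record carries a global sequence number, a program name, and a warp index; the `TraceRecord` fields, the `demultiplex` function, and the sample instructions are invented for this example.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class TraceRecord(NamedTuple):
    seq: int          # global order in which the SM issued the instruction
    program: str      # hypothetical program identifier
    warp: int         # warp that executed the instruction
    instruction: str  # disassembly text, for display only


def demultiplex(trace: List[TraceRecord]) -> Dict[Tuple[str, int], List[TraceRecord]]:
    """Group an interleaved trace into one stream per (program, warp) context."""
    by_context: Dict[Tuple[str, int], List[TraceRecord]] = defaultdict(list)
    for record in trace:
        by_context[(record.program, record.warp)].append(record)
    return dict(by_context)


# Program A's three instructions are separated by instructions of program B.
trace = [
    TraceRecord(0, "A", 0, "load r1"),
    TraceRecord(1, "B", 0, "mul r2"),
    TraceRecord(2, "B", 1, "mul r2"),
    TraceRecord(3, "A", 0, "add r1"),
    TraceRecord(4, "B", 0, "store r2"),
    TraceRecord(5, "A", 0, "store r1"),
]

streams = demultiplex(trace)
program_a = [r.instruction for r in streams[("A", 0)]]
```

Each record keeps its `seq` value after grouping, so the global interleaving, which may itself matter when threads depend on one another, remains recoverable from the per-context streams.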